October 19, 2019

Paper Group ANR 124

Edit Probability for Scene Text Recognition. Learning Graph Embeddings from WordNet-based Similarity Measures. Features for Multi-Target Multi-Camera Tracking and Re-Identification. Convolutional neural networks in phase space and inverse problems. Learn and Pick Right Nodes to Offload. Attention-based sequence-to-sequence model for speech recognit …

Edit Probability for Scene Text Recognition


Title	Edit Probability for Scene Text Recognition
Authors	Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou
Abstract	We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximum likelihood loss to optimize the models. When we train the model, the misalignment between the ground truth strings and the attention’s output sequences of probability distributions, which is caused by missing or superfluous characters, will confuse and mislead the training process, and consequently make the training costly and degrade the recognition accuracy. To handle this problem, we propose a novel method called edit probability (EP) for scene text recognition. EP tries to effectively estimate the probability of generating a string from the output sequence of probability distributions conditioned on the input image, while considering the possible occurrences of missing/superfluous characters. The advantage is that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome. We conduct extensive experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets. Experimental results show that the EP can substantially boost scene text recognition performance.
Tasks	Scene Text Recognition
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03384v1
PDF	http://arxiv.org/pdf/1805.03384v1.pdf
PWC	https://paperswithcode.com/paper/edit-probability-for-scene-text-recognition
Repo
Framework
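
The abstract's point about missing and superfluous characters can be illustrated with a small dynamic program that sums over alignments between output frames and target characters. This is only a sketch under stated assumptions: the name `alignment_prob` and the constant `p_skip` / `p_miss` parameters are hypothetical and are not the EP formulation from the paper, where such quantities would come from the model.

```python
def alignment_prob(frames, target, p_skip=0.0, p_miss=0.0):
    """Sum over alignments of per-frame distributions to a target string.

    frames : list of dicts mapping character -> probability (one per frame)
    p_skip : assumed constant chance that a frame is superfluous
    p_miss : assumed constant chance that a target character has no frame
    """
    n_frames, n_chars = len(frames), len(target)
    p_keep = 1.0 - p_skip - p_miss
    # dp[i][j]: score of explaining target[:j] with the first i frames.
    dp = [[0.0] * (n_chars + 1) for _ in range(n_frames + 1)]
    dp[0][0] = 1.0
    for i in range(n_frames + 1):
        for j in range(n_chars + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:  # frame i emits character j
                total += dp[i - 1][j - 1] * p_keep * frames[i - 1].get(target[j - 1], 0.0)
            if i > 0:  # frame i is superfluous
                total += dp[i - 1][j] * p_skip
            if j > 0:  # character j is missing from the output
                total += dp[i][j - 1] * p_miss
            dp[i][j] = total
    return dp[n_frames][n_chars]


# Hypothetical two-frame output.
frames = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]
exact = alignment_prob(frames, "ab")             # reduces to the frame-wise product
shorter = alignment_prob(frames, "a", p_skip=0.1)  # one frame treated as superfluous
```

With both extra probabilities at zero the score is the plain frame-wise product, and a length mismatch scores zero. Allowing skips gives the shorter string a non-zero score, which is the kind of misalignment the abstract describes.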

Learning Graph Embeddings from WordNet-based Similarity Measures


Title	Learning Graph Embeddings from WordNet-based Similarity Measures
Authors	Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko
Abstract	We present path2vec, a new approach for learning graph embeddings that relies on structural measures of pairwise node similarities. The model learns representations for nodes in a dense space that approximate a given user-defined graph distance measure, such as the shortest path distance or distance measures that take information beyond the graph structure into account. Evaluation of the proposed model on semantic similarity and word sense disambiguation tasks, using various WordNet-based similarity measures, shows that our approach yields competitive results, outperforming strong graph embedding baselines. The model is computationally efficient, being orders of magnitude faster than the direct computation of graph-based distances.
Tasks	Graph Embedding, Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05611v4
PDF	http://arxiv.org/pdf/1808.05611v4.pdf
PWC	https://paperswithcode.com/paper/learning-graph-embeddings-from-wordnet-based
Repo
Framework
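
A rough sketch of the idea in the abstract is to fit node vectors so that a cheap vector operation approximates a graph-based measure. The dot-product objective, the `1 / (1 + distance)` similarity, the toy graph and all function names below are assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_squared_error(emb, targets):
    return sum((dot(emb[i], emb[j]) - s) ** 2 for (i, j), s in targets.items()) / len(targets)


def train_embeddings(adj, dim=3, epochs=500, lr=0.1, seed=0):
    """Fit embeddings whose dot products approximate a path-based similarity.

    Assumes a connected graph so that every pair has a finite distance.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    targets = {}
    for i in nodes:
        dist = shortest_path_lengths(adj, i)
        for j in nodes:
            if i < j:
                targets[(i, j)] = 1.0 / (1.0 + dist[j])  # assumed similarity
    emb = {n: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for n in nodes}
    initial = mean_squared_error(emb, targets)
    for _ in range(epochs):
        for (i, j), s in targets.items():
            err = dot(emb[i], emb[j]) - s
            u, v = emb[i][:], emb[j][:]
            # Gradient step on the squared error (factor 2 folded into lr).
            emb[i] = [a - lr * err * b for a, b in zip(u, v)]
            emb[j] = [b - lr * err * a for a, b in zip(u, v)]
    return emb, initial, mean_squared_error(emb, targets)


# Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
embeddings, loss_before, loss_after = train_embeddings(graph)
```

Once trained, a similarity query is a single dot product rather than a graph traversal, which is the source of the speed-up the abstract mentions.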

Features for Multi-Target Multi-Camera Tracking and Re-Identification


Title	Features for Multi-Target Multi-Camera Tracking and Re-Identification
Authors	Ergys Ristani, Carlo Tomasi
Abstract	Multi-Target Multi-Camera Tracking (MTMCT) tracks many people through video taken from several cameras. Person Re-Identification (Re-ID) retrieves from a gallery images of people similar to a person query image. We learn good features for both MTMCT and Re-ID with a convolutional neural network. Our contributions include an adaptive weighted triplet loss for training and a new technique for hard-identity mining. Our method outperforms the state of the art both on the DukeMTMC benchmarks for tracking, and on the Market-1501 and DukeMTMC-ReID benchmarks for Re-ID. We examine the correlation between good Re-ID and good MTMCT scores, and perform ablation studies to elucidate the contributions of the main components of our system. Code is available.
Tasks	Person Re-Identification
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10859v1
PDF	http://arxiv.org/pdf/1803.10859v1.pdf
PWC	https://paperswithcode.com/paper/features-for-multi-target-multi-camera
Repo
Framework
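
The abstract does not spell out the adaptive weighting, so the following is only one plausible sketch of a weighted triplet loss: harder positives (far from the anchor) and harder negatives (close to the anchor) receive larger weights through a softmax. The function names, the weighting scheme and the margin value are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_triplet_loss(pos_dists, neg_dists, margin=0.5):
    """Triplet loss for one anchor with distance-dependent weights.

    pos_dists : anchor-to-positive distances (same identity)
    neg_dists : anchor-to-negative distances (other identities)
    """
    w_pos = softmax(pos_dists)                 # far positives weigh more
    w_neg = softmax([-d for d in neg_dists])   # near negatives weigh more
    pos_term = sum(w * d for w, d in zip(w_pos, pos_dists))
    neg_term = sum(w * d for w, d in zip(w_neg, neg_dists))
    return max(0.0, margin + pos_term - neg_term)


easy = weighted_triplet_loss([1.0], [3.0])   # negative is far away: no loss
hard = weighted_triplet_loss([2.0], [1.0])   # negative is closer than positive
```

Compared with picking only the single hardest positive and negative, soft weights keep every sample in the batch contributing to the gradient.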

Convolutional neural networks in phase space and inverse problems


Title	Convolutional neural networks in phase space and inverse problems
Authors	Gunther Uhlmann, Yiran Wang
Abstract	We study inverse problems consisting of determining medium properties using the responses to probing waves from the machine learning point of view. Based on the understanding of propagation of waves and their nonlinear interactions, we construct a deep convolutional neural network in which the parameters are used to classify and reconstruct the coefficients of nonlinear wave equations that model the medium properties. Furthermore, for a given approximation accuracy, we obtain the depth and number of units of the network and their quantitative dependence on the complexity of the medium.
Tasks
Published	2018-11-09
URL	http://arxiv.org/abs/1811.04022v1
PDF	http://arxiv.org/pdf/1811.04022v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-in-phase-space
Repo
Framework

Learn and Pick Right Nodes to Offload


Title	Learn and Pick Right Nodes to Offload
Authors	Zhaowei Zhu, Ting Liu, Shengda Jin, Xiliang Luo
Abstract	Task offloading is a promising technology to exploit the benefits of fog computing. An effective task offloading strategy is needed to utilize the computational resources efficiently. In this paper, we endeavor to seek an online task offloading strategy to minimize the long-term latency. In particular, we formulate a stochastic programming problem, where the expectations of the system parameters change abruptly at unknown time instants. Meanwhile, we consider the fact that the queried nodes can only feed back the processing results after finishing the tasks. We then put forward an effective algorithm to solve this challenging stochastic programming under the non-stationary bandit model. We further prove that our proposed algorithm is asymptotically optimal in a non-stationary fog-enabled network. Numerical simulations are carried out to corroborate our designs.
Tasks
Published	2018-04-20
URL	http://arxiv.org/abs/1804.08416v2
PDF	http://arxiv.org/pdf/1804.08416v2.pdf
PWC	https://paperswithcode.com/paper/learn-and-pick-right-nodes-to-offload
Repo
Framework
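
One common way to handle abruptly changing reward means in a bandit is to compute statistics over a sliding window of recent plays. The sketch below shows that generic technique, not the algorithm proposed in the paper; the class name, the window size and the use of negative latency as the reward are assumptions.

```python
import math
from collections import deque


class SlidingWindowUCB:
    """UCB node selection using only the most recent observations."""

    def __init__(self, n_nodes, window=50, c=1.0):
        self.n_nodes = n_nodes
        self.c = c
        self.history = deque(maxlen=window)  # (node, reward) pairs

    def select(self):
        counts = [0] * self.n_nodes
        sums = [0.0] * self.n_nodes
        for node, reward in self.history:
            counts[node] += 1
            sums[node] += reward
        # Any node without recent feedback is tried first.
        for node in range(self.n_nodes):
            if counts[node] == 0:
                return node
        t = len(self.history)

        def index(node):
            bonus = self.c * math.sqrt(math.log(t) / counts[node])
            return sums[node] / counts[node] + bonus

        return max(range(self.n_nodes), key=index)

    def update(self, node, reward):
        # Called only once the node has finished the task and reported back.
        self.history.append((node, reward))


# Exploration switched off (c=0) to make the forgetting behaviour visible.
bandit = SlidingWindowUCB(n_nodes=2, window=4, c=0.0)
bandit.update(0, 1.0)
bandit.update(1, 0.2)
choice_before_change = bandit.select()
for _ in range(3):
    bandit.update(0, 0.0)  # node 0 abruptly becomes slow
choice_after_change = bandit.select()
```

Because old observations fall out of the window, the learner switches nodes shortly after the change instead of averaging over stale feedback.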

Attention-based sequence-to-sequence model for speech recognition: development of state-of-the-art system on LibriSpeech and its application to non-native English


Title	Attention-based sequence-to-sequence model for speech recognition: development of state-of-the-art system on LibriSpeech and its application to non-native English
Authors	Yan Yin, Ramon Prieto, Bin Wang, Jianwei Zhou, Yiwei Gu, Yang Liu, Hui Lin
Abstract	Recent research has shown that attention-based sequence-to-sequence models such as Listen, Attend, and Spell (LAS) yield comparable results to state-of-the-art ASR systems on various tasks. In this paper, we describe the development of such a system and demonstrate its performance on two tasks: first we achieve a new state-of-the-art word error rate of 3.43% on the test clean subset of LibriSpeech English data; second on non-native English speech, including both read speech and spontaneous speech, we obtain very competitive results compared to a conventional system built with the most updated Kaldi recipe.
Tasks	Speech Recognition
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13088v2
PDF	http://arxiv.org/pdf/1810.13088v2.pdf
PWC	https://paperswithcode.com/paper/attention-based-sequence-to-sequence-model
Repo
Framework
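
For readers unfamiliar with the metric quoted above, word error rate is the word-level edit distance between a hypothesis and its reference transcript, divided by the number of reference words. A minimal sketch follows; the function name and whitespace tokenization are assumptions, and the reference is assumed to be non-empty.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j]: edit distance between the processed reference words and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word dropped
            insertion = curr[j - 1] + 1  # extra hypothesis word
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


one_substitution = word_error_rate("the cat sat", "the cat sit")
two_deletions = word_error_rate("a b c d", "a b")
```

A corpus-level figure is normally obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.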

Vulnerability of Deep Learning


Title	Vulnerability of Deep Learning
Authors	Richard Kenway
Abstract	The Renormalisation Group (RG) provides a framework in which it is possible to assess whether a deep-learning network is sensitive to small changes in the input data and hence prone to error, or susceptible to adversarial attack. Distinct classification outputs are associated with different RG fixed points and sensitivity to small changes in the input data is due to the presence of relevant operators at a fixed point. A numerical scheme, based on Monte Carlo RG ideas, is proposed for identifying the existence of relevant operators and the corresponding directions of greatest sensitivity in the input data. Thus, a trained deep-learning network may be tested for its robustness and, if it is vulnerable to attack, dangerous perturbations of the input data identified.
Tasks	Adversarial Attack
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06111v1
PDF	http://arxiv.org/pdf/1803.06111v1.pdf
PWC	https://paperswithcode.com/paper/vulnerability-of-deep-learning
Repo
Framework

Online Influence Maximization with Local Observations


Title	Online Influence Maximization with Local Observations
Authors	Julia Olkhovskaya, Gergely Neu, Gábor Lugosi
Abstract	We consider an online influence maximization problem in which a decision maker selects a node among a large number of possibilities and places a piece of information at the node. The node transmits the information to some others that are in the same connected component in a random graph. The goal of the decision maker is to reach as many nodes as possible, with the added complication that feedback is only available about the degree of the selected node. Our main result shows that such local observations can be sufficient for maximizing global influence in two broadly studied families of random graph models: stochastic block models and Chung–Lu models. With this insight, we propose a bandit algorithm that aims at maximizing local (and thus global) influence, and provide its theoretical analysis in both the subcritical and supercritical regimes of both considered models. Notably, our performance guarantees show no explicit dependence on the total number of nodes in the network, making our approach well-suited for large-scale applications.
Tasks
Published	2018-05-28
URL	http://arxiv.org/abs/1805.11022v1
PDF	http://arxiv.org/pdf/1805.11022v1.pdf
PWC	https://paperswithcode.com/paper/online-influence-maximization-with-local
Repo
Framework

Neural Lattice Language Models


Title	Neural Lattice Language Models
Authors	Jacob Buckman, Graham Neubig
Abstract	In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions - including polysemy and existence of multi-word lexical items - into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
Tasks	Language Modelling
Published	2018-03-13
URL	http://arxiv.org/abs/1803.05071v1
PDF	http://arxiv.org/pdf/1803.05071v1.pdf
PWC	https://paperswithcode.com/paper/neural-lattice-language-models
Repo
Framework
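
Marginalizing across a lattice of segmentations can be done with a forward-style dynamic program. The sketch below assumes context-independent token probabilities from a lookup table purely for illustration; in the paper the probabilities come from a neural model, and the names `lattice_probability` and `token_prob` are hypothetical.

```python
def lattice_probability(units, token_prob, max_len=2):
    """Sum the probability of every segmentation of units into tokens.

    units      : sequence of base units (characters or words)
    token_prob : dict mapping a tuple of units to an assumed probability
    max_len    : longest token, in units, allowed in the lattice
    """
    n = len(units)
    # alpha[k]: total probability of all paths that cover units[:k].
    alpha = [0.0] * (n + 1)
    alpha[0] = 1.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            token = tuple(units[start:end])
            alpha[end] += alpha[start] * token_prob.get(token, 0.0)
    return alpha[n]


# Toy table: "ab" can be read as two tokens or as one multi-unit token.
toy_probs = {("a",): 0.5, ("b",): 0.4, ("a", "b"): 0.1}
total = lattice_probability(["a", "b"], toy_probs)
```

The recursion visits each lattice edge once, so the cost grows linearly with sentence length for a fixed `max_len` even though the number of paths grows exponentially.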

Discovering space - Grounding spatial topology and metric regularity in a naive agent’s sensorimotor experience


Title	Discovering space - Grounding spatial topology and metric regularity in a naive agent’s sensorimotor experience
Authors	Alban Laflaquière, J. Kevin O’Regan, Bruno Gas, Alexander Terekhov
Abstract	In line with the sensorimotor contingency theory, we investigate the problem of the perception of space from a fundamental sensorimotor perspective. Despite its pervasive nature in our perception of the world, the origin of the concept of space remains largely mysterious. For example, in the context of artificial perception, this issue is usually circumvented by having engineers pre-define the spatial structure of the problem the agent has to face. We show here that the structure of space can be autonomously discovered by a naive agent in the form of sensorimotor regularities that correspond to so-called compensable sensory experiences: these are experiences that can be generated either by the agent or its environment. By detecting such compensable experiences, the agent can infer the topological and metric structure of the external space in which its body is moving. We propose a theoretical description of the nature of these regularities and illustrate the approach on a simulated robotic arm that is equipped with an eye-like sensor and interacts with an object. Finally, we show how these regularities can be used to build an internal representation of the sensor’s external spatial configuration.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02739v2
PDF	http://arxiv.org/pdf/1806.02739v2.pdf
PWC	https://paperswithcode.com/paper/discovering-space-grounding-spatial-topology
Repo
Framework

Siamese Neural Networks with Random Forest for detecting duplicate question pairs


Title	Siamese Neural Networks with Random Forest for detecting duplicate question pairs
Authors	Ameya Godbole, Aman Dalmia, Sunil Kumar Sahu
Abstract	Determining whether two given questions are semantically similar is a fairly challenging task given the different structures and forms that the questions can take. In this paper, we use Gated Recurrent Units (GRU) in combination with other widely used machine learning algorithms like Random Forest, Adaboost and SVM for the similarity prediction task on a dataset released by Quora, consisting of about 400k labeled question pairs. We got the best result by using the Siamese adaptation of a Bidirectional GRU with a Random Forest classifier, which landed us among the top 24% in the competition Quora Question Pairs hosted on Kaggle.
Tasks
Published	2018-01-22
URL	http://arxiv.org/abs/1801.07288v3
PDF	http://arxiv.org/pdf/1801.07288v3.pdf
PWC	https://paperswithcode.com/paper/siamese-neural-networks-with-random-forest
Repo
Framework

Practical optimal registration of terrestrial LiDAR scan pairs


Title	Practical optimal registration of terrestrial LiDAR scan pairs
Authors	Zhipeng Cai, Tat-Jun Chin, Alvaro Parra Bustos, Konrad Schindler
Abstract	Point cloud registration is a fundamental problem in 3D scanning. In this paper, we address the frequent special case of registering terrestrial LiDAR scans (or, more generally, levelled point clouds). Many current solutions still rely on the Iterative Closest Point (ICP) method or other heuristic procedures, which require good initializations to succeed and/or provide no guarantees of success. On the other hand, exact or optimal registration algorithms can compute the best possible solution without requiring initializations; however, they are currently too slow to be practical in realistic applications. Existing optimal approaches ignore the fact that in routine use the relative rotations between scans are constrained to the azimuth, via the built-in level compensation in LiDAR scanners. We propose a novel, optimal and computationally efficient registration method for this 4DOF scenario. Our approach operates on candidate 3D keypoint correspondences, and contains two main steps: (1) a deterministic selection scheme that significantly reduces the candidate correspondence set in a way that is guaranteed to preserve the optimal solution; and (2) a fast branch-and-bound (BnB) algorithm with a novel polynomial-time subroutine for 1D rotation search, that quickly finds the optimal alignment for the reduced set. We demonstrate the practicality of our method on realistic point clouds from multiple LiDAR surveys.
Tasks	Point Cloud Registration
Published	2018-11-25
URL	http://arxiv.org/abs/1811.09962v3
PDF	http://arxiv.org/pdf/1811.09962v3.pdf
PWC	https://paperswithcode.com/paper/practical-optimal-registration-of-terrestrial
Repo
Framework
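
The 4DOF setting in the abstract restricts the rigid transform to a rotation about the vertical axis plus a 3D translation. The sketch below applies such a transform and scores it by counting correspondences that agree with it; the inlier-count objective, the threshold and the function names are assumptions, and no search over yaw or translation is attempted here.

```python
import math


def apply_4dof(point, yaw, translation):
    """Rotate a 3D point about the z (vertical) axis, then translate it."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def count_inliers(correspondences, yaw, translation, threshold=0.1):
    """Count (source, target) keypoint pairs consistent with the transform."""
    inliers = 0
    for src, dst in correspondences:
        moved = apply_4dof(src, yaw, translation)
        if math.dist(moved, dst) <= threshold:
            inliers += 1
    return inliers


# Hypothetical candidate matches: the first is correct, the second is an outlier.
matches = [
    ((1.0, 0.0, 0.0), (0.0, 1.0, 1.0)),
    ((0.0, 2.0, 0.0), (5.0, 5.0, 5.0)),
]
moved_point = apply_4dof((1.0, 0.0, 0.0), math.pi / 2, (0.0, 0.0, 1.0))
score = count_inliers(matches, math.pi / 2, (0.0, 0.0, 1.0))
```

Fixing roll and pitch leaves a single rotation angle to search, which is why a 1D rotation subroutine is enough in this scenario.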

Predicting Cyber Events by Leveraging Hacker Sentiment


Title	Predicting Cyber Events by Leveraging Hacker Sentiment
Authors	Ashok Deb, Kristina Lerman, Emilio Ferrara
Abstract	Recent high-profile cyber attacks exemplify why organizations need better cyber defenses. Cyber threats are hard to accurately predict because attackers usually try to mask their traces. However, they often discuss exploits and techniques on hacking forums. The community behavior of the hackers may provide insights into groups’ collective malicious activity. We propose a novel approach to predict cyber events using sentiment analysis. We test our approach using cyber attack data from 2 major business organizations. We consider 3 types of events: malicious software installation, malicious destination visits, and malicious emails that surpassed the target organizations’ defenses. We construct predictive signals by applying sentiment analysis on hacker forum posts to better understand hacker behavior. We analyze over 400K posts generated between January 2016 and January 2018 on over 100 hacking forums both on surface and Dark Web. We find that some forums have significantly more predictive power than others. Sentiment-based models that leverage specific forums can outperform state-of-the-art deep learning and time-series models on forecasting cyber attacks weeks ahead of the events.
Tasks	Sentiment Analysis, Time Series
Published	2018-04-14
URL	http://arxiv.org/abs/1804.05276v1
PDF	http://arxiv.org/pdf/1804.05276v1.pdf
PWC	https://paperswithcode.com/paper/predicting-cyber-events-by-leveraging-hacker
Repo
Framework

NIMFA: A Python Library for Nonnegative Matrix Factorization


Title	NIMFA: A Python Library for Nonnegative Matrix Factorization
Authors	Marinka Zitnik, Blaz Zupan
Abstract	NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports both dense and sparse matrix representation. NIMFA’s component-based implementation and hierarchical design should help the users to employ already implemented techniques or design and code new strategies for matrix factorization tasks.
Tasks
Published	2018-08-06
URL	http://arxiv.org/abs/1808.01743v1
PDF	http://arxiv.org/pdf/1808.01743v1.pdf
PWC	https://paperswithcode.com/paper/nimfa-a-python-library-for-nonnegative-matrix
Repo
Framework
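
To make the task concrete, here is a dependency-free sketch of nonnegative matrix factorization using the classic multiplicative update rules for the squared Frobenius error. It deliberately does not use or imitate the NIMFA interface; every name below is hypothetical and the tiny input matrix is made up.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix v (n x m) into w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(v, w, h)
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
    return w, h, initial_error, frobenius_error(v, w, h)


data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]  # toy rank-1 matrix
w, h, err_before, err_after = nmf(data, rank=2)
```

The updates only multiply by nonnegative ratios, so both factors stay nonnegative without any explicit projection step.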

You Look Twice: GaterNet for Dynamic Filter Selection in CNNs


Title	You Look Twice: GaterNet for Dynamic Filter Selection in CNNs
Authors	Zhourong Chen, Yang Li, Samy Bengio, Si Si
Abstract	The concept of conditional computation for deep nets has been proposed previously to improve model performance by selectively using only parts of the model conditioned on the sample it is processing. In this paper, we investigate input-dependent dynamic filter selection in deep convolutional neural networks (CNNs). The problem is interesting because the idea of forcing different parts of the model to learn from different types of samples may help us acquire better filters in CNNs, improve the model generalization performance and potentially increase the interpretability of model behavior. We propose a novel yet simple framework called GaterNet, which involves a backbone and a gater network. The backbone network is a regular CNN that performs the major computation needed for making a prediction, while a global gater network is introduced to generate binary gates for selectively activating filters in the backbone network based on each input. Extensive experiments on CIFAR and ImageNet datasets show that our models consistently outperform the original models by a large margin. On CIFAR-10, our model also improves upon state-of-the-art results.
Tasks
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11205v2
PDF	http://arxiv.org/pdf/1811.11205v2.pdf
PWC	https://paperswithcode.com/paper/gaternet-dynamic-filter-selection-in
Repo
Framework
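
The gating step described in the abstract can be pictured as multiplying each filter's output by a 0/1 value chosen per input. In the sketch below the gate scores are given directly and thresholded at zero; in the paper they would be produced by the gater network, and the threshold, the function names and the toy feature maps are all assumptions.

```python
def binary_gates(scores):
    """Turn real-valued gate scores into 0/1 gates (assumed threshold at zero)."""
    return [1 if s > 0 else 0 for s in scores]


def gate_feature_maps(feature_maps, gates):
    """Zero out every feature map whose gate is 0; keep the others unchanged."""
    return [
        [[value * gate for value in row] for row in fmap]
        for fmap, gate in zip(feature_maps, gates)
    ]


# Three hypothetical 2x2 feature maps, one per backbone filter.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
    [[9.0, 1.0], [2.0, 3.0]],
]
gates = binary_gates([0.3, -0.2, 1.0])
gated = gate_feature_maps(maps, gates)
```

A hard threshold has zero gradient almost everywhere, so training gates like these end to end needs some workaround that this sketch leaves out.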