October 16, 2019


Paper Group ANR 1136


POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors


Title	POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors
Authors	Marcel Sheeny, Andrew Wallace, Mehryar Emambakhsh, Sen Wang, Barry Connor
Abstract	For vehicle autonomy, driver assistance and situational awareness, it is necessary to operate at day and night, and in all weather conditions. In particular, long wave infrared (LWIR) sensors that receive predominantly emitted radiation have the capability to operate at night as well as during the day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data from a mobile vehicle, to compare and contrast four different convolutional neural network (CNN) configurations to detect other vehicles in video sequences. We evaluate two distinct and promising approaches, two-stage detection (Faster-RCNN) and one-stage detection (SSD), in four different configurations. We also employ two different image decompositions: the first based on the polarisation ellipse and the second on the Stokes parameters themselves. To evaluate our approach, the experimental trials were quantified by mean average precision (mAP) and processing time, showing a clear trade-off between the two factors. For example, the best mAP result of 80.94% was achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast, MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps.
Tasks
Published	2018-04-07
URL	http://arxiv.org/abs/1804.02576v1
PDF	http://arxiv.org/pdf/1804.02576v1.pdf
PWC	https://paperswithcode.com/paper/pol-lwir-vehicle-detection-convolutional
Repo
Framework

Thompson Sampling for Combinatorial Semi-Bandits


Title	Thompson Sampling for Combinatorial Semi-Bandits
Authors	Siwei Wang, Wei Chen
Abstract	We study the application of the Thompson sampling (TS) methodology to the stochastic combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm for the general CMAB, and obtain the first distribution-dependent regret bound of $O(mK_{\max}\log T / \Delta_{\min})$, where $m$ is the number of arms, $K_{\max}$ is the size of the largest super arm, $T$ is the time horizon, and $\Delta_{\min}$ is the minimum gap between the expected reward of the optimal solution and any non-optimal solution. We also show that one cannot directly replace the exact offline oracle with an approximation oracle in the TS algorithm, even for the classical MAB problem. We then extend the analysis to two special cases: the linear reward case and the matroid bandit case. When the reward function is linear, the regret of the TS algorithm achieves a better bound of $O(m\sqrt{K_{\max}}\log T / \Delta_{\min})$. For the matroid bandit, we can remove the independence assumption across arms and achieve a regret upper bound that matches the lower bound for the matroid case. Finally, we use experiments to compare the regret of TS with that of existing algorithms such as CUCB and ESCB.
Tasks
Published	2018-03-13
URL	https://arxiv.org/abs/1803.04623v3
PDF	https://arxiv.org/pdf/1803.04623v3.pdf
PWC	https://paperswithcode.com/paper/thompson-sampling-for-combinatorial-semi
Repo
Framework
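
As a rough illustration of the setting described in the abstract above, the following sketch runs Thompson sampling on a combinatorial semi-bandit with a linear reward and an exact top-k oracle. The Bernoulli arms, the Beta(1, 1) priors, the top-k super-arm structure, and every name in the code are assumptions made for illustration; none of them is taken from the paper.

```python
import random


def combinatorial_thompson_sampling(means, k, horizon, seed=0):
    """Play `horizon` rounds; each round pulls a super arm of k base arms.

    `means` holds the (hypothetical) Bernoulli mean of each base arm.
    Returns the cumulative expected regret and the Beta posterior counts.
    """
    rng = random.Random(seed)
    m = len(means)
    successes = [1] * m  # Beta(1, 1) prior: one pseudo-success per arm
    failures = [1] * m   # and one pseudo-failure per arm
    best_value = sum(sorted(means, reverse=True)[:k])
    regret = 0.0

    for _ in range(horizon):
        # 1. Draw one sample per base arm from its Beta posterior.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(m)]
        # 2. Exact offline oracle for a linear reward: take the k largest samples.
        super_arm = sorted(range(m), key=lambda i: samples[i], reverse=True)[:k]
        # 3. Semi-bandit feedback: observe every pulled base arm individually.
        for i in super_arm:
            if rng.random() < means[i]:
                successes[i] += 1
            else:
                failures[i] += 1
        regret += best_value - sum(means[i] for i in super_arm)

    return regret, successes, failures


if __name__ == "__main__":
    total_regret, _, _ = combinatorial_thompson_sampling(
        [0.9, 0.8, 0.2, 0.1], k=2, horizon=1000
    )
    print(f"cumulative expected regret: {total_regret:.2f}")
```

The oracle in step 2 is exact, which matches the abstract's caveat that an approximation oracle cannot simply be swapped in.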

Forex trading and Twitter: Spam, bots, and reputation manipulation


Title	Forex trading and Twitter: Spam, bots, and reputation manipulation
Authors	Igor Mozetič, Peter Gabrovšek, Petra Kralj Novak
Abstract	Currency trading (Forex) is the largest world market in terms of volume. We analyze trading and tweeting about the EUR-USD currency pair over a period of three years. First, a large number of tweets are manually labeled, and a Twitter stance classification model is constructed. The model then classifies all the tweets by the trading stance signal: buy, hold, or sell (EUR vs. USD). The Twitter stance is compared to the actual currency rates by applying the event study methodology, well-known in financial economics. It turns out that there are large differences in Twitter stance distribution and potential trading returns between the four groups of Twitter users: trading robots, spammers, trading companies, and individual traders. Additionally, we observe attempts at reputation manipulation by post festum removal of tweets with poor predictions, and deleting/reposting of identical tweets to increase the visibility without tainting one’s Twitter timeline.
Tasks
Published	2018-04-06
URL	http://arxiv.org/abs/1804.02233v2
PDF	http://arxiv.org/pdf/1804.02233v2.pdf
PWC	https://paperswithcode.com/paper/forex-trading-and-twitter-spam-bots-and
Repo
Framework

Transferable Pedestrian Motion Prediction Models at Intersections


Title	Transferable Pedestrian Motion Prediction Models at Intersections
Authors	Macheng Shen, Golnaz Habibi, Jonathan P. How
Abstract	One desirable capability of autonomous cars is to accurately predict pedestrian motion near intersections for safe and efficient trajectory planning. We are interested in developing transfer learning algorithms that can be trained on the pedestrian trajectories collected at one intersection and yet still provide accurate predictions of the trajectories at another, previously unseen intersection. We first discuss feature selection for transferable pedestrian motion models in general. Following this discussion, we develop a transferable pedestrian motion prediction algorithm based on Inverse Reinforcement Learning (IRL) that infers pedestrian intentions and predicts future trajectories based on the observed trajectory. We evaluate our algorithm on a dataset collected at two intersections, training at one intersection and testing at the other. As a baseline, we use the accuracy of augmented semi-nonnegative sparse coding (ASNSC), trained and tested at the same intersection. The results show that the proposed algorithm improves the baseline accuracy by 40% in the non-transfer task, and 16% in the transfer task.
Tasks	Feature Selection, motion prediction, Transfer Learning
Published	2018-03-15
URL	https://arxiv.org/abs/1804.00495v2
PDF	https://arxiv.org/pdf/1804.00495v2.pdf
PWC	https://paperswithcode.com/paper/transferable-pedestrian-motion-prediction
Repo
Framework

A Big Data Architecture for Log Data Storage and Analysis


Title	A Big Data Architecture for Log Data Storage and Analysis
Authors	Swapneel Mehta, Prasanth Kothuri, Daniel Lanza Garcia
Abstract	We propose an architecture for analysing database connection logs across different instances of databases within an intranet comprising over 10,000 users and associated devices. Our system uses Flume agents to send notifications to a Hadoop Distributed File System for long-term storage and ElasticSearch and Kibana for short-term visualisation, effectively creating a data lake for the extraction of log data. We adopt machine learning models with an ensemble of approaches to filter and process the indicators within the data and aim to predict anomalies or outliers using feature vectors built from this log data.
Tasks
Published	2018-12-01
URL	http://arxiv.org/abs/1812.00111v1
PDF	http://arxiv.org/pdf/1812.00111v1.pdf
PWC	https://paperswithcode.com/paper/a-big-data-architecture-for-log-data-storage
Repo
Framework

Accelerating Empowerment Computation with UCT Tree Search


Title	Accelerating Empowerment Computation with UCT Tree Search
Authors	Christoph Salge, Christian Guckelsberger, Rodrigo Canaan, Tobias Mahlmann
Abstract	Models of intrinsic motivation present an important means to produce sensible behaviour in the absence of extrinsic rewards. Applications in video games are varied, and range from intrinsically motivated general game-playing agents to non-player characters such as companions and enemies. The information-theoretic quantity of Empowerment is a particularly promising candidate motivation to produce believable, generic and robust behaviour. However, while it can be used in the absence of external reward functions that would need to be crafted and learned, empowerment is computationally expensive. In this paper, we propose a modified UCT tree search method to mitigate empowerment’s computational complexity in discrete and deterministic scenarios. We demonstrate how to modify a Monte-Carlo Search Tree with UCT to realise empowerment maximisation, and discuss three additional modifications that facilitate better sampling. We evaluate the approach both quantitatively, by analysing how close our approach gets to the baseline of exhaustive empowerment computation with varying amounts of computational resources, and qualitatively, by analysing the resulting behaviour in a Minecraft-like scenario.
Tasks
Published	2018-03-27
URL	http://arxiv.org/abs/1803.09866v1
PDF	http://arxiv.org/pdf/1803.09866v1.pdf
PWC	https://paperswithcode.com/paper/accelerating-empowerment-computation-with-uct
Repo
Framework
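
To make concrete why the exhaustive baseline mentioned in the abstract above is expensive, here is a minimal sketch of n-step empowerment in a deterministic, discrete world. It relies on the standard observation that, with deterministic dynamics, empowerment reduces to the base-2 logarithm of the number of distinct states reachable by n-step action sequences. The corridor world and all function names are hypothetical, and this is the brute-force computation, not the paper's UCT-based method.

```python
import math
from itertools import product


def n_step_empowerment(state, actions, step, n):
    """Empowerment in bits for deterministic dynamics.

    Enumerates all len(actions) ** n action sequences, which is the
    exponential cost that a sampling-based search tries to avoid.
    """
    reachable = set()
    for sequence in product(actions, repeat=n):
        current = state
        for action in sequence:
            current = step(current, action)
        reachable.add(current)
    return math.log2(len(reachable))


def corridor_step(position, action, length=5):
    """Hypothetical 1-D corridor: move by `action`, clamped at both walls."""
    return min(max(position + action, 0), length - 1)


if __name__ == "__main__":
    moves = (-1, 0, 1)  # step left, stay, step right
    for start in (0, 2):
        bits = n_step_empowerment(start, moves, corridor_step, n=2)
        print(f"start={start}: {bits:.3f} bits")
```

An agent in the middle of the corridor can reach more distinct states than one next to a wall, so it has higher empowerment.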

FineFool: Fine Object Contour Attack via Attention


Title	FineFool: Fine Object Contour Attack via Attention
Authors	Jinyin Chen, Haibin Zheng, Hui Xiong, Mengmeng Su
Abstract	Machine learning models have been shown to be vulnerable to adversarial attacks launched with adversarial examples, which are carefully crafted by an attacker to defeat classifiers. Deep learning models cannot escape such attacks either. Most adversarial attack methods focus on success rate or perturbation size, while we are more interested in the relationship between the adversarial perturbation and the image itself. In this paper, we put forward a novel contour-based adversarial attack, named FineFool. FineFool not only achieves better attack performance than other state-of-the-art white-box attacks, with a higher attack success rate and smaller perturbations, but is also capable of visualizing the optimal adversarial perturbation via attention on the object contour. To the best of our knowledge, FineFool is the first attack to combine the critical features of the original clean image with the optimal perturbations in a visible manner. Inspired by the correlations between adversarial perturbations and object contours, FineFool produces slighter perturbations by focusing on object contour features; these perturbations are more imperceptible and more difficult to defend against, especially for network add-on defense methods, which face a trade-off between perturbation filtering and contour feature loss. Extensive experiments comparing FineFool with existing state-of-the-art attacks show that it is capable of efficient attacks against defensive deep models.
Tasks	Adversarial Attack
Published	2018-12-01
URL	http://arxiv.org/abs/1812.01713v1
PDF	http://arxiv.org/pdf/1812.01713v1.pdf
PWC	https://paperswithcode.com/paper/finefool-fine-object-contour-attack-via
Repo
Framework

Catastrophic Importance of Catastrophic Forgetting


Title	Catastrophic Importance of Catastrophic Forgetting
Authors	Albert Ierusalem
Abstract	This paper describes some of the possibilities of artificial neural networks that open up after solving the problem of catastrophic forgetting. A simple model and reinforcement learning applications of existing methods are also proposed.
Tasks
Published	2018-08-20
URL	http://arxiv.org/abs/1808.07049v1
PDF	http://arxiv.org/pdf/1808.07049v1.pdf
PWC	https://paperswithcode.com/paper/catastrophic-importance-of-catastrophic
Repo
Framework

Multi-encoder multi-resolution framework for end-to-end speech recognition


Title	Multi-encoder multi-resolution framework for end-to-end speech recognition
Authors	Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky
Abstract	Attention-based methods and Connectionist Temporal Classification (CTC) networks have been promising research directions for end-to-end Automatic Speech Recognition (ASR). The joint CTC/Attention model has achieved great success by utilizing both architectures during multi-task training and joint decoding. In this work, we present a novel Multi-Encoder Multi-Resolution (MEMR) framework based on the joint CTC/Attention model. Two heterogeneous encoders with different architectures, temporal resolutions and separate CTC networks work in parallel to extract complementary acoustic information. A hierarchical attention mechanism is then used to combine the encoder-level information. To demonstrate the effectiveness of the proposed model, experiments are conducted on Wall Street Journal (WSJ) and CHiME-4, resulting in a relative Word Error Rate (WER) reduction of 18.0-32.1%. Moreover, the proposed MEMR model achieves 3.6% WER on the WSJ eval92 test set, which is the best WER reported for an end-to-end system on this benchmark.
Tasks	End-To-End Speech Recognition, Speech Recognition
Published	2018-11-12
URL	http://arxiv.org/abs/1811.04897v1
PDF	http://arxiv.org/pdf/1811.04897v1.pdf
PWC	https://paperswithcode.com/paper/multi-encoder-multi-resolution-framework-for
Repo
Framework
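
The abstract above reports results as Word Error Rate and as relative WER reduction. The sketch below shows how those two quantities are conventionally computed, using the standard word-level edit distance. The function names and the example numbers are illustrative assumptions and do not come from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,                           # deletion
                    curr[j - 1] + 1,                       # insertion
                    prev[j - 1] + (ref_word != hyp_word),  # substitution or match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction by which `new_wer` improves on `baseline_wer`."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    print(word_error_rate("a b c d", "a x c"))  # one substitution, one deletion
    print(relative_reduction(10.0, 8.0))        # hypothetical WERs in percent
```

A relative reduction is measured against the baseline's own error rate, so it is not the same as the absolute difference in percentage points.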

Improving End-to-end Speech Recognition with Pronunciation-assisted Sub-word Modeling


Title	Improving End-to-end Speech Recognition with Pronunciation-assisted Sub-word Modeling
Authors	Hainan Xu, Shuoyang Ding, Shinji Watanabe
Abstract	Most end-to-end speech recognition systems model text directly as a sequence of characters or sub-words. Current approaches to sub-word extraction only consider character sequence frequencies, which at times produce inferior sub-word segmentation that might lead to erroneous speech recognition output. We propose pronunciation-assisted sub-word modeling (PASM), a sub-word extraction method that leverages the pronunciation information of a word. Experiments show that the proposed method can greatly improve upon the character-based baseline, and also outperform commonly used byte-pair encoding methods.
Tasks	End-To-End Speech Recognition, Speech Recognition
Published	2018-11-10
URL	http://arxiv.org/abs/1811.04284v2
PDF	http://arxiv.org/pdf/1811.04284v2.pdf
PWC	https://paperswithcode.com/paper/improving-end-to-end-speech-recognition-with
Repo
Framework

Sparse Adversarial Perturbations for Videos


Title	Sparse Adversarial Perturbations for Videos
Authors	Xingxing Wei, Jun Zhu, Hang Su
Abstract	Although adversarial samples of deep neural networks (DNNs) have been intensively studied on static images, their extension to videos has not yet been explored. Compared with images, attacking a video needs to consider not only spatial cues but also temporal cues. Moreover, to improve the imperceptibility as well as reduce the computation cost, perturbations should be added to as few frames as possible, i.e., adversarial perturbations should be temporally sparse. This further motivates the propagation of perturbations, meaning that perturbations added to the current frame can transfer to the following frames via their temporal interactions. Thus, no (or few) extra perturbations are needed for these frames to be misclassified. To this end, we propose an $\ell_{2,1}$-norm based optimization algorithm to compute the sparse adversarial perturbations for videos. We choose action recognition as the targeted task, and networks with a CNN+RNN architecture as threat models to verify our method. Thanks to the propagation, we can compute perturbations on a shortened version of a video, and then adapt them to the long version of the video to fool DNNs. Experimental results on the UCF101 dataset demonstrate that even when only one frame in a video is perturbed, the fooling rate can still reach 59.7%.
Tasks	Temporal Action Localization
Published	2018-03-07
URL	http://arxiv.org/abs/1803.02536v1
PDF	http://arxiv.org/pdf/1803.02536v1.pdf
PWC	https://paperswithcode.com/paper/sparse-adversarial-perturbations-for-videos
Repo
Framework
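
The temporal sparsity discussed in the abstract above is tied to a mixed norm. As a minimal sketch, assuming the usual grouping in which each video frame is one group, the code below computes such a norm as the sum over frames of each frame's Euclidean norm, and lists which frames carry any perturbation. The flat-list frame representation and both function names are assumptions for illustration, not the paper's implementation.

```python
import math


def l21_norm(perturbation):
    """Sum over frames of the l2 norm of each frame's perturbation.

    `perturbation` is a list of frames; each frame is a flat list of floats.
    """
    return sum(math.sqrt(sum(v * v for v in frame)) for frame in perturbation)


def perturbed_frames(perturbation, tol=0.0):
    """Indices of frames whose perturbation has an entry larger than `tol`."""
    return [
        t
        for t, frame in enumerate(perturbation)
        if any(abs(v) > tol for v in frame)
    ]


if __name__ == "__main__":
    sparse = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]  # only frame 0 is touched
    dense = [[3.0, 4.0], [3.0, 4.0], [3.0, 4.0]]   # every frame is touched
    print(l21_norm(sparse), perturbed_frames(sparse))
    print(l21_norm(dense), perturbed_frames(dense))
```

Because the outer sum behaves like an l1 penalty over frames, minimizing it tends to drive whole frames to zero, which is the kind of temporal sparsity the abstract describes.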

Learning Two Layer Rectified Neural Networks in Polynomial Time


Title	Learning Two Layer Rectified Neural Networks in Polynomial Time
Authors	Ainesh Bakshi, Rajesh Jayaram, David P. Woodruff
Abstract	Consider the following fundamental learning problem: given input examples $x \in \mathbb{R}^d$ and their vector-valued labels, as defined by an underlying generative neural network, recover the weight matrices of this network. We consider two-layer networks, mapping $\mathbb{R}^d$ to $\mathbb{R}^m$, with $k$ non-linear activation units $f(\cdot)$, where $f(x) = \max\{x, 0\}$ is the ReLU. Such a network is specified by two weight matrices, $\mathbf{U}^* \in \mathbb{R}^{m \times k}, \mathbf{V}^* \in \mathbb{R}^{k \times d}$, such that the label of an example $x \in \mathbb{R}^{d}$ is given by $\mathbf{U}^* f(\mathbf{V}^* x)$, where $f(\cdot)$ is applied coordinate-wise. Given $n$ samples as a matrix $\mathbf{X} \in \mathbb{R}^{d \times n}$ and the (possibly noisy) labels $\mathbf{U}^* f(\mathbf{V}^* \mathbf{X}) + \mathbf{E}$ of the network on these samples, where $\mathbf{E}$ is a noise matrix, our goal is to recover the weight matrices $\mathbf{U}^*$ and $\mathbf{V}^*$. In this work, we develop algorithms and hardness results under varying assumptions on the input and noise. Although the problem is NP-hard even for $k=2$, by assuming Gaussian marginals over the input $\mathbf{X}$ we are able to develop polynomial time algorithms for the approximate recovery of $\mathbf{U}^*$ and $\mathbf{V}^*$. Perhaps surprisingly, in the noiseless case our algorithms recover $\mathbf{U}^*, \mathbf{V}^*$ exactly, i.e., with no error. To the best of our knowledge, this is the first algorithm to accomplish exact recovery. For the noisy case, we give the first polynomial time algorithm that approximately recovers the weights in the presence of mean-zero noise $\mathbf{E}$. Our algorithms generalize to a larger class of rectified activation functions, $f(x) = 0$ when $x\leq 0$, and $f(x) > 0$ otherwise.
Tasks
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01885v1
PDF	http://arxiv.org/pdf/1811.01885v1.pdf
PWC	https://paperswithcode.com/paper/learning-two-layer-rectified-neural-networks
Repo
Framework
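
The generative model in the abstract above, labels of the form $\mathbf{U}^* f(\mathbf{V}^* \mathbf{X}) + \mathbf{E}$, can be sketched in a few lines. The code below only produces labels from given weights (the forward direction); it does not attempt the recovery algorithm. The list-of-lists matrix layout, the Gaussian choice for the mean-zero noise, and all function names are assumptions for illustration.

```python
import random


def relu(x):
    """f(x) = max{x, 0}, applied to a single coordinate."""
    return max(x, 0.0)


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def two_layer_labels(U, V, X, noise_std=0.0, seed=0):
    """Return U f(V X) + E for U (m x k), V (k x d), X (d x n).

    E has independent Gaussian entries with mean zero and standard
    deviation `noise_std` (an assumed noise model).
    """
    rng = random.Random(seed)
    hidden = [[relu(v) for v in row] for row in matmul(V, X)]  # k x n
    clean = matmul(U, hidden)                                  # m x n
    return [[y + rng.gauss(0.0, noise_std) for y in row] for row in clean]


if __name__ == "__main__":
    U = [[1.0, -1.0]]              # m = 1, k = 2
    V = [[1.0, 0.0], [0.0, 1.0]]   # k = 2, d = 2
    X = [[1.0, -2.0], [3.0, 4.0]]  # d = 2, n = 2 (one sample per column)
    print(two_layer_labels(U, V, X))
```

Setting `noise_std=0.0` gives the noiseless case, in which the abstract states that exact recovery is possible.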

Semi-Supervised Learning on Graphs Based on Local Label Distributions


Title	Semi-Supervised Learning on Graphs Based on Local Label Distributions
Authors	Evgeniy Faerman, Felix Borutta, Julian Busch, Matthias Schubert
Abstract	Most approaches that tackle the problem of node classification consider nodes to be similar if they have shared neighbors or are close to each other in the graph. Recent methods for attributed graphs additionally take attributes of neighboring nodes into account. We argue that the class labels of the neighbors bear important information and that considering them helps to improve classification quality. Two nodes which are similar based on class labels in their neighborhood do not need to be close-by in the graph and may even belong to different connected components. In this work, we propose a novel approach for semi-supervised node classification. Precisely, we propose a new node embedding which is based on the class labels in the local neighborhood of a node. We show that this is a different setting from attribute-based embeddings and thus, we propose a new method to learn label-based node embeddings which can mirror a variety of relations between the class labels of neighboring nodes. Our experimental evaluation demonstrates that our new methods can significantly improve the prediction quality on real world data sets.
Tasks	Node Classification
Published	2018-02-15
URL	http://arxiv.org/abs/1802.05563v2
PDF	http://arxiv.org/pdf/1802.05563v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-learning-on-graphs-based-on
Repo
Framework

Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function


Title	Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function
Authors	Wojciech Tarnowski, Piotr Warchoł, Stanisław Jastrzębski, Jacek Tabor, Maciej A. Nowak
Abstract	We demonstrate that in residual neural networks (ResNets) dynamical isometry is achievable irrespective of the activation function used. We do that by deriving, with the help of Free Probability and Random Matrix Theories, a universal formula for the spectral density of the input-output Jacobian at initialization, in the large network width and depth limit. The resulting singular value spectrum depends on a single parameter, which we calculate for a variety of popular activation functions, by analyzing the signal propagation in the artificial neural network. We corroborate our results with numerical simulations of both random matrices and ResNets applied to the CIFAR-10 classification problem. Moreover, we study the consequence of this universal behavior for the initial and late phases of the learning processes. We conclude by drawing attention to the simple fact that initialization acts as a confounding factor between the choice of activation function and the rate of learning. We propose that in ResNets this can be resolved based on our results, by ensuring the same level of dynamical isometry at initialization.
Tasks
Published	2018-09-24
URL	http://arxiv.org/abs/1809.08848v3
PDF	http://arxiv.org/pdf/1809.08848v3.pdf
PWC	https://paperswithcode.com/paper/dynamical-isometry-is-achieved-in-residual
Repo
Framework

Efficient Evaluation of the Number of False Alarm Criterion


Title	Efficient Evaluation of the Number of False Alarm Criterion
Authors	Sylvie Le Hégarat-Mascle, Emanuel Aldea, Jennifer Vandoni
Abstract	This paper proposes a method for efficiently computing the significance of a parametric pattern inside a binary image. On the one hand, a-contrario strategies avoid user involvement in tuning detection thresholds, and allow one to account fairly for different pattern sizes. On the other hand, a-contrario criteria become intractable when the pattern complexity in terms of parametrization increases. In this work, we introduce a strategy which relies on the use of a cumulative space of reduced dimensionality, derived from the coupling of a classic (Hough) cumulative space with an integral histogram trick. This space allows us to store partial computations which are required by the a-contrario criterion, and to evaluate the significance with a lower computational cost than by following a straightforward approach. The method is illustrated on synthetic examples on patterns with various parametrizations up to five dimensions. In order to demonstrate how to apply this generic concept in a real scenario, we consider a difficult crack detection task in still images, which has been addressed in the literature with various local and global detection strategies. We model cracks as bounded segments, detected by the proposed a-contrario criterion, which allows us to introduce additional spatial constraints based on their relative alignment. On this application, the proposed strategy yields state-of-the-art results, and underlines its potential for handling complex pattern detection tasks.
Tasks
Published	2018-07-10
URL	http://arxiv.org/abs/1807.03594v1
PDF	http://arxiv.org/pdf/1807.03594v1.pdf
PWC	https://paperswithcode.com/paper/efficient-evaluation-of-the-number-of-false
Repo
Framework