Paper Group ANR 1136
POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors
Title | POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors |
Authors | Marcel Sheeny, Andrew Wallace, Mehryar Emambakhsh, Sen Wang, Barry Connor |
Abstract | For vehicle autonomy, driver assistance and situational awareness, it is necessary to operate at day and night, and in all weather conditions. In particular, long wave infrared (LWIR) sensors that receive predominantly emitted radiation have the capability to operate at night as well as during the day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data from a mobile vehicle, to compare and contrast four different convolutional neural network (CNN) configurations to detect other vehicles in video sequences. We evaluate two distinct and promising approaches, two-stage detection (Faster-RCNN) and one-stage detection (SSD), in four different configurations. We also employ two different image decompositions: the first based on the polarisation ellipse and the second on the Stokes parameters themselves. To evaluate our approach, the experimental trials were quantified by mean average precision (mAP) and processing time, showing a clear trade-off between the two factors. For example, the best mAP result of 80.94% was achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast, MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02576v1 |
http://arxiv.org/pdf/1804.02576v1.pdf | |
PWC | https://paperswithcode.com/paper/pol-lwir-vehicle-detection-convolutional |
Repo | |
Framework | |
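The abstract above names two image decompositions, one based on the Stokes parameters and one on the polarisation ellipse. A minimal per-pixel sketch of how such quantities are commonly computed follows. It assumes intensities measured behind linear polarisers at 0, 45, 90 and 135 degrees, which is a common convention and not something the abstract confirms; the function names are hypothetical.

```python
import math

def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters for one pixel (assumed 4-angle measurement)."""
    s0 = i0 + i90      # total intensity
    s1 = i0 - i90      # horizontal versus vertical preference
    s2 = i45 - i135    # +45 versus -45 degree preference
    return s0, s1, s2

def ellipse_parameters(s0, s1, s2):
    """Degree and angle of linear polarisation derived from Stokes values."""
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aop = 0.5 * math.atan2(s2, s1)  # radians
    return dolp, aop

# Fully horizontally polarised light: all energy passes the 0 degree polariser.
s0, s1, s2 = stokes_from_intensities(1.0, 0.5, 0.0, 0.5)
dolp, aop = ellipse_parameters(s0, s1, s2)
```

Either triple, (s0, s1, s2) or (s0, dolp, aop), could then be stacked as image channels for a detector; which channels the authors actually feed to their networks is not stated in the abstract.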
Thompson Sampling for Combinatorial Semi-Bandits
Title | Thompson Sampling for Combinatorial Semi-Bandits |
Authors | Siwei Wang, Wei Chen |
Abstract | We study the application of the Thompson sampling (TS) methodology to the stochastic combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm for the general CMAB, and obtain the first distribution-dependent regret bound of $O(mK_{\max}\log T / \Delta_{\min})$, where $m$ is the number of arms, $K_{\max}$ is the size of the largest super arm, $T$ is the time horizon, and $\Delta_{\min}$ is the minimum gap between the expected reward of the optimal solution and any non-optimal solution. We also show that one cannot directly replace the exact offline oracle with an approximation oracle in TS algorithm for even the classical MAB problem. Then we expand the analysis to two special cases: the linear reward case and the matroid bandit case. When the reward function is linear, the regret of the TS algorithm achieves a better bound $O(m\sqrt{K_{\max}}\log T / \Delta_{\min})$. For matroid bandit, we could remove the independence assumption across arms and achieve a regret upper bound that matches the lower bound for the matroid case. Finally, we use some experiments to show the comparison between regrets of TS and other existing algorithms like CUCB and ESCB. |
Tasks | |
Published | 2018-03-13 |
URL | https://arxiv.org/abs/1803.04623v3 |
https://arxiv.org/pdf/1803.04623v3.pdf | |
PWC | https://paperswithcode.com/paper/thompson-sampling-for-combinatorial-semi |
Repo | |
Framework | |
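To make the setting in the abstract above concrete, here is a minimal sketch of Thompson sampling with semi-bandit feedback. It assumes Bernoulli arm outcomes, Beta(1, 1) priors, a linear reward, and a "pick any k arms" constraint whose exact offline oracle is a top-k selection; these are illustrative assumptions, and `cts_top_k` is a hypothetical name rather than the authors' code.

```python
import random

def cts_top_k(means, k, horizon, seed=0):
    """Combinatorial Thompson sampling on a top-k problem (illustrative)."""
    rng = random.Random(seed)
    m = len(means)
    a = [1] * m  # Beta posterior parameter: observed ones + 1
    b = [1] * m  # Beta posterior parameter: observed zeros + 1
    best = sum(sorted(means, reverse=True)[:k])
    regret = 0.0
    for _ in range(horizon):
        # Draw one posterior sample per base arm.
        theta = [rng.betavariate(a[i], b[i]) for i in range(m)]
        # Exact offline oracle for a linear reward: the k largest samples.
        super_arm = sorted(range(m), key=lambda i: theta[i], reverse=True)[:k]
        # Semi-bandit feedback: the outcome of every played base arm is seen.
        for i in super_arm:
            x = 1 if rng.random() < means[i] else 0
            a[i] += x
            b[i] += 1 - x
        regret += best - sum(means[i] for i in super_arm)
    return regret, a, b

regret, a, b = cts_top_k([0.9, 0.8, 0.3, 0.2, 0.1], k=2, horizon=500)
```

The oracle here is exact; the abstract notes that swapping in an approximation oracle is not directly possible, so this sketch should not be read as covering that case.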
Forex trading and Twitter: Spam, bots, and reputation manipulation
Title | Forex trading and Twitter: Spam, bots, and reputation manipulation |
Authors | Igor Mozetič, Peter Gabrovšek, Petra Kralj Novak |
Abstract | Currency trading (Forex) is the largest world market in terms of volume. We analyze trading and tweeting about the EUR-USD currency pair over a period of three years. First, a large number of tweets were manually labeled, and a Twitter stance classification model is constructed. The model then classifies all the tweets by the trading stance signal: buy, hold, or sell (EUR vs. USD). The Twitter stance is compared to the actual currency rates by applying the event study methodology, well-known in financial economics. It turns out that there are large differences in Twitter stance distribution and potential trading returns between the four groups of Twitter users: trading robots, spammers, trading companies, and individual traders. Additionally, we observe attempts of reputation manipulation by post festum removal of tweets with poor predictions, and deleting/reposting of identical tweets to increase the visibility without tainting one’s Twitter timeline. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02233v2 |
http://arxiv.org/pdf/1804.02233v2.pdf | |
PWC | https://paperswithcode.com/paper/forex-trading-and-twitter-spam-bots-and |
Repo | |
Framework | |
Transferable Pedestrian Motion Prediction Models at Intersections
Title | Transferable Pedestrian Motion Prediction Models at Intersections |
Authors | Macheng Shen, Golnaz Habibi, Jonathan P. How |
Abstract | One desirable capability of autonomous cars is to accurately predict the pedestrian motion near intersections for safe and efficient trajectory planning. We are interested in developing transfer learning algorithms that can be trained on the pedestrian trajectories collected at one intersection and yet still provide accurate predictions of the trajectories at another, previously unseen intersection. We first discussed the feature selection for transferable pedestrian motion models in general. Following this discussion, we developed one transferable pedestrian motion prediction algorithm based on Inverse Reinforcement Learning (IRL) that infers pedestrian intentions and predicts future trajectories based on observed trajectory. We evaluated our algorithm on a dataset collected at two intersections, trained at one intersection and tested at the other intersection. We used the accuracy of augmented semi-nonnegative sparse coding (ASNSC), trained and tested at the same intersection as a baseline. The result shows that the proposed algorithm improves the baseline accuracy by 40% in the non-transfer task, and 16% in the transfer task. |
Tasks | Feature Selection, motion prediction, Transfer Learning |
Published | 2018-03-15 |
URL | https://arxiv.org/abs/1804.00495v2 |
https://arxiv.org/pdf/1804.00495v2.pdf | |
PWC | https://paperswithcode.com/paper/transferable-pedestrian-motion-prediction |
Repo | |
Framework | |
A Big Data Architecture for Log Data Storage and Analysis
Title | A Big Data Architecture for Log Data Storage and Analysis |
Authors | Swapneel Mehta, Prasanth Kothuri, Daniel Lanza Garcia |
Abstract | We propose an architecture for analysing database connection logs across different instances of databases within an intranet comprising over 10,000 users and associated devices. Our system uses Flume agents to send notifications to a Hadoop Distributed File System for long-term storage and ElasticSearch and Kibana for short-term visualisation, effectively creating a data lake for the extraction of log data. We adopt machine learning models with an ensemble of approaches to filter and process the indicators within the data and aim to predict anomalies or outliers using feature vectors built from this log data. |
Tasks | |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00111v1 |
http://arxiv.org/pdf/1812.00111v1.pdf | |
PWC | https://paperswithcode.com/paper/a-big-data-architecture-for-log-data-storage |
Repo | |
Framework | |
Accelerating Empowerment Computation with UCT Tree Search
Title | Accelerating Empowerment Computation with UCT Tree Search |
Authors | Christoph Salge, Christian Guckelsberger, Rodrigo Canaan, Tobias Mahlmann |
Abstract | Models of intrinsic motivation present an important means to produce sensible behaviour in the absence of extrinsic rewards. Applications in video games are varied, and range from intrinsically motivated general game-playing agents to non-player characters such as companions and enemies. The information-theoretic quantity of Empowerment is a particularly promising candidate motivation to produce believable, generic and robust behaviour. However, while it can be used in the absence of external reward functions that would need to be crafted and learned, empowerment is computationally expensive. In this paper, we propose a modified UCT tree search method to mitigate empowerment’s computational complexity in discrete and deterministic scenarios. We demonstrate how to modify a Monte-Carlo Search Tree with UCT to realise empowerment maximisation, and discuss three additional modifications that facilitate better sampling. We evaluate the approach both quantitatively, by analysing how close our approach gets to the baseline of exhaustive empowerment computation with varying amounts of computational resources, and qualitatively, by analysing the resulting behaviour in a Minecraft-like scenario. |
Tasks | |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.09866v1 |
http://arxiv.org/pdf/1803.09866v1.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-empowerment-computation-with-uct |
Repo | |
Framework | |
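For the discrete, deterministic scenarios the abstract above targets, n-step empowerment reduces to the logarithm of the number of distinct states reachable by n-step action sequences. The sketch below shows the exhaustive baseline computation, not the UCT-based method the paper proposes; the grid world, action set and function names are hypothetical.

```python
import math
from itertools import product

def n_step_empowerment(state, actions, step, n):
    """Exhaustive n-step empowerment (in bits) for deterministic dynamics."""
    reachable = set()
    for seq in product(actions, repeat=n):  # |actions|**n sequences
        s = state
        for act in seq:
            s = step(s, act)
        reachable.add(s)
    return math.log2(len(reachable))

def make_grid_step(width, height):
    """Deterministic grid dynamics: moves into a wall leave the state unchanged."""
    def step(state, action):
        x, y = state
        dx, dy = action
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            return (nx, ny)
        return state
    return step

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]
step = make_grid_step(5, 5)
centre = n_step_empowerment((2, 2), ACTIONS, step, 1)
corner = n_step_empowerment((0, 0), ACTIONS, step, 1)
```

The loop enumerates every action sequence, so the cost grows exponentially with the horizon n. That growth is the computational expense a sampling-based tree search is meant to mitigate.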
FineFool: Fine Object Contour Attack via Attention
Title | FineFool: Fine Object Contour Attack via Attention |
Authors | Jinyin Chen, Haibin Zheng, Hui Xiong, Mengmeng Su |
Abstract | Machine learning models have been shown to be vulnerable to adversarial attacks launched with adversarial examples, which are carefully crafted by an attacker to defeat classifiers. Deep learning models cannot escape the attack either. Most adversarial attack methods focus on success rate or perturbation size, while we are more interested in the relationship between the adversarial perturbation and the image itself. In this paper, we put forward a novel adversarial attack based on contour, named FineFool. FineFool not only has better attack performance than other state-of-the-art white-box attacks, with a higher attack success rate and smaller perturbations, but is also capable of visualizing the optimal adversarial perturbation via attention on the object contour. To the best of our knowledge, FineFool is the first to combine the critical features of the original clean image with the optimal perturbations in a visible manner. Inspired by the correlations between adversarial perturbations and object contour, slighter perturbations are produced by focusing on object contour features; these are more imperceptible and more difficult to defend against, especially for network add-on defense methods, which face a trade-off between perturbation filtering and contour feature loss. Extensive experiments comparing FineFool with existing state-of-the-art attacks show that it is capable of efficient attacks against defensive deep models. |
Tasks | Adversarial Attack |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.01713v1 |
http://arxiv.org/pdf/1812.01713v1.pdf | |
PWC | https://paperswithcode.com/paper/finefool-fine-object-contour-attack-via |
Repo | |
Framework | |
Catastrophic Importance of Catastrophic Forgetting
Title | Catastrophic Importance of Catastrophic Forgetting |
Authors | Albert Ierusalem |
Abstract | This paper describes some of the possibilities of artificial neural networks that open up after solving the problem of catastrophic forgetting. A simple model and reinforcement learning applications of existing methods are also proposed. |
Tasks | |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.07049v1 |
http://arxiv.org/pdf/1808.07049v1.pdf | |
PWC | https://paperswithcode.com/paper/catastrophic-importance-of-catastrophic |
Repo | |
Framework | |
Multi-encoder multi-resolution framework for end-to-end speech recognition
Title | Multi-encoder multi-resolution framework for end-to-end speech recognition |
Authors | Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky |
Abstract | Attention-based methods and Connectionist Temporal Classification (CTC) networks have been promising research directions for end-to-end Automatic Speech Recognition (ASR). The joint CTC/Attention model has achieved great success by utilizing both architectures during multi-task training and joint decoding. In this work, we present a novel Multi-Encoder Multi-Resolution (MEMR) framework based on the joint CTC/Attention model. Two heterogeneous encoders with different architectures, temporal resolutions and separate CTC networks work in parallel to extract complementary acoustic information. A hierarchical attention mechanism is then used to combine the encoder-level information. To demonstrate the effectiveness of the proposed model, experiments are conducted on Wall Street Journal (WSJ) and CHiME-4, resulting in a relative Word Error Rate (WER) reduction of 18.0-32.1%. Moreover, the proposed MEMR model achieves 3.6% WER on the WSJ eval92 test set, which is the best WER reported for an end-to-end system on this benchmark. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04897v1 |
http://arxiv.org/pdf/1811.04897v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-encoder-multi-resolution-framework-for |
Repo | |
Framework | |
Improving End-to-end Speech Recognition with Pronunciation-assisted Sub-word Modeling
Title | Improving End-to-end Speech Recognition with Pronunciation-assisted Sub-word Modeling |
Authors | Hainan Xu, Shuoyang Ding, Shinji Watanabe |
Abstract | Most end-to-end speech recognition systems model text directly as a sequence of characters or sub-words. Current approaches to sub-word extraction only consider character sequence frequencies, which at times produce inferior sub-word segmentation that might lead to erroneous speech recognition output. We propose pronunciation-assisted sub-word modeling (PASM), a sub-word extraction method that leverages the pronunciation information of a word. Experiments show that the proposed method can greatly improve upon the character-based baseline, and also outperform commonly used byte-pair encoding methods. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2018-11-10 |
URL | http://arxiv.org/abs/1811.04284v2 |
http://arxiv.org/pdf/1811.04284v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-end-to-end-speech-recognition-with |
Repo | |
Framework | |
Sparse Adversarial Perturbations for Videos
Title | Sparse Adversarial Perturbations for Videos |
Authors | Xingxing Wei, Jun Zhu, Hang Su |
Abstract | Although adversarial samples of deep neural networks (DNNs) have been intensively studied on static images, their extension to videos has not been explored. Compared with images, attacking a video needs to consider not only spatial cues but also temporal cues. Moreover, to improve imperceptibility as well as reduce the computation cost, perturbations should be added to as few frames as possible, i.e., adversarial perturbations should be temporally sparse. This further motivates the propagation of perturbations, meaning that perturbations added to the current frame can transfer to the following frames via their temporal interactions. Thus, no (or few) extra perturbations are needed for these frames to be misclassified. To this end, we propose an $\ell_{2,1}$-norm based optimization algorithm to compute sparse adversarial perturbations for videos. We choose action recognition as the targeted task, and networks with a CNN+RNN architecture as threat models to verify our method. Thanks to the propagation, we can compute perturbations on a shortened version of a video, and then adapt them to the long version to fool DNNs. Experimental results on the UCF101 dataset demonstrate that even when only one frame in a video is perturbed, the fooling rate can still reach 59.7%. |
Tasks | Temporal Action Localization |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02536v1 |
http://arxiv.org/pdf/1803.02536v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse-adversarial-perturbations-for-videos |
Repo | |
Framework | |
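The abstract above relies on an l2,1 norm to encourage temporal sparsity. As a minimal sketch, assuming a perturbation stored as one flat list of values per frame (a hypothetical layout), the norm is the sum over frames of each frame's Euclidean norm, so it penalises the number of perturbed frames more than the spread of values within a frame.

```python
import math

def l21_norm(perturbation):
    """Sum over frames of the L2 norm of each frame's perturbation."""
    return sum(math.sqrt(sum(v * v for v in frame)) for frame in perturbation)

def perturbed_frames(perturbation, tol=0.0):
    """Indices of frames whose perturbation norm exceeds tol."""
    return [
        t for t, frame in enumerate(perturbation)
        if math.sqrt(sum(v * v for v in frame)) > tol
    ]

# Three frames of two "pixels" each; only the first frame is perturbed.
E = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]
norm = l21_norm(E)
active = perturbed_frames(E)
```

This shows only the regulariser. The full objective in the paper also includes an adversarial loss on the classifier, which is not reproduced here.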
Learning Two Layer Rectified Neural Networks in Polynomial Time
Title | Learning Two Layer Rectified Neural Networks in Polynomial Time |
Authors | Ainesh Bakshi, Rajesh Jayaram, David P. Woodruff |
Abstract | Consider the following fundamental learning problem: given input examples $x \in \mathbb{R}^d$ and their vector-valued labels, as defined by an underlying generative neural network, recover the weight matrices of this network. We consider two-layer networks, mapping $\mathbb{R}^d$ to $\mathbb{R}^m$, with $k$ non-linear activation units $f(\cdot)$, where $f(x) = \max\{x, 0\}$ is the ReLU. Such a network is specified by two weight matrices, $\mathbf{U}^* \in \mathbb{R}^{m \times k}, \mathbf{V}^* \in \mathbb{R}^{k \times d}$, such that the label of an example $x \in \mathbb{R}^{d}$ is given by $\mathbf{U}^* f(\mathbf{V}^* x)$, where $f(\cdot)$ is applied coordinate-wise. Given $n$ samples as a matrix $\mathbf{X} \in \mathbb{R}^{d \times n}$ and the (possibly noisy) labels $\mathbf{U}^* f(\mathbf{V}^* \mathbf{X}) + \mathbf{E}$ of the network on these samples, where $\mathbf{E}$ is a noise matrix, our goal is to recover the weight matrices $\mathbf{U}^*$ and $\mathbf{V}^*$. In this work, we develop algorithms and hardness results under varying assumptions on the input and noise. Although the problem is NP-hard even for $k=2$, by assuming Gaussian marginals over the input $\mathbf{X}$ we are able to develop polynomial time algorithms for the approximate recovery of $\mathbf{U}^*$ and $\mathbf{V}^*$. Perhaps surprisingly, in the noiseless case our algorithms recover $\mathbf{U}^*,\mathbf{V}^*$ exactly, i.e., with no error. To the best of our knowledge, this is the first algorithm to accomplish exact recovery. For the noisy case, we give the first polynomial time algorithm that approximately recovers the weights in the presence of mean-zero noise $\mathbf{E}$. Our algorithms generalize to a larger class of rectified activation functions, $f(x) = 0$ when $x\leq 0$, and $f(x) > 0$ otherwise. |
Tasks | |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01885v1 |
http://arxiv.org/pdf/1811.01885v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-two-layer-rectified-neural-networks |
Repo | |
Framework | |
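The generative model in the abstract above, with labels given by U f(V x) and f the coordinate-wise ReLU, can be written out directly. The sketch below covers only the noiseless forward map for a single example, with small hypothetical matrices chosen for illustration; it does not attempt the recovery algorithm.

```python
def relu(x):
    """f(x) = max{x, 0}."""
    return max(x, 0.0)

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def label(U, V, x):
    """Noiseless label U f(V x) with f applied coordinate-wise."""
    hidden = [relu(h) for h in matvec(V, x)]
    return matvec(U, hidden)

# Hypothetical weights with m = 2 outputs, k = 2 hidden units, d = 3 inputs.
U = [[1.0, -1.0], [0.5, 2.0]]
V = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
x = [2.0, -3.0, 1.0]
y = label(U, V, x)  # V x = [1, -2], ReLU gives [1, 0]
```

The second hidden unit is inactive for this input, so the label carries no information about the second column of U. This is one intuition for why recovery needs many samples drawn from a distribution such as the Gaussian one the abstract assumes.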
Semi-Supervised Learning on Graphs Based on Local Label Distributions
Title | Semi-Supervised Learning on Graphs Based on Local Label Distributions |
Authors | Evgeniy Faerman, Felix Borutta, Julian Busch, Matthias Schubert |
Abstract | Most approaches that tackle the problem of node classification consider nodes to be similar if they have shared neighbors or are close to each other in the graph. Recent methods for attributed graphs additionally take attributes of neighboring nodes into account. We argue that the class labels of the neighbors bear important information and that considering them helps to improve classification quality. Two nodes which are similar based on class labels in their neighborhood do not need to be close-by in the graph and may even belong to different connected components. In this work, we propose a novel approach for semi-supervised node classification. Precisely, we propose a new node embedding which is based on the class labels in the local neighborhood of a node. We show that this is a different setting from attribute-based embeddings and thus, we propose a new method to learn label-based node embeddings which can mirror a variety of relations between the class labels of neighboring nodes. Our experimental evaluation demonstrates that our new methods can significantly improve the prediction quality on real world data sets. |
Tasks | Node Classification |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05563v2 |
http://arxiv.org/pdf/1802.05563v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-on-graphs-based-on |
Repo | |
Framework | |
Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function
Title | Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function |
Authors | Wojciech Tarnowski, Piotr Warchoł, Stanisław Jastrzębski, Jacek Tabor, Maciej A. Nowak |
Abstract | We demonstrate that in residual neural networks (ResNets) dynamical isometry is achievable irrespective of the activation function used. We do that by deriving, with the help of Free Probability and Random Matrix Theories, a universal formula for the spectral density of the input-output Jacobian at initialization, in the large network width and depth limit. The resulting singular value spectrum depends on a single parameter, which we calculate for a variety of popular activation functions, by analyzing the signal propagation in the artificial neural network. We corroborate our results with numerical simulations of both random matrices and ResNets applied to the CIFAR-10 classification problem. Moreover, we study the consequence of this universal behavior for the initial and late phases of the learning processes. We conclude by drawing attention to the simple fact that initialization acts as a confounding factor between the choice of activation function and the rate of learning. We propose that in ResNets this can be resolved based on our results, by ensuring the same level of dynamical isometry at initialization. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08848v3 |
http://arxiv.org/pdf/1809.08848v3.pdf | |
PWC | https://paperswithcode.com/paper/dynamical-isometry-is-achieved-in-residual |
Repo | |
Framework | |
Efficient Evaluation of the Number of False Alarm Criterion
Title | Efficient Evaluation of the Number of False Alarm Criterion |
Authors | Sylvie Le Hégarat-Mascle, Emanuel Aldea, Jennifer Vandoni |
Abstract | This paper proposes a method for efficiently computing the significance of a parametric pattern inside a binary image. On the one hand, a-contrario strategies avoid user involvement in tuning detection thresholds, and allow one to account fairly for different pattern sizes. On the other hand, a-contrario criteria become intractable when the pattern complexity in terms of parametrization increases. In this work, we introduce a strategy which relies on the use of a cumulative space of reduced dimensionality, derived from the coupling of a classic (Hough) cumulative space with an integral histogram trick. This space allows us to store partial computations which are required by the a-contrario criterion, and to evaluate the significance with a lower computational cost than by following a straightforward approach. The method is illustrated on synthetic examples on patterns with various parametrizations up to five dimensions. In order to demonstrate how to apply this generic concept in a real scenario, we consider a difficult crack detection task in still images, which has been addressed in the literature with various local and global detection strategies. We model cracks as bounded segments, detected by the proposed a-contrario criterion, which allows us to introduce additional spatial constraints based on their relative alignment. On this application, the proposed strategy yields state-of-the-art results, and underlines its potential for handling complex pattern detection tasks. |
Tasks | |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03594v1 |
http://arxiv.org/pdf/1807.03594v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-evaluation-of-the-number-of-false |
Repo | |
Framework | |