October 17, 2019

Paper Group ANR 804

It was the training data pruning too!. Using Autoencoders To Learn Interesting Features For Detecting Surveillance Aircraft. Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks. The Role of Conditional Independence in the Evolution of Intelligent Systems. Measuring the Temporal Behavior of Re …

It was the training data pruning too!

Title It was the training data pruning too!
Authors Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere
Abstract We study the current best model (KDG) for question answering on tabular data evaluated over the WikiTableQuestions dataset. Previous ablation studies performed against this model attributed the model’s performance to certain aspects of its architecture. In this paper, we find that the model’s performance also crucially depends on a certain pruning of the data used to train the model. Disabling the pruning step drops the accuracy of the model from 43.3% to 36.3%. The large impact on the performance of the KDG model suggests that the pruning may be a useful pre-processing step in training other semantic parsers as well.
Tasks Question Answering
Published 2018-03-12
URL http://arxiv.org/abs/1803.04579v1
PDF http://arxiv.org/pdf/1803.04579v1.pdf
PWC https://paperswithcode.com/paper/it-was-the-training-data-pruning-too
Repo
Framework

Using Autoencoders To Learn Interesting Features For Detecting Surveillance Aircraft

Title Using Autoencoders To Learn Interesting Features For Detecting Surveillance Aircraft
Authors Teresa Nicole Brooks
Abstract This paper explores using a Long short-term memory (LSTM) based sequence autoencoder to learn interesting features for detecting surveillance aircraft using ADS-B flight data. An aircraft periodically broadcasts ADS-B (Automatic Dependent Surveillance - Broadcast) data to ground receivers. The ability of LSTM networks to model varying length time series data and remember dependencies that span across events makes it an ideal candidate for implementing a sequence autoencoder for ADS-B data because of its possible variable length time series, irregular sampling and dependencies that span across events.
Tasks Time Series
Published 2018-09-27
URL http://arxiv.org/abs/1809.10333v1
PDF http://arxiv.org/pdf/1809.10333v1.pdf
PWC https://paperswithcode.com/paper/using-autoencoders-to-learn-interesting
Repo
Framework

Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks

Title Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks
Authors Wenting Li, Deepjyoti Deka, Michael Chertkov, Meng Wang
Abstract Diverse fault types, fast re-closures, and complicated transient states after a fault event make real-time fault location in power grids challenging. Existing localization techniques in this area rely on simplistic assumptions, such as static loads, or require much higher sampling rates or total measurement availability. This paper proposes a faulted line localization method based on a Convolutional Neural Network (CNN) classifier using bus voltages. Unlike prior data-driven methods, the proposed classifier is based on features with physical interpretations that improve the robustness of the location performance. The accuracy of our CNN based localization tool is demonstrably superior to other machine learning classifiers in the literature. To further improve the location performance, a joint phasor measurement units (PMU) placement strategy is proposed and validated against other methods. A significant aspect of our methodology is that under very low observability (7% of buses), the algorithm is still able to localize the faulted line to a small neighborhood with high probability. The performance of our scheme is validated through simulations of faults of various types in the IEEE 39-bus and 68-bus power systems under varying uncertain conditions, system observability, and measurement quality.
Tasks
Published 2018-10-11
URL https://arxiv.org/abs/1810.05247v2
PDF https://arxiv.org/pdf/1810.05247v2.pdf
PWC https://paperswithcode.com/paper/real-time-fault-localization-in-power-grids
Repo
Framework

The Role of Conditional Independence in the Evolution of Intelligent Systems

Title The Role of Conditional Independence in the Evolution of Intelligent Systems
Authors Jory Schossau, Larissa Albantakis, Arend Hintze
Abstract Systems are typically made from simple components regardless of their complexity. While the function of each part is easily understood, higher order functions are emergent properties and are notoriously difficult to explain. In networked systems, both digital and biological, each component receives inputs, performs a simple computation, and creates an output. When these components have multiple outputs, we intuitively assume that the outputs are causally dependent on the inputs but are themselves independent of each other given the state of their shared input. However, this intuition can be violated for components with probabilistic logic, as these typically cannot be decomposed into separate logic gates with one output each. This violation of conditional independence on the past system state is equivalent to instantaneous interaction — the idea is that some information between the outputs is not coming from the inputs and thus must have been created instantaneously. Here we compare evolved artificial neural systems with and without instantaneous interaction across several task environments. We show that systems without instantaneous interactions evolve faster, to higher final levels of performance, and require fewer logic components to create a densely connected cognitive machinery.
Tasks
Published 2018-01-16
URL http://arxiv.org/abs/1801.05462v1
PDF http://arxiv.org/pdf/1801.05462v1.pdf
PWC https://paperswithcode.com/paper/the-role-of-conditional-independence-in-the
Repo
Framework
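
To make the conditional-independence condition described in the abstract above concrete, the following minimal sketch checks whether a two-output probabilistic gate factorizes, for one fixed input state, as P(o1, o2 | input) = P(o1 | input) P(o2 | input). The gate tables and function names are hypothetical illustrations and are not taken from the paper.

```python
from itertools import product


def marginals(joint):
    """Return P(o1 = 1) and P(o2 = 1) from a joint table keyed by (o1, o2)."""
    p1 = sum(p for (o1, _), p in joint.items() if o1 == 1)
    p2 = sum(p for (_, o2), p in joint.items() if o2 == 1)
    return p1, p2


def is_conditionally_independent(joint, tol=1e-9):
    """True if the joint output distribution equals the product of its marginals."""
    p1, p2 = marginals(joint)
    for o1, o2 in product((0, 1), repeat=2):
        m1 = p1 if o1 == 1 else 1.0 - p1
        m2 = p2 if o2 == 1 else 1.0 - p2
        if abs(joint.get((o1, o2), 0.0) - m1 * m2) > tol:
            return False
    return True


# Hypothetical gate A: two independent fair coin flips for the given input.
gate_a = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

# Hypothetical gate B: outputs are perfectly correlated for the given input,
# so information is shared between outputs that does not come from the input.
gate_b = {(0, 0): 0.5, (1, 1): 0.5}

print(is_conditionally_independent(gate_a))
print(is_conditionally_independent(gate_b))
```

Gate B is the kind of component that cannot be decomposed into two separate single-output gates, which is the situation the abstract calls instantaneous interaction.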

Measuring the Temporal Behavior of Real-World Person Re-Identification

Title Measuring the Temporal Behavior of Real-World Person Re-Identification
Authors Meng Zheng, Srikrishna Karanam, Richard J. Radke
Abstract Designing real-world person re-identification (re-id) systems requires attention to operational aspects not typically considered in academic research. Typically, the probe image or image sequence is matched to a gallery set with a fixed candidate list. On the other hand, in real-world applications of re-id, we would search for a person of interest in a gallery set that is continuously populated by new candidates over time. A key question of interest for the operator of such a system is: how long is a correct match to a probe likely to remain in a rank-k shortlist of candidates? In this paper, we propose to distill this information into what we call a Rank Persistence Curve (RPC), which unlike a conventional cumulative match characteristic (CMC) curve helps directly compare the temporal performance of different re-id algorithms. To carefully illustrate the concept, we collected a new multi-shot person re-id dataset called RPIfield. The RPIfield dataset is constructed using a network of 12 cameras with 112 explicitly time-stamped actor paths among about 4000 distractors. We then evaluate the temporal performance of different re-id algorithms using the proposed RPCs using single and pairwise camera videos from RPIfield, and discuss considerations for future research.
Tasks Person Re-Identification
Published 2018-08-16
URL http://arxiv.org/abs/1808.05499v1
PDF http://arxiv.org/pdf/1808.05499v1.pdf
PWC https://paperswithcode.com/paper/measuring-the-temporal-behavior-of-real-world
Repo
Framework

Ensemble One-dimensional Convolution Neural Networks for Skeleton-based Action Recognition

Title Ensemble One-dimensional Convolution Neural Networks for Skeleton-based Action Recognition
Authors Yangyang Xu, Lei Wang
Abstract In this paper, we propose an effective yet extensible residual one-dimensional convolutional neural network as a base network. Based on this network, we propose four subnets that explore the features of skeleton sequences from different aspects. Given a skeleton sequence, the spatial information is encoded in the skeleton joint coordinates within a frame, and the temporal information is represented across multiple frames. Because the skeleton sequence representation does not allow a two-dimensional convolutional neural network to be used directly, we choose the one-dimensional convolution layer as the basic layer. Each subnet extracts discriminative features from a different aspect. The first subnet is a two-stream network that explores both temporal and spatial information. The second is a body-parted network, which captures micro spatial features and macro temporal features. The third is an attention network, whose main contribution is to focus on the key frames and feature channels that are highly related to the action classes in a skeleton sequence. The last subnet, a frame-difference network, mainly processes the changes in joints between consecutive frames. The four subnets are ensembled by late fusion. The key requirement of the ensemble method is that each subnet performs reasonably well on its own and that the subnets are diverse. Each subnet shares a well-performing base network, and the differences between the subnets guarantee the diversity. Experimental results show that the ensemble network achieves state-of-the-art performance on three widely used datasets.
Tasks Skeleton Based Action Recognition, Temporal Action Localization
Published 2018-01-08
URL http://arxiv.org/abs/1801.02475v2
PDF http://arxiv.org/pdf/1801.02475v2.pdf
PWC https://paperswithcode.com/paper/ensemble-one-dimensional-convolution-neural
Repo
Framework
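
The late-fusion step mentioned in the abstract above can be sketched as follows. The abstract does not state the exact fusion rule, so averaging the per-class scores of the subnets is an assumption here, and the scores and function name are hypothetical.

```python
def late_fusion(score_lists):
    """Average per-class scores from several subnets and pick the top class.

    score_lists: one list of class scores per subnet, all of equal length.
    Returns (predicted_class_index, fused_scores).
    Averaging is an assumed fusion rule, not necessarily the paper's.
    """
    n_subnets = len(score_lists)
    n_classes = len(score_lists[0])
    fused = [
        sum(scores[c] for scores in score_lists) / n_subnets
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


# Hypothetical class scores from four subnets for a three-class problem.
subnet_scores = [
    [0.6, 0.3, 0.1],  # two-stream subnet
    [0.2, 0.5, 0.3],  # body-parted subnet
    [0.1, 0.7, 0.2],  # attention subnet
    [0.4, 0.4, 0.2],  # frame-difference subnet
]
predicted, fused = late_fusion(subnet_scores)
print(predicted, fused)
```

In this toy input the first subnet alone would pick class 0, while the fused scores favour class 1, which illustrates why diversity between subnets matters for the ensemble.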

Scalable Learning in Reproducing Kernel Krein Spaces

Title Scalable Learning in Reproducing Kernel Krein Spaces
Authors Dino Oglic, Thomas Gärtner
Abstract We provide the first mathematically complete derivation of the Nyström method for low-rank approximation of indefinite kernels and propose an efficient method for finding an approximate eigendecomposition of such kernel matrices. Building on this result, we devise highly scalable methods for learning in reproducing kernel Kreĭn spaces. The devised approaches provide a principled and theoretically well-founded means to tackle large scale learning problems with indefinite kernels. The main motivation for our work comes from problems with structured representations (e.g., graphs, strings, time-series), where it is relatively easy to devise a pairwise (dis)similarity function based on intuition and/or knowledge of domain experts. Such functions are typically not positive definite and it is often well beyond the expertise of practitioners to verify this condition. The effectiveness of the devised approaches is evaluated empirically using indefinite kernels defined on structured and vectorial data representations.
Tasks Time Series
Published 2018-09-06
URL https://arxiv.org/abs/1809.02157v2
PDF https://arxiv.org/pdf/1809.02157v2.pdf
PWC https://paperswithcode.com/paper/large-scale-learning-with-krein-kernels
Repo
Framework
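
For orientation, the textbook form of the Nyström low-rank approximation referred to in the abstract above is given below. The notation is an assumption made for this note (an $n \times n$ kernel matrix $K$, $m \ll n$ landmark points, $K_{nm}$ the block of kernel values between all points and the landmarks, $K_{mm}$ the block among the landmarks), and the paper's own derivation for indefinite kernels may differ in its details.

```latex
\tilde{K} \;=\; K_{nm}\, K_{mm}^{+}\, K_{nm}^{\top},
\qquad
K_{mm} \;=\; U \Lambda U^{\top},
\qquad
K_{mm}^{+} \;=\; U \Lambda^{+} U^{\top}
```

Here $K_{mm}^{+}$ denotes the pseudo-inverse. For an indefinite kernel the diagonal matrix $\Lambda$ contains both positive and negative eigenvalues, which is the case the abstract addresses.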

Compact and Computationally Efficient Representation of Deep Neural Networks

Title Compact and Computationally Efficient Representation of Deep Neural Networks
Authors Simon Wiedemann, Klaus-Robert Müller, Wojciech Samek
Abstract At the core of any inference procedure in deep neural networks are dot product operations, which are the component that requires the highest computational resources. A common approach to reduce the cost of inference is to reduce its memory complexity by lowering the entropy of the weight matrices of the neural network, e.g., by pruning and quantizing their elements. However, the quantized weight matrices are then usually represented either by a dense or sparse matrix storage format, whose associated dot product complexity is not bounded by the entropy of the matrix. This means that the associated inference complexity ultimately depends on the implicit statistical assumptions that these matrix representations make about the weight distribution, which can in many cases be suboptimal. In this paper we address this issue and present new efficient representations for matrices with low entropy statistics. These new matrix formats have the novel property that their memory and algorithmic complexity are implicitly bounded by the entropy of the matrix, consequently implying that they are guaranteed to become more efficient as the entropy of the matrix is reduced. In our experiments we show that performing the dot product under these new matrix formats can indeed be more energy and time efficient under practically relevant assumptions. For instance, we are able to attain up to x42 compression ratios, x5 speed ups and x90 energy savings when we convert in a lossless manner the weight matrices of state-of-the-art networks such as AlexNet, VGG-16, ResNet152 and DenseNet into the new matrix formats and benchmark their respective dot product operation.
Tasks
Published 2018-05-27
URL http://arxiv.org/abs/1805.10692v2
PDF http://arxiv.org/pdf/1805.10692v2.pdf
PWC https://paperswithcode.com/paper/compact-and-computationally-efficient
Repo
Framework
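
As a rough illustration of the quantity the abstract above refers to, the sketch below computes the empirical Shannon entropy (in bits per element) of the values in a weight matrix. Treating the matrix elements as samples from a discrete distribution is an assumption made for this example; the paper's exact entropy definition and its matrix formats are not reproduced here.

```python
from collections import Counter
from math import log2


def matrix_entropy(matrix):
    """Empirical Shannon entropy, in bits per element, of a matrix's values."""
    values = [w for row in matrix for w in row]
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical quantized weight matrices.
dense_like = [[0.5, -0.5], [-0.5, 0.5]]   # two equally frequent values
pruned = [[0.0, 0.0], [0.0, 0.5]]         # mostly zeros after pruning

print(matrix_entropy(dense_like))
print(matrix_entropy(pruned))
```

Pruning and quantization push more elements onto fewer distinct values, so the second matrix has lower entropy than the first.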

Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs

Title Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs
Authors Maryam Abdolali, Nicolas Gillis, Mohammad Rahmati
Abstract Sparse subspace clustering (SSC) is one of the current state-of-the-art methods for partitioning data points into the union of subspaces, with strong theoretical guarantees. However, it is not practical for large data sets as it requires solving a LASSO problem for each data point, where the number of variables in each LASSO problem is the number of data points. To improve the scalability of SSC, we propose to select a few sets of anchor points using a randomized hierarchical clustering method, and, for each set of anchor points, solve the LASSO problems for each data point allowing only anchor points to have a non-zero weight (this drastically reduces the number of variables). This generates a multilayer graph where each layer corresponds to a different set of anchor points. Using the Grassmann manifold of orthogonal matrices, the shared connectivity among the layers is summarized within a single subspace. Finally, we use $k$-means clustering within that subspace to cluster the data points, similarly to what is done by spectral clustering in SSC. We show on both synthetic and real-world data sets that the proposed method not only allows SSC to scale to large-scale data sets, but that it is also much more robust as it performs significantly better on noisy data and on data with close subspaces and outliers, while it is not prone to oversegmentation.
Tasks
Published 2018-02-21
URL http://arxiv.org/abs/1802.07648v2
PDF http://arxiv.org/pdf/1802.07648v2.pdf
PWC https://paperswithcode.com/paper/scalable-and-robust-sparse-subspace
Repo
Framework

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search

Title Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Authors Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess
Abstract Learning policies on data synthesized by models can in principle quench the thirst of reinforcement learning algorithms for large amounts of real experience, which is often costly to acquire. However, simulating plausible experience de novo is a hard problem for many complex environments, often resulting in biases for model-based policy evaluation and search. Instead of de novo synthesis of data, here we assume logged, real experience and model alternative outcomes of this experience under counterfactual actions, i.e., actions that were not actually taken. Based on this, we propose the Counterfactually-Guided Policy Search (CF-GPS) algorithm for learning policies in POMDPs from off-policy experience. It leverages structural causal models for counterfactual evaluation of arbitrary policies on individual off-policy episodes. CF-GPS can improve on vanilla model-based RL algorithms by making use of available logged data to de-bias model predictions. In contrast to off-policy algorithms based on Importance Sampling which re-weight data, CF-GPS leverages a model to explicitly consider alternative outcomes, allowing the algorithm to make better use of experience data. We find empirically that these advantages translate into improved policy evaluation and search results on a non-trivial grid-world task. Finally, we show that CF-GPS generalizes the previously proposed Guided Policy Search and that reparameterization-based algorithms such as Stochastic Value Gradient can be interpreted as counterfactual methods.
Tasks
Published 2018-11-15
URL http://arxiv.org/abs/1811.06272v1
PDF http://arxiv.org/pdf/1811.06272v1.pdf
PWC https://paperswithcode.com/paper/woulda-coulda-shoulda-counterfactually-guided
Repo
Framework
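
The idea of modelling alternative outcomes of logged experience under actions that were not taken can be illustrated with a toy structural causal model. The sketch below follows the usual three counterfactual steps (infer the noise from the logged outcome, swap the action, re-predict). The additive-noise model, the effect table and all names are hypothetical simplifications; the paper itself works with POMDPs and is not reproduced here.

```python
# Hypothetical toy model: outcome = EFFECT[action] + noise.
EFFECT = {"left": 1.0, "right": 3.0}


def infer_noise(action, outcome):
    """Abduction: recover the noise term that explains the logged outcome."""
    return outcome - EFFECT[action]


def counterfactual_outcome(logged_action, logged_outcome, alt_action):
    """Replay the same episode noise under an action that was not taken."""
    noise = infer_noise(logged_action, logged_outcome)
    return EFFECT[alt_action] + noise


# Logged episode: the agent went left and observed an outcome of 1.5.
print(counterfactual_outcome("left", 1.5, "right"))
```

Unlike sampling a fresh outcome from the model, the counterfactual keeps the episode-specific noise inferred from real data, which is the sense in which logged experience can help de-bias model predictions.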

Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks

Title Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks
Authors Oleg Sudakov, Evgeny Burnaev, Dmitry Koroteev
Abstract We present a research study aimed at testing the applicability of machine learning techniques for predicting the permeability of digitized rock samples. We prepare a training set containing 3D images of sandstone samples imaged with X-ray microtomography and corresponding permeability values simulated with the Pore Network approach. We also use Minkowski functionals and Deep Learning-based descriptors of 3D images and 2D slices as input features for predictive model training and prediction. We compare the predictive power of various feature sets and methods. The latter include Gradient Boosting and various architectures of Deep Neural Networks (DNN). The results demonstrate the applicability of machine learning for image-based permeability prediction and open a new area of Digital Rock research.
Tasks
Published 2018-03-02
URL http://arxiv.org/abs/1803.00758v2
PDF http://arxiv.org/pdf/1803.00758v2.pdf
PWC https://paperswithcode.com/paper/driving-digital-rock-towards-machine-learning
Repo
Framework

Physics-Based Generative Adversarial Models for Image Restoration and Beyond

Title Physics-Based Generative Adversarial Models for Image Restoration and Beyond
Authors Jinshan Pan, Jiangxin Dong, Yang Liu, Jiawei Zhang, Jimmy Ren, Jinhui Tang, Yu-Wing Tai, Ming-Hsuan Yang
Abstract We present an algorithm to directly solve numerous image restoration problems (e.g., image deblurring, image dehazing, image deraining, etc.). These problems are highly ill-posed, and the common assumptions for existing methods are usually based on heuristic image priors. In this paper, we find that these problems can be solved by generative models with adversarial learning. However, the basic formulation of generative adversarial networks (GANs) does not generate realistic images, and some structures of the estimated images are usually not preserved well. Motivated by an interesting observation that the estimated results should be consistent with the observed inputs under the physics models, we propose a physics model constrained learning algorithm so that it can guide the estimation of the specific task in the conventional GAN framework. The proposed algorithm is trained in an end-to-end fashion and can be applied to a variety of image restoration and related low-level vision problems. Extensive experiments demonstrate that our method performs favorably against the state-of-the-art algorithms.
Tasks Deblurring, Image Dehazing, Image Restoration, Rain Removal
Published 2018-08-02
URL https://arxiv.org/abs/1808.00605v2
PDF https://arxiv.org/pdf/1808.00605v2.pdf
PWC https://paperswithcode.com/paper/physics-based-generative-adversarial-models
Repo
Framework

Modeling Taxi Drivers’ Behaviour for the Next Destination Prediction

Title Modeling Taxi Drivers’ Behaviour for the Next Destination Prediction
Authors Alberto Rossi, Gianni Barlacchi, Monica Bianchini, Bruno Lepri
Abstract In this paper, we study how to model taxi drivers’ behaviour and geographical information for an interesting and challenging task: the next destination prediction in a taxi journey. Predicting the next location is a well-studied problem in human mobility, which finds several applications in real-world scenarios, from optimizing the efficiency of electronic dispatching systems to predicting and reducing traffic jams. This task is normally modeled as a multiclass classification problem, where the goal is to select, among a set of already known locations, the next taxi destination. We present a Recurrent Neural Network (RNN) approach that models the taxi drivers’ behaviour and encodes the semantics of visited locations by using geographical information from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to predict the exact coordinates of the next destination, overcoming the limitation of outputting only a limited set of locations seen during the training phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge 2015 dataset (based on the city of Porto), obtaining better results than the competition winner whilst using less information, and on Manhattan and San Francisco datasets.
Tasks
Published 2018-07-21
URL http://arxiv.org/abs/1807.08173v2
PDF http://arxiv.org/pdf/1807.08173v2.pdf
PWC https://paperswithcode.com/paper/modeling-taxi-drivers-behaviour-for-the-next
Repo
Framework

Quantized Compressive K-Means

Title Quantized Compressive K-Means
Authors Vincent Schellekens, Laurent Jacques
Abstract The recent framework of compressive statistical learning aims at designing tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it estimates the centroids of data clusters from pooled, non-linear, random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources because the learning examples are compressed only after the sensing stage. The present work generalizes the sketching procedure initially defined in Compressive K-Means to a large class of periodic nonlinearities including hardware-friendly implementations that compressively acquire entire datasets. This idea is exemplified in a Quantized Compressive K-Means procedure, a variant of CKM that leverages 1-bit universal quantization (i.e. retaining the least significant bit of a standard uniform quantizer) as the periodic sketch nonlinearity. Trading for this resource-efficient signature (standard in most acquisition schemes) has almost no impact on the clustering performance, as illustrated by numerical experiments.
Tasks Quantization
Published 2018-04-26
URL http://arxiv.org/abs/1804.10109v2
PDF http://arxiv.org/pdf/1804.10109v2.pdf
PWC https://paperswithcode.com/paper/quantized-compressive-k-means
Repo
Framework
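
The 1-bit universal quantization described in the abstract above (keeping the least significant bit of a uniform quantizer) and the pooling of signatures into a sketch can be illustrated as below. The step size, the projection vectors, the dithers and the function names are hypothetical, the bit is left in {0, 1} rather than rescaled, and the centroid-recovery step of CKM is not shown.

```python
import math


def universal_quantize(t, delta=1.0):
    """Least significant bit of a uniform quantizer with step delta.

    The result is periodic in t with period 2 * delta.
    """
    return math.floor(t / delta) % 2


def quantized_sketch(points, projections, dithers, delta=1.0):
    """Pool 1-bit signatures of all points into one averaged sketch vector.

    Each sketch entry averages universal_quantize(<w, x> + dither) over the
    dataset, for one projection vector w and one dither value.
    """
    n = len(points)
    sketch = []
    for w, dither in zip(projections, dithers):
        total = 0
        for x in points:
            projected = sum(wi * xi for wi, xi in zip(w, x)) + dither
            total += universal_quantize(projected, delta)
        sketch.append(total / n)
    return sketch


# Hypothetical one-dimensional dataset with a single projection and no dither.
data = [(0.2,), (1.2,)]
print(quantized_sketch(data, projections=[(1.0,)], dithers=[0.0]))
```

Because each signature is a single bit, the per-example contribution can in principle be produced directly at acquisition time, which is the hardware-friendly property the abstract highlights.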

Learning to Speed Up Structured Output Prediction

Title Learning to Speed Up Structured Output Prediction
Authors Xingyuan Pan, Vivek Srikumar
Abstract Predicting structured outputs can be computationally onerous due to the combinatorially large output spaces. In this paper, we focus on reducing the prediction time of a trained black-box structured classifier without losing accuracy. To do so, we train a speedup classifier that learns to mimic a black-box classifier under the learning-to-search approach. As the structured classifier predicts more examples, the speedup classifier will operate as a learned heuristic to guide search to favorable regions of the output space. We present a mistake bound for the speedup classifier and identify inference situations where it can independently make correct judgments without input features. We evaluate our method on the task of entity and relation extraction and show that the speedup classifier outperforms even greedy search in terms of speed without loss of accuracy.
Tasks Relation Extraction
Published 2018-06-11
URL http://arxiv.org/abs/1806.04245v1
PDF http://arxiv.org/pdf/1806.04245v1.pdf
PWC https://paperswithcode.com/paper/learning-to-speed-up-structured-output
Repo
Framework