October 17, 2019

Paper Group ANR 804

It was the training data pruning too!


Title	It was the training data pruning too!
Authors	Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere
Abstract	We study the current best model (KDG) for question answering on tabular data evaluated over the WikiTableQuestions dataset. Previous ablation studies performed against this model attributed the model’s performance to certain aspects of its architecture. In this paper, we find that the model’s performance also crucially depends on a certain pruning of the data used to train the model. Disabling the pruning step drops the accuracy of the model from 43.3% to 36.3%. The large impact on the performance of the KDG model suggests that the pruning may be a useful pre-processing step in training other semantic parsers as well.
Tasks	Question Answering
Published	2018-03-12
URL	http://arxiv.org/abs/1803.04579v1
PDF	http://arxiv.org/pdf/1803.04579v1.pdf
PWC	https://paperswithcode.com/paper/it-was-the-training-data-pruning-too
Repo
Framework

Using Autoencoders To Learn Interesting Features For Detecting Surveillance Aircraft


Title	Using Autoencoders To Learn Interesting Features For Detecting Surveillance Aircraft
Authors	Teresa Nicole Brooks
Abstract	This paper explores using a long short-term memory (LSTM) based sequence autoencoder to learn interesting features for detecting surveillance aircraft using ADS-B flight data. An aircraft periodically broadcasts ADS-B (Automatic Dependent Surveillance - Broadcast) data to ground receivers. The ability of LSTM networks to model variable-length time series and remember dependencies that span events makes them an ideal candidate for implementing a sequence autoencoder for ADS-B data, which can have variable length, irregular sampling, and dependencies that span events.
Tasks	Time Series
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10333v1
PDF	http://arxiv.org/pdf/1809.10333v1.pdf
PWC	https://paperswithcode.com/paper/using-autoencoders-to-learn-interesting
Repo
Framework

Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks


Title	Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks
Authors	Wenting Li, Deepjyoti Deka, Michael Chertkov, Meng Wang
Abstract	Diverse fault types, fast re-closures, and complicated transient states after a fault event make real-time fault location in power grids challenging. Existing localization techniques in this area rely on simplistic assumptions, such as static loads, or require much higher sampling rates or total measurement availability. This paper proposes a faulted line localization method based on a Convolutional Neural Network (CNN) classifier using bus voltages. Unlike prior data-driven methods, the proposed classifier is based on features with physical interpretations that improve the robustness of the location performance. The accuracy of our CNN based localization tool is demonstrably superior to other machine learning classifiers in the literature. To further improve the location performance, a joint phasor measurement units (PMU) placement strategy is proposed and validated against other methods. A significant aspect of our methodology is that under very low observability (7% of buses), the algorithm is still able to localize the faulted line to a small neighborhood with high probability. The performance of our scheme is validated through simulations of faults of various types in the IEEE 39-bus and 68-bus power systems under varying uncertain conditions, system observability, and measurement quality.
Tasks
Published	2018-10-11
URL	https://arxiv.org/abs/1810.05247v2
PDF	https://arxiv.org/pdf/1810.05247v2.pdf
PWC	https://paperswithcode.com/paper/real-time-fault-localization-in-power-grids
Repo
Framework

The Role of Conditional Independence in the Evolution of Intelligent Systems


Title	The Role of Conditional Independence in the Evolution of Intelligent Systems
Authors	Jory Schossau, Larissa Albantakis, Arend Hintze
Abstract	Systems are typically made from simple components regardless of their complexity. While the function of each part is easily understood, higher order functions are emergent properties and are notoriously difficult to explain. In networked systems, both digital and biological, each component receives inputs, performs a simple computation, and creates an output. When these components have multiple outputs, we intuitively assume that the outputs are causally dependent on the inputs but are themselves independent of each other given the state of their shared input. However, this intuition can be violated for components with probabilistic logic, as these typically cannot be decomposed into separate logic gates with one output each. This violation of conditional independence on the past system state is equivalent to instantaneous interaction — the idea is that some information between the outputs is not coming from the inputs and thus must have been created instantaneously. Here we compare evolved artificial neural systems with and without instantaneous interaction across several task environments. We show that systems without instantaneous interactions evolve faster, to higher final levels of performance, and require fewer logic components to create a densely connected cognitive machinery.
Tasks
Published	2018-01-16
URL	http://arxiv.org/abs/1801.05462v1
PDF	http://arxiv.org/pdf/1801.05462v1.pdf
PWC	https://paperswithcode.com/paper/the-role-of-conditional-independence-in-the
Repo
Framework

Measuring the Temporal Behavior of Real-World Person Re-Identification


Title	Measuring the Temporal Behavior of Real-World Person Re-Identification
Authors	Meng Zheng, Srikrishna Karanam, Richard J. Radke
Abstract	Designing real-world person re-identification (re-id) systems requires attention to operational aspects not typically considered in academic research. Typically, the probe image or image sequence is matched to a gallery set with a fixed candidate list. On the other hand, in real-world applications of re-id, we would search for a person of interest in a gallery set that is continuously populated by new candidates over time. A key question of interest for the operator of such a system is: how long is a correct match to a probe likely to remain in a rank-k shortlist of candidates? In this paper, we propose to distill this information into what we call a Rank Persistence Curve (RPC), which unlike a conventional cumulative match characteristic (CMC) curve helps directly compare the temporal performance of different re-id algorithms. To carefully illustrate the concept, we collected a new multi-shot person re-id dataset called RPIfield. The RPIfield dataset is constructed using a network of 12 cameras with 112 explicitly time-stamped actor paths among about 4000 distractors. We then evaluate the temporal performance of different re-id algorithms using the proposed RPCs using single and pairwise camera videos from RPIfield, and discuss considerations for future research.
Tasks	Person Re-Identification
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05499v1
PDF	http://arxiv.org/pdf/1808.05499v1.pdf
PWC	https://paperswithcode.com/paper/measuring-the-temporal-behavior-of-real-world
Repo
Framework
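One way to read the rank-persistence question in the abstract above is as an empirical survival fraction. The sketch below is not the paper's RPC definition. It assumes a hypothetical input in which each probe has a list of the ranks of its correct match, recorded at successive time steps as the gallery grows, and it reports the fraction of probes whose correct match stayed within the rank-k shortlist for at least a given number of consecutive steps. The function names and the toy data are illustrative only.

```python
def persistence_durations(rank_histories, k):
    """For each probe, count how many consecutive time steps (from the
    first one) the correct match stays inside the rank-k shortlist."""
    durations = []
    for ranks in rank_histories:
        steps = 0
        for rank in ranks:
            if rank > k:
                break
            steps += 1
        durations.append(steps)
    return durations


def persistence_curve(rank_histories, k, max_steps):
    """Fraction of probes whose correct match persisted in the rank-k
    shortlist for at least t steps, for t = 1..max_steps."""
    durations = persistence_durations(rank_histories, k)
    n = len(durations)
    return [sum(d >= t for d in durations) / n for t in range(1, max_steps + 1)]


# Hypothetical rank histories (1 = best rank) for three probes.
histories = [
    [1, 2, 4, 9],   # leaves the top 5 at the fourth step
    [3, 6, 7, 8],   # leaves the top 5 at the second step
    [2, 2, 3, 5],   # stays in the top 5 throughout
]
curve = persistence_curve(histories, k=5, max_steps=4)
```

Plotting `curve` against the step index for several algorithms would give a side-by-side view of how quickly each one loses the correct match from the shortlist.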

Ensemble One-dimensional Convolution Neural Networks for Skeleton-based Action Recognition


Title	Ensemble One-dimensional Convolution Neural Networks for Skeleton-based Action Recognition
Authors	Yangyang Xu, Lei Wang
Abstract	In this paper, we propose an effective yet extensible residual one-dimensional convolution neural network as a base network. Based on this network, we propose four subnets that explore the features of skeleton sequences from different aspects. Given a skeleton sequence, the spatial information is encoded in the joint coordinates within a frame, and the temporal information is represented across multiple frames. Because the skeleton sequence representation prevents a two-dimensional convolution neural network from being used directly, we choose the one-dimensional convolution layer as the basic layer. Each subnet extracts discriminative features from a different aspect. The first subnet is a two-stream network that explores both temporal and spatial information. The second is a body-parted network, which captures micro spatial features and macro temporal features. The third is an attention network, whose main contribution is to focus on the key frames and feature channels that are highly related to the action classes in a skeleton sequence. The last subnet, a frame-difference network, mainly processes the joint changes between consecutive frames. The four subnets are ensembled by late fusion. The key requirement of the ensemble method is that each subnet performs reasonably well on its own and that the subnets are diverse. Each subnet shares a well-performing base network, and the differences between subnets guarantee the diversity. Experimental results show that the ensemble network achieves state-of-the-art performance on three widely used datasets.
Tasks	Skeleton Based Action Recognition, Temporal Action Localization
Published	2018-01-08
URL	http://arxiv.org/abs/1801.02475v2
PDF	http://arxiv.org/pdf/1801.02475v2.pdf
PWC	https://paperswithcode.com/paper/ensemble-one-dimensional-convolution-neural
Repo
Framework
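Two ingredients named in the abstract above, joint changes between consecutive frames and late fusion of subnet outputs, can be illustrated with a minimal sketch. It assumes a skeleton sequence stored as a list of frames, each a list of (x, y, z) joint coordinates, and it assumes fusion by averaging class scores; the abstract says only "late fusion", so the averaging rule, the function names, and the toy numbers are all hypothetical.

```python
def frame_differences(sequence):
    """Per-joint coordinate change between consecutive frames.
    `sequence` is a list of frames; each frame is a list of (x, y, z) joints."""
    diffs = []
    for prev, curr in zip(sequence, sequence[1:]):
        diffs.append([
            tuple(c - p for p, c in zip(joint_prev, joint_curr))
            for joint_prev, joint_curr in zip(prev, curr)
        ])
    return diffs


def late_fusion(score_lists):
    """Average the class scores produced by several subnets (assumed rule)."""
    n = len(score_lists)
    return [sum(scores) / n for scores in zip(*score_lists)]


# Toy sequence: 3 frames, 2 joints each.
sequence = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.5, 0.0, 0.0), (1.0, 2.0, 1.0)],
    [(1.0, 0.0, 0.0), (1.0, 2.0, 3.0)],
]
diffs = frame_differences(sequence)

# Hypothetical class scores from four subnets over three action classes.
fused = late_fusion([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

A sequence of T frames yields T - 1 difference frames, which is the kind of input a frame-difference subnet would consume.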

Scalable Learning in Reproducing Kernel Krein Spaces


Title	Scalable Learning in Reproducing Kernel Krein Spaces
Authors	Dino Oglic, Thomas Gärtner
Abstract	We provide the first mathematically complete derivation of the Nyström method for low-rank approximation of indefinite kernels and propose an efficient method for finding an approximate eigendecomposition of such kernel matrices. Building on this result, we devise highly scalable methods for learning in reproducing kernel Kreĭn spaces. The devised approaches provide a principled and theoretically well-founded means to tackle large scale learning problems with indefinite kernels. The main motivation for our work comes from problems with structured representations (e.g., graphs, strings, time-series), where it is relatively easy to devise a pairwise (dis)similarity function based on intuition and/or knowledge of domain experts. Such functions are typically not positive definite and it is often well beyond the expertise of practitioners to verify this condition. The effectiveness of the devised approaches is evaluated empirically using indefinite kernels defined on structured and vectorial data representations.
Tasks	Time Series
Published	2018-09-06
URL	https://arxiv.org/abs/1809.02157v2
PDF	https://arxiv.org/pdf/1809.02157v2.pdf
PWC	https://paperswithcode.com/paper/large-scale-learning-with-krein-kernels
Repo
Framework

Compact and Computationally Efficient Representation of Deep Neural Networks


Title	Compact and Computationally Efficient Representation of Deep Neural Networks
Authors	Simon Wiedemann, Klaus-Robert Müller, Wojciech Samek
Abstract	At the core of any inference procedure in deep neural networks are dot product operations, which are the components that require the highest computational resources. A common approach to reduce the cost of inference is to reduce its memory complexity by lowering the entropy of the weight matrices of the neural network, e.g., by pruning and quantizing their elements. However, the quantized weight matrices are then usually represented either by a dense or sparse matrix storage format, whose associated dot product complexity is not bounded by the entropy of the matrix. This means that the associated inference complexity ultimately depends on the implicit statistical assumptions that these matrix representations make about the weight distribution, which can in many cases be suboptimal. In this paper we address this issue and present new efficient representations for matrices with low entropy statistics. These new matrix formats have the novel property that their memory and algorithmic complexity are implicitly bounded by the entropy of the matrix, consequently implying that they are guaranteed to become more efficient as the entropy of the matrix is reduced. In our experiments we show that performing the dot product under these new matrix formats can indeed be more energy and time efficient under practically relevant assumptions. For instance, we are able to attain up to x42 compression ratios, x5 speed ups and x90 energy savings when we convert in a lossless manner the weight matrices of state-of-the-art networks such as AlexNet, VGG-16, ResNet152 and DenseNet into the new matrix formats and benchmark their respective dot product operation.
Tasks
Published	2018-05-27
URL	http://arxiv.org/abs/1805.10692v2
PDF	http://arxiv.org/pdf/1805.10692v2.pdf
PWC	https://paperswithcode.com/paper/compact-and-computationally-efficient
Repo
Framework
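The abstract above ties storage and dot-product cost to the entropy of the weight matrix. As a rough illustration of that quantity, the sketch below computes the empirical Shannon entropy of the element values of a quantized matrix. Treating the matrix as a bag of independent elements is an assumption made here for simplicity, and the code does not implement the paper's matrix formats; the function name and the toy matrices are hypothetical.

```python
import math
from collections import Counter


def element_entropy(matrix):
    """Empirical Shannon entropy (bits per element) of the values in a matrix."""
    values = [v for row in matrix for v in row]
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical quantized weights: pruning makes 0 dominant, lowering entropy.
dense_like = [[1, 2, 3, 4], [4, 3, 2, 1]]
pruned = [[0, 0, 0, 4], [0, 0, 0, 0]]

h_dense = element_entropy(dense_like)   # four equally likely values
h_pruned = element_entropy(pruned)      # one value dominates
```

The pruned matrix has the lower entropy, which is the regime in which an entropy-bounded representation would be expected to pay off.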

Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs


Title	Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs
Authors	Maryam Abdolali, Nicolas Gillis, Mohammad Rahmati
Abstract	Sparse subspace clustering (SSC) is one of the current state-of-the-art methods for partitioning data points into the union of subspaces, with strong theoretical guarantees. However, it is not practical for large data sets as it requires solving a LASSO problem for each data point, where the number of variables in each LASSO problem is the number of data points. To improve the scalability of SSC, we propose to select a few sets of anchor points using a randomized hierarchical clustering method, and, for each set of anchor points, solve the LASSO problems for each data point allowing only anchor points to have a non-zero weight (this drastically reduces the number of variables). This generates a multilayer graph where each layer corresponds to a different set of anchor points. Using the Grassmann manifold of orthogonal matrices, the shared connectivity among the layers is summarized within a single subspace. Finally, we use $k$-means clustering within that subspace to cluster the data points, as is done by spectral clustering in SSC. We show on both synthetic and real-world data sets that the proposed method not only allows SSC to scale to large-scale data sets, but that it is also much more robust as it performs significantly better on noisy data and on data with close subspaces and outliers, while it is not prone to oversegmentation.
Tasks
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07648v2
PDF	http://arxiv.org/pdf/1802.07648v2.pdf
PWC	https://paperswithcode.com/paper/scalable-and-robust-sparse-subspace
Repo
Framework

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search


Title	Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Authors	Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess
Abstract	Learning policies on data synthesized by models can in principle quench the thirst of reinforcement learning algorithms for large amounts of real experience, which is often costly to acquire. However, simulating plausible experience de novo is a hard problem for many complex environments, often resulting in biases for model-based policy evaluation and search. Instead of de novo synthesis of data, here we assume logged, real experience and model alternative outcomes of this experience under counterfactual actions, i.e., actions that were not actually taken. Based on this, we propose the Counterfactually-Guided Policy Search (CF-GPS) algorithm for learning policies in POMDPs from off-policy experience. It leverages structural causal models for counterfactual evaluation of arbitrary policies on individual off-policy episodes. CF-GPS can improve on vanilla model-based RL algorithms by making use of available logged data to de-bias model predictions. In contrast to off-policy algorithms based on Importance Sampling which re-weight data, CF-GPS leverages a model to explicitly consider alternative outcomes, allowing the algorithm to make better use of experience data. We find empirically that these advantages translate into improved policy evaluation and search results on a non-trivial grid-world task. Finally, we show that CF-GPS generalizes the previously proposed Guided Policy Search and that reparameterization-based algorithms such as Stochastic Value Gradient can be interpreted as counterfactual methods.
Tasks
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06272v1
PDF	http://arxiv.org/pdf/1811.06272v1.pdf
PWC	https://paperswithcode.com/paper/woulda-coulda-shoulda-counterfactually-guided
Repo
Framework
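Counterfactual evaluation in a structural causal model is usually described in three steps: infer the noise consistent with the logged outcome (abduction), swap in the alternative action (intervention), and recompute the outcome (prediction). The toy sketch below walks through those steps for a single transition. The additive structural equation and all names are assumptions chosen for illustration; this is not the CF-GPS algorithm.

```python
def outcome(state, action, noise):
    """Toy structural equation (an assumption): next state is additive."""
    return state + action + noise


def infer_noise(state, action, observed_next_state):
    """Abduction: recover the noise that explains the logged transition."""
    return observed_next_state - state - action


def counterfactual_outcome(state, logged_action, observed_next_state, new_action):
    """What the next state would have been under `new_action`,
    holding the inferred noise fixed."""
    noise = infer_noise(state, logged_action, observed_next_state)
    return outcome(state, new_action, noise)


# Logged transition: from state 2, action +1 led to state 5, so the noise is 2.
cf_next = counterfactual_outcome(state=2, logged_action=1,
                                 observed_next_state=5, new_action=-1)
```

Because the noise is inferred from real logged data rather than sampled afresh, the counterfactual outcome stays anchored to what actually happened in that episode.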

Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks


Title	Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks
Authors	Oleg Sudakov, Evgeny Burnaev, Dmitry Koroteev
Abstract	We present a research study aimed at testing the applicability of machine learning techniques for predicting the permeability of digitized rock samples. We prepare a training set containing 3D images of sandstone samples imaged with X-ray microtomography and corresponding permeability values simulated with the Pore Network approach. We also use Minkowski functionals and Deep Learning-based descriptors of 3D images and 2D slices as input features for predictive model training and prediction. We compare the predictive power of various feature sets and methods. The latter include Gradient Boosting and various architectures of Deep Neural Networks (DNN). The results demonstrate the applicability of machine learning for image-based permeability prediction and open a new area of Digital Rock research.
Tasks
Published	2018-03-02
URL	http://arxiv.org/abs/1803.00758v2
PDF	http://arxiv.org/pdf/1803.00758v2.pdf
PWC	https://paperswithcode.com/paper/driving-digital-rock-towards-machine-learning
Repo
Framework

Physics-Based Generative Adversarial Models for Image Restoration and Beyond


Title	Physics-Based Generative Adversarial Models for Image Restoration and Beyond
Authors	Jinshan Pan, Jiangxin Dong, Yang Liu, Jiawei Zhang, Jimmy Ren, Jinhui Tang, Yu-Wing Tai, Ming-Hsuan Yang
Abstract	We present an algorithm to directly solve numerous image restoration problems (e.g., image deblurring, image dehazing, image deraining, etc.). These problems are highly ill-posed, and the common assumptions for existing methods are usually based on heuristic image priors. In this paper, we find that these problems can be solved by generative models with adversarial learning. However, the basic formulation of generative adversarial networks (GANs) does not generate realistic images, and some structures of the estimated images are usually not preserved well. Motivated by an interesting observation that the estimated results should be consistent with the observed inputs under the physics models, we propose a physics model constrained learning algorithm so that it can guide the estimation of the specific task in the conventional GAN framework. The proposed algorithm is trained in an end-to-end fashion and can be applied to a variety of image restoration and related low-level vision problems. Extensive experiments demonstrate that our method performs favorably against the state-of-the-art algorithms.
Tasks	Deblurring, Image Dehazing, Image Restoration, Rain Removal
Published	2018-08-02
URL	https://arxiv.org/abs/1808.00605v2
PDF	https://arxiv.org/pdf/1808.00605v2.pdf
PWC	https://paperswithcode.com/paper/physics-based-generative-adversarial-models
Repo
Framework

Modeling Taxi Drivers’ Behaviour for the Next Destination Prediction


Title	Modeling Taxi Drivers’ Behaviour for the Next Destination Prediction
Authors	Alberto Rossi, Gianni Barlacchi, Monica Bianchini, Bruno Lepri
Abstract	In this paper, we study how to model taxi drivers’ behaviour and geographical information for an interesting and challenging task: the next destination prediction in a taxi journey. Predicting the next location is a well studied problem in human mobility, which finds several applications in real-world scenarios, from optimizing the efficiency of electronic dispatching systems to predicting and reducing traffic jams. This task is normally modeled as a multiclass classification problem, where the goal is to select, among a set of already known locations, the next taxi destination. We present a Recurrent Neural Network (RNN) approach that models the taxi drivers’ behaviour and encodes the semantics of visited locations by using geographical information from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to predict the exact coordinates of the next destination, overcoming the limitation of outputting only locations from a limited set seen during the training phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge 2015 dataset (based on the city of Porto), obtaining better results than the competition winner whilst using less information, and on Manhattan and San Francisco datasets.
Tasks
Published	2018-07-21
URL	http://arxiv.org/abs/1807.08173v2
PDF	http://arxiv.org/pdf/1807.08173v2.pdf
PWC	https://paperswithcode.com/paper/modeling-taxi-drivers-behaviour-for-the-next
Repo
Framework

Quantized Compressive K-Means


Title	Quantized Compressive K-Means
Authors	Vincent Schellekens, Laurent Jacques
Abstract	The recent framework of compressive statistical learning aims at designing tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it estimates the centroids of data clusters from pooled, non-linear, random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources because the learning examples are compressed only after the sensing stage. The present work generalizes the sketching procedure initially defined in Compressive K-Means to a large class of periodic nonlinearities including hardware-friendly implementations that compressively acquire entire datasets. This idea is exemplified in a Quantized Compressive K-Means procedure, a variant of CKM that leverages 1-bit universal quantization (i.e. retaining the least significant bit of a standard uniform quantizer) as the periodic sketch nonlinearity. Trading for this resource-efficient signature (standard in most acquisition schemes) has almost no impact on the clustering performance, as illustrated by numerical experiments.
Tasks	Quantization
Published	2018-04-26
URL	http://arxiv.org/abs/1804.10109v2
PDF	http://arxiv.org/pdf/1804.10109v2.pdf
PWC	https://paperswithcode.com/paper/quantized-compressive-k-means
Repo
Framework
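The 1-bit universal quantizer described in the abstract above (the least significant bit of a uniform quantizer) is a periodic square wave, and the sketch is a pooled average of such bits over the dataset. The minimal sketch below illustrates both pieces. The Gaussian random frequencies, the uniform dithers, the step size, and all function names are illustrative assumptions, not the paper's exact construction, and the centroid-recovery step is omitted.

```python
import math
import random


def universal_quantize(t, delta=1.0):
    """Least significant bit of a uniform quantizer with step `delta`,
    mapped to {+1, -1}. The result is periodic in t with period 2 * delta."""
    return 1 if math.floor(t / delta) % 2 == 0 else -1


def quantized_sketch(points, frequencies, dithers, delta=1.0):
    """Pool (average) the 1-bit signatures of all points into one sketch."""
    sketch = []
    for freq, dither in zip(frequencies, dithers):
        total = 0
        for point in points:
            projection = sum(f * x for f, x in zip(freq, point))
            total += universal_quantize(projection + dither, delta)
        sketch.append(total / len(points))
    return sketch


rng = random.Random(0)
dim, sketch_size = 2, 8
# Gaussian frequencies and uniform dithers are illustrative assumptions.
frequencies = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(sketch_size)]
dithers = [rng.uniform(0.0, 2.0) for _ in range(sketch_size)]
points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(100)]

sketch = quantized_sketch(points, frequencies, dithers)
```

Each example contributes only `sketch_size` bits before pooling, which is what makes the signature cheap to acquire; the sketch length stays fixed no matter how many points are pooled.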

Learning to Speed Up Structured Output Prediction


Title	Learning to Speed Up Structured Output Prediction
Authors	Xingyuan Pan, Vivek Srikumar
Abstract	Predicting structured outputs can be computationally onerous due to the combinatorially large output spaces. In this paper, we focus on reducing the prediction time of a trained black-box structured classifier without losing accuracy. To do so, we train a speedup classifier that learns to mimic a black-box classifier under the learning-to-search approach. As the structured classifier predicts more examples, the speedup classifier will operate as a learned heuristic to guide search to favorable regions of the output space. We present a mistake bound for the speedup classifier and identify inference situations where it can independently make correct judgments without input features. We evaluate our method on the task of entity and relation extraction and show that the speedup classifier outperforms even greedy search in terms of speed without loss of accuracy.
Tasks	Relation Extraction
Published	2018-06-11
URL	http://arxiv.org/abs/1806.04245v1
PDF	http://arxiv.org/pdf/1806.04245v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-speed-up-structured-output
Repo
Framework