Paper Group ANR 56
Ensemble Methods of Classification for Power Systems Security Assessment. Firefly Algorithm for optimization problems with non-continuous variables: A Review and Analysis. Shamela: A Large-Scale Historical Arabic Corpus. On the Complexity of Connection Games. Ordinal Constrained Binary Code Learning for Nearest Neighbor Search. Deep Attributes Driven Multi-Camera Person Re-identification. Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery. Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index. The Effects of Data Size and Frequency Range on Distributional Semantic Models. Deep Learning of Part-based Representation of Data Using Sparse Autoencoders with Nonnegativity Constraints. Large-Scale Electron Microscopy Image Segmentation in Spark. Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction. On the exact recovery of sparse signals via conic relaxations. The Image Torque Operator for Contour Processing. Bank Card Usage Prediction Exploiting Geolocation Information.
Ensemble Methods of Classification for Power Systems Security Assessment
Title | Ensemble Methods of Classification for Power Systems Security Assessment |
Authors | Alexei Zhukov, Victor Kurbatsky, Nikita Tomin, Denis Sidorov, Daniil Panasetsky, Aoife Foley |
Abstract | One of the most promising approaches for complex technical systems analysis employs ensemble methods of classification. Ensemble methods make it possible to build reliable decision rules for feature-space classification in the presence of many possible states of the system. In this paper, novel techniques based on decision trees are used to evaluate the reliability of the regime of electric power systems. We propose a hybrid approach based on random forest and boosting models. Such techniques can be applied to predict the interaction of increasing renewable power, storage devices, and the switching of smart loads from intelligent domestic appliances, heaters, air-conditioning units, and electric vehicles with the grid for enhanced decision making. The ensemble classification methods were tested on the modified 118-bus IEEE power system, showing that the proposed technique can be employed to examine whether the power system is secure under steady-state operating conditions. |
Tasks | Decision Making |
Published | 2016-01-07 |
URL | http://arxiv.org/abs/1601.01675v1 |
http://arxiv.org/pdf/1601.01675v1.pdf | |
PWC | https://paperswithcode.com/paper/ensemble-methods-of-classification-for-power |
Repo | |
Framework | |
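The core idea lends itself to a compact illustration. Below is a minimal sketch of a hybrid random-forest-plus-boosting classifier labeling operating states as secure or insecure; the synthetic regime features and the soft-voting combination are illustrative assumptions, not the authors' exact pipeline.

```python
# Hedged sketch: hybrid RF + boosting ensemble for secure/insecure states.
# Features and the voting scheme are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Hypothetical regime features: bus voltages, line loadings, generation margins.
X = rng.normal(size=(2000, 30))
y = (X[:, :5].sum(axis=1) + 0.3 * rng.normal(size=2000) > 0).astype(int)  # 1 = secure

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
hybrid = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("gb", GradientBoostingClassifier(n_estimators=200, random_state=0))],
    voting="soft",  # average predicted class probabilities from both models
)
hybrid.fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, hybrid.predict(X_te)))
```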
Firefly Algorithm for optimization problems with non-continuous variables: A Review and Analysis
Title | Firefly Algorithm for optimization problems with non-continuous variables: A Review and Analysis |
Authors | Surafel Luleseged Tilahun, Jean Medard T Ngnotchouye |
Abstract | The firefly algorithm is a swarm-based metaheuristic inspired by the flashing behavior of fireflies. It is effective and easy to implement, and it has been tested on problems from different disciplines and found to perform well. Even though the algorithm was originally proposed for optimization problems with continuous variables, it has been modified and used for problems with non-continuous variables, including binary- and integer-valued problems. In this paper, we give a detailed review of these modifications of the firefly algorithm for problems with non-continuous variables. The strengths and weaknesses of the modifications, along with possible future work, are also presented. |
Tasks | |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.07884v1 |
http://arxiv.org/pdf/1602.07884v1.pdf | |
PWC | https://paperswithcode.com/paper/firefly-algorithm-for-optimization-problems |
Repo | |
Framework | |
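A minimal sketch of a binary-adapted firefly algorithm of the kind the review surveys: continuous positions are updated with the standard attraction rule and then squashed through a sigmoid to sample binary solutions. The one-max test problem and all parameter values are illustrative assumptions.

```python
# Hedged sketch: binary firefly algorithm via sigmoid discretization.
import numpy as np

rng = np.random.default_rng(1)
n_fireflies, n_bits, n_iter = 20, 40, 100
beta0, gamma, alpha = 1.0, 1.0, 0.1  # attraction, absorption, randomness

def binarize(pos):
    """Map continuous positions to bits with a sigmoid probability."""
    return (1.0 / (1.0 + np.exp(-pos)) > rng.random(pos.shape)).astype(int)

def fitness(bits):
    return bits.sum(axis=-1)  # one-max: count the ones

pos = rng.normal(size=(n_fireflies, n_bits))
for _ in range(n_iter):
    bits = binarize(pos)
    fit = fitness(bits)
    for i in range(n_fireflies):
        for j in range(n_fireflies):
            if fit[j] > fit[i]:  # move i toward the brighter firefly j
                r2 = np.sum((pos[i] - pos[j]) ** 2)
                beta = beta0 * np.exp(-gamma * r2)
                pos[i] += beta * (pos[j] - pos[i]) + alpha * rng.normal(size=n_bits)
print("best one-max value:", fitness(binarize(pos)).max(), "of", n_bits)
```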
Shamela: A Large-Scale Historical Arabic Corpus
Title | Shamela: A Large-Scale Historical Arabic Corpus |
Authors | Yonatan Belinkov, Alexander Magidow, Maxim Romanov, Avi Shmidman, Moshe Koppel |
Abstract | Arabic is a widely-spoken language with a rich and long history spanning more than fourteen centuries. Yet existing Arabic corpora largely focus on the modern period or lack sufficient diachronic information. We develop a large-scale, historical corpus of Arabic of about 1 billion words from diverse periods of time. We clean this corpus, process it with a morphological analyzer, and enhance it by detecting parallel passages and automatically dating undated texts. We demonstrate its utility with selected case studies in which we show its application to the digital humanities. |
Tasks | |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08989v1 |
http://arxiv.org/pdf/1612.08989v1.pdf | |
PWC | https://paperswithcode.com/paper/shamela-a-large-scale-historical-arabic |
Repo | |
Framework | |
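One standard way to detect parallel passages across texts is shared word n-gram "shingles", in the spirit of the corpus enhancement the paper describes. The exact matching procedure used for Shamela is not reproduced here; this is an illustrative sketch under that assumption.

```python
# Hedged sketch: candidate parallel passages via shared n-gram shingles.
from collections import defaultdict

def shingles(text, n=5):
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def parallel_candidates(docs, n=5, min_shared=3):
    """Return document-id pairs sharing at least `min_shared` n-gram shingles."""
    index = defaultdict(set)          # shingle -> set of doc ids
    for doc_id, text in docs.items():
        for sh in shingles(text, n):
            index[sh].add(doc_id)
    counts = defaultdict(int)         # (doc_a, doc_b) -> shared shingle count
    for ids in index.values():
        ids = sorted(ids)
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                counts[(ids[i], ids[j])] += 1
    return [(pair, c) for pair, c in counts.items() if c >= min_shared]
```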
On the Complexity of Connection Games
Title | On the Complexity of Connection Games |
Authors | Édouard Bonnet, Florian Jamain, Abdallah Saffidine |
Abstract | In this paper, we study three of the most widely played connection games: Havannah, Twixt, and Slither. We show that determining the outcome of an arbitrary input position is PSPACE-complete in all three cases. Our reductions are based on the popular graph problem Generalized Geography and on Hex itself. We also consider the complexity of generalizations of Hex parameterized by the length of the solution and establish that while Short Generalized Hex is W[1]-hard, Short Hex is FPT. Finally, we prove that the ultra-weak solution to the empty starting position in Hex cannot be fully adapted to any of these three games. |
Tasks | |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04715v1 |
http://arxiv.org/pdf/1605.04715v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-complexity-of-connection-games |
Repo | |
Framework | |
Ordinal Constrained Binary Code Learning for Nearest Neighbor Search
Title | Ordinal Constrained Binary Code Learning for Nearest Neighbor Search |
Authors | Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang |
Abstract | Recent years have witnessed extensive attention to binary code learning, a.k.a. hashing, for nearest neighbor search. High-dimensional data points can be quantized into binary codes to give an efficient similarity approximation via Hamming distance. Among existing schemes, ranking-based hashing is a recent promising direction that targets preserving the ordinal relations of rankings in the Hamming space to minimize retrieval loss. However, the number of ranking tuples, which encode the ordinal relations, is quadratic or cubic in the number of training samples, so given a large-scale training data set it is very expensive to embed such ranking tuples in binary code learning. Besides, it remains difficult to build ranking tuples efficiently for most ranking-preserving hashing methods, which are deployed in an ordinal graph-based setting. To handle these problems, we propose a novel ranking-preserving hashing method, dubbed Ordinal Constraint Hashing (OCH), which efficiently learns the optimal hashing functions with a graph-based approximation to embed the ordinal relations. The core idea is to reduce the size of the ordinal graph with an ordinal constraint projection, which preserves the ordinal relations through a small data set (such as clusters or random samples). In particular, to learn such hash functions effectively, we further relax the discrete constraints and design a specific stochastic gradient descent algorithm for optimization. Experimental results on three large-scale visual search benchmark datasets, i.e. LabelMe, Tiny100K, and GIST1M, show that the proposed OCH method achieves superior performance over state-of-the-art approaches. |
Tasks | |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06362v1 |
http://arxiv.org/pdf/1611.06362v1.pdf | |
PWC | https://paperswithcode.com/paper/ordinal-constrained-binary-code-learning-for |
Repo | |
Framework | |
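A minimal sketch of the relaxed, SGD-trained ranking objective this line of work builds on: learn a linear hash h(x) = sign(Wx), relax sign to tanh during training, and push query-positive distances below query-negative ones on ordinal triplets. The triplet sampling, margin, and the use of code inner products as a Hamming-distance surrogate are illustrative assumptions, not the exact OCH formulation.

```python
# Hedged sketch: ordinal triplet hashing with a tanh relaxation and SGD.
import torch

d, bits, n = 32, 16, 500
g = torch.Generator().manual_seed(0)
X = torch.randn(n, d, generator=g)
W = torch.randn(bits, d, generator=g, requires_grad=True)
opt = torch.optim.SGD([W], lr=0.05)

for step in range(300):
    i, j, k = torch.randint(0, n, (3, 64), generator=g)
    # keep only triplets where x_j really is closer to x_i than x_k is
    dj = ((X[i] - X[j]) ** 2).sum(1)
    dk = ((X[i] - X[k]) ** 2).sum(1)
    mask = dj < dk
    hq, ha, hb = torch.tanh(X[i] @ W.T), torch.tanh(X[j] @ W.T), torch.tanh(X[k] @ W.T)
    # high code agreement ~ small Hamming distance, so hinge on inner products
    loss = torch.relu(0.5 + (hq * (hb - ha)).sum(1))[mask].mean()
    opt.zero_grad(); loss.backward(); opt.step()

codes = torch.sign(X @ W.T)  # final binary codes in {-1, +1}
```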
Deep Attributes Driven Multi-Camera Person Re-identification
Title | Deep Attributes Driven Multi-Camera Person Re-identification |
Authors | Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian |
Abstract | The visual appearance of a person is easily affected by many factors like pose variations, viewpoint changes, and camera parameter differences. This makes person Re-Identification (ReID) among multiple cameras a very challenging task. This work is motivated to learn mid-level human attributes which are robust to such visual appearance variations, and we propose a semi-supervised attribute learning framework which progressively boosts the accuracy of attributes using only a limited amount of labeled data. Specifically, this framework involves three training stages. A deep Convolutional Neural Network (dCNN) is first trained on an independent dataset labeled with attributes. It is then fine-tuned on another dataset labeled only with person IDs using our defined triplet loss. Finally, the updated dCNN predicts attribute labels for the target dataset, which is combined with the independent dataset for the final round of fine-tuning. The predicted attributes, namely deep attributes, exhibit superior generalization ability across different datasets. By directly using the deep attributes with a simple cosine distance, we obtain surprisingly good accuracy on four person ReID datasets. Experiments also show that a simple metric learning module further boosts our method, making it significantly outperform many recent works. |
Tasks | Metric Learning, Person Re-Identification |
Published | 2016-05-11 |
URL | http://arxiv.org/abs/1605.03259v2 |
http://arxiv.org/pdf/1605.03259v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-attributes-driven-multi-camera-person-re |
Repo | |
Framework | |
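Two ingredients of the pipeline are easy to sketch: fine-tuning an embedding network with a triplet loss on person IDs, and matching people across cameras by cosine similarity on the resulting vectors. The tiny MLP backbone and margin value below are illustrative assumptions, not the paper's dCNN.

```python
# Hedged sketch: triplet-loss fine-tuning plus cosine-distance matching.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
triplet = nn.TripletMarginLoss(margin=0.3)
opt = torch.optim.Adam(backbone.parameters(), lr=1e-3)

def train_step(anchor, positive, negative):
    """One update: pull same-ID pairs together, push different-ID pairs apart."""
    loss = triplet(backbone(anchor), backbone(positive), backbone(negative))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def cosine_match(query_feats, gallery_feats):
    """Rank gallery identities for each query by cosine similarity."""
    q = F.normalize(query_feats, dim=1)
    g = F.normalize(gallery_feats, dim=1)
    return (q @ g.T).argsort(dim=1, descending=True)
```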
Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery
Title | Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery |
Authors | Jordan M. Malof, Kyle Bradbury, Leslie M. Collins, Richard G. Newell |
Abstract | The quantity of small scale solar photovoltaic (PV) arrays in the United States has grown rapidly in recent years. As a result, there is substantial interest in high quality information about the quantity, power capacity, and energy generated by such arrays, including at a high spatial resolution (e.g., counties, cities, or even smaller regions). Unfortunately, existing methods for obtaining this information, such as surveys and utility interconnection filings, are limited in their completeness and spatial resolution. This work presents a computer algorithm that automatically detects PV panels using very high resolution color satellite imagery. The approach potentially offers a fast, scalable method for obtaining accurate information on PV array location and size, and at much higher spatial resolutions than are currently available. The method is validated using a very large (135 km^2) collection of publicly available [1] aerial imagery, with over 2,700 human annotated PV array locations. The results demonstrate the algorithm is highly effective on a per-pixel basis. It is likewise effective at object-level PV array detection, but with significant potential for improvement in estimating the precise shape/size of the PV arrays. These results are the first of their kind for the detection of solar PV in aerial imagery, demonstrating the feasibility of the approach and establishing a baseline performance for future investigations. |
Tasks | |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.06029v1 |
http://arxiv.org/pdf/1607.06029v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-detection-of-solar-photovoltaic |
Repo | |
Framework | |
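A minimal sketch of a per-pixel detector in the spirit of the paper: extract simple color statistics from a window around each pixel and score it with a random forest. The window size and feature choice are illustrative assumptions, not the authors' exact feature set.

```python
# Hedged sketch: per-pixel PV scoring from windowed color statistics.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(img, r=7):
    """Mean and std of each RGB channel in a (2r+1)x(2r+1) window per pixel."""
    h, w, _ = img.shape
    feats, coords = [], []
    for y in range(r, h - r):
        for x in range(r, w - r):
            win = img[y - r:y + r + 1, x - r:x + r + 1].reshape(-1, 3)
            feats.append(np.concatenate([win.mean(0), win.std(0)]))
            coords.append((y, x))
    return np.array(feats), coords

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# Given training imagery and a per-pixel PV mask:
#   F_tr, coords = window_features(train_img); y_tr = mask at those coords
#   clf.fit(F_tr, y_tr)
#   scores = clf.predict_proba(F_te)[:, 1]  # per-pixel PV confidence map
```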
Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index
Title | Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index |
Authors | Susan Athey, Raj Chetty, Guido Imbens, Hyunseung Kang |
Abstract | Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcoming this missing-data problem is to analyze treatment effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit the fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples: an experimental sample containing data about the treatment indicator and the surrogates, and an observational sample containing information about the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates. |
Tasks | Causal Inference |
Published | 2016-03-30 |
URL | https://arxiv.org/abs/1603.09326v3 |
https://arxiv.org/pdf/1603.09326v3.pdf | |
PWC | https://paperswithcode.com/paper/estimating-treatment-effects-using-multiple |
Repo | |
Framework | |
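A minimal sketch of the surrogate-index idea in the two-sample setting: regress the long-term outcome on the surrogates in the observational sample, predict that index for each experimental unit, and difference the index means by treatment arm. The linear model and the synthetic data are illustrative assumptions.

```python
# Hedged sketch: ATE estimation via a surrogate index across two samples.
import numpy as np
from sklearn.linear_model import LinearRegression

def surrogate_index_ate(S_obs, y_obs, S_exp, treat_exp):
    """Estimate the ATE on the long-term outcome via the surrogate index."""
    index_model = LinearRegression().fit(S_obs, y_obs)   # E[Y | S] in obs sample
    idx = index_model.predict(S_exp)                     # surrogate index per unit
    return idx[treat_exp == 1].mean() - idx[treat_exp == 0].mean()

# Synthetic example where treatment shifts every surrogate by 0.2:
rng = np.random.default_rng(0)
S_obs = rng.normal(size=(5000, 50))
y_obs = S_obs[:, :10].sum(1) + rng.normal(size=5000)   # outcome depends on 10 surrogates
treat = rng.integers(0, 2, size=2000)
S_exp = rng.normal(size=(2000, 50)) + 0.2 * treat[:, None]
print("estimated ATE (true value 2.0):", surrogate_index_ate(S_obs, y_obs, S_exp, treat))
```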
The Effects of Data Size and Frequency Range on Distributional Semantic Models
Title | The Effects of Data Size and Frequency Range on Distributional Semantic Models |
Authors | Magnus Sahlgren, Alessandro Lenci |
Abstract | This paper investigates the effects of data size and frequency range on distributional semantic models. We compare the performance of a number of representative models for several test settings over data of varying sizes, and over test items of various frequency. Our results show that neural network-based models underperform when the data is small, and that the most reliable model over data of varying sizes and frequency ranges is the inverted factorized model. |
Tasks | |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08293v1 |
http://arxiv.org/pdf/1609.08293v1.pdf | |
PWC | https://paperswithcode.com/paper/the-effects-of-data-size-and-frequency-range |
Repo | |
Framework | |
Deep Learning of Part-based Representation of Data Using Sparse Autoencoders with Nonnegativity Constraints
Title | Deep Learning of Part-based Representation of Data Using Sparse Autoencoders with Nonnegativity Constraints |
Authors | Ehsan Hosseini-Asl, Jacek M. Zurada, Olfa Nasraoui |
Abstract | We demonstrate a new deep learning autoencoder network, trained by a nonnegativity constraint algorithm (NCAE), that learns features which show part-based representation of data. The learning algorithm is based on constraining negative weights. The performance of the algorithm is assessed based on decomposing data into parts and its prediction performance is tested on three standard image data sets and one text dataset. The results indicate that the nonnegativity constraint forces the autoencoder to learn features that amount to a part-based representation of data, while improving sparsity and reconstruction quality in comparison with the traditional sparse autoencoder and Nonnegative Matrix Factorization. It is also shown that this newly acquired representation improves the prediction performance of a deep neural network. |
Tasks | |
Published | 2016-01-12 |
URL | http://arxiv.org/abs/1601.02733v1 |
http://arxiv.org/pdf/1601.02733v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-of-part-based-representation-of |
Repo | |
Framework | |
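The core idea admits a short sketch: a sparse autoencoder whose loss adds a penalty on negative weights, pushing the learned features toward a nonnegative, part-based representation. The quadratic negative-weight penalty and the coefficients below are illustrative assumptions about the NCAE objective.

```python
# Hedged sketch: sparse autoencoder with a nonnegativity penalty on weights.
import torch
import torch.nn as nn

enc = nn.Linear(784, 100)
dec = nn.Linear(100, 784)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

def ncae_loss(x, sparsity_coef=1e-3, nonneg_coef=1e-2):
    h = torch.sigmoid(enc(x))
    recon = torch.sigmoid(dec(h))
    reconstruction = ((recon - x) ** 2).mean()
    sparsity = h.abs().mean()  # encourage few active hidden units
    # penalize only the negative entries of the weight matrices
    nonneg = sum((torch.clamp(w, max=0.0) ** 2).sum()
                 for w in (enc.weight, dec.weight))
    return reconstruction + sparsity_coef * sparsity + nonneg_coef * nonneg

x = torch.rand(32, 784)  # e.g., a batch of flattened images
loss = ncae_loss(x); opt.zero_grad(); loss.backward(); opt.step()
```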
Large-Scale Electron Microscopy Image Segmentation in Spark
Title | Large-Scale Electron Microscopy Image Segmentation in Spark |
Authors | Stephen M. Plaza, Stuart E. Berg |
Abstract | The emerging field of connectomics aims to unlock the mysteries of the brain by understanding the connectivity between neurons. To map this connectivity, we acquire thousands of electron microscopy (EM) images with nanometer-scale resolution. After aligning these images, the resulting dataset has the potential to reveal the shapes of neurons and the synaptic connections between them. However, imaging the brain of even a tiny organism like the fruit fly yields terabytes of data. It can take years of manual effort to examine such image volumes and trace their neuronal connections. One solution is to apply image segmentation algorithms to help automate the tracing tasks. In this paper, we propose a novel strategy to apply such segmentation on very large datasets that exceed the capacity of a single machine. Our solution is robust to potential segmentation errors which could otherwise severely compromise the quality of the overall segmentation, for example those due to poor classifier generalizability or anomalies in the image dataset. We implement our algorithms in a Spark application which minimizes disk I/O, and apply them to a few large EM datasets, revealing both their effectiveness and scalability. We hope this work will encourage external contributions to EM segmentation by providing 1) a flexible plugin architecture that deploys easily on different cluster environments and 2) an in-memory representation of segmentation that could be conducive to new advances. |
Tasks | Electron Microscopy Image Segmentation, Semantic Segmentation |
Published | 2016-04-01 |
URL | http://arxiv.org/abs/1604.00385v1 |
http://arxiv.org/pdf/1604.00385v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-electron-microscopy-image |
Repo | |
Framework | |
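The overall pattern can be sketched briefly: distribute per-block segmentation over a Spark cluster and keep intermediate results in memory. The block partitioning, the placeholder segmenter, and the stitch step are illustrative assumptions, not the paper's plugin architecture.

```python
# Hedged sketch: per-block EM segmentation distributed with PySpark.
import numpy as np
from pyspark import SparkContext

def segment_block(block):
    """Placeholder per-block segmenter: threshold voxels into a label mask."""
    coords, voxels = block
    return coords, (voxels > voxels.mean()).astype(np.uint8)

sc = SparkContext(appName="em-segmentation-sketch")
# Hypothetical list of (block_coordinates, voxel_array) pairs for one volume.
blocks = [((z, y, x), np.random.rand(64, 64, 64))
          for z in range(2) for y in range(2) for x in range(2)]
labeled = (sc.parallelize(blocks, numSlices=8)
             .map(segment_block)      # segment each block in parallel
             .cache())                # keep label blocks in memory, minimizing disk I/O
print("blocks segmented:", labeled.count())
# A real pipeline would next merge labels across block boundaries (stitching).
sc.stop()
```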
Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction
Title | Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction |
Authors | Xingguo Li, Raman Arora, Han Liu, Jarvis Haupt, Tuo Zhao |
Abstract | We propose a stochastic variance reduced optimization algorithm for solving sparse learning problems with cardinality constraints. Sufficient conditions are provided, under which the proposed algorithm enjoys strong linear convergence guarantees and optimal estimation accuracy in high dimensions. We further extend the proposed algorithm to an asynchronous parallel variant with a near linear speedup. Numerical experiments demonstrate the efficiency of our algorithm in terms of both parameter estimation and computational performance. |
Tasks | Sparse Learning, Stochastic Optimization |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02711v5 |
http://arxiv.org/pdf/1605.02711v5.pdf | |
PWC | https://paperswithcode.com/paper/nonconvex-sparse-learning-via-stochastic |
Repo | |
Framework | |
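A minimal sketch of the algorithmic template the paper analyzes: stochastic variance reduced gradients (SVRG) combined with a hard-thresholding step that enforces the cardinality constraint ||w||_0 <= k. The step size, epoch length, and least-squares objective are illustrative assumptions.

```python
# Hedged sketch: SVRG with hard thresholding for cardinality-constrained
# least squares.
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero the rest."""
    out = np.zeros_like(w)
    top = np.argsort(np.abs(w))[-k:]
    out[top] = w[top]
    return out

def svrg_sparse(X, y, k, eta=0.01, epochs=20, m=100):
    n, d = X.shape
    w = np.zeros(d)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        full_grad = X.T @ (X @ w - y) / n      # anchor gradient at snapshot w
        w_s, v = w.copy(), w.copy()
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced stochastic gradient
            g = X[i] * (X[i] @ v - y[i]) - X[i] * (X[i] @ w_s - y[i]) + full_grad
            v = hard_threshold(v - eta * g, k)  # project onto the sparsity set
        w = v
    return w
```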
On the exact recovery of sparse signals via conic relaxations
Title | On the exact recovery of sparse signals via conic relaxations |
Authors | Hongbo Dong |
Abstract | In this note we compare two recently proposed semidefinite relaxations for the sparse linear regression problem, by Pilanci, Wainwright and El Ghaoui (Sparse learning via boolean relaxations, 2015) and by Dong, Chen and Linderoth (Relaxation vs. Regularization: A conic optimization perspective of statistical variable selection, 2015). We focus on the cardinality-constrained formulation and prove that the relaxation proposed by Dong et al. is theoretically no weaker than the one proposed by Pilanci et al. Therefore any sufficient condition for exact recovery derived for the Pilanci et al. relaxation can be readily applied to the other relaxation, including their results on high-probability recovery for the Gaussian ensemble. Finally, we provide empirical evidence that the relaxation by Dong et al. requires far fewer observations to guarantee recovery of the true support. |
Tasks | Sparse Learning |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04572v1 |
http://arxiv.org/pdf/1603.04572v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-exact-recovery-of-sparse-signals-via |
Repo | |
Framework | |
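One common way to write the cardinality-constrained problem both relaxations start from introduces a boolean indicator z_i for whether variable i enters the support; relaxing z to [0,1]^p and dualizing the inner problem is what yields convex (semidefinite-representable) relaxations of the kind compared in the note. The exact relaxations themselves are not reproduced here.

```latex
\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2
\quad \text{s.t.} \quad \|\beta\|_0 \le k
\qquad\Longleftrightarrow\qquad
\min_{z \in \{0,1\}^p,\; \mathbf{1}^{\top} z \le k}\;
\min_{\beta \in \mathbb{R}^p} \|y - X \operatorname{diag}(z)\,\beta\|_2^2
```

The equivalence holds because diag(z)β forces β's support into {i : z_i = 1}, so the inner minimization ranges exactly over k-sparse vectors.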
The Image Torque Operator for Contour Processing
Title | The Image Torque Operator for Contour Processing |
Authors | Morimichi Nishigaki, Cornelia Fermüller |
Abstract | Contours are salient features for image description, but the detection and localization of boundary contours is still considered a challenging problem. This paper introduces a new tool for edge processing that implements the Gestalt idea of edge grouping. This tool is a mid-level image operator, called the Torque operator, designed to help detect closed contours in images. The Torque operator takes as input the raw image and creates an image map by computing, from the image gradients within regions of multiple sizes, a measure of how well the edges align to form closed convex contours. Fundamental properties of the torque are explored and illustrated through examples. It is then applied in purely bottom-up processing in a variety of applications, including edge detection, visual attention, and segmentation, and is experimentally demonstrated to be a useful tool that can improve existing techniques. Finally, its extension as a more general grouping mechanism and its application to object recognition are discussed. |
Tasks | Edge Detection, Object Recognition |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04669v1 |
http://arxiv.org/pdf/1601.04669v1.pdf | |
PWC | https://paperswithcode.com/paper/the-image-torque-operator-for-contour |
Repo | |
Framework | |
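A minimal sketch of the torque idea: within a patch, sum the cross products between each pixel's displacement from the patch center and its edge (gradient-tangent) direction, so edges wrapping consistently around the center score high. The normalization by patch area and the tangent convention are illustrative assumptions about the operator's exact definition.

```python
# Hedged sketch: a torque-like map from gradient tangents within patches.
import numpy as np

def torque_map(gray, radius=8):
    gy, gx = np.gradient(gray.astype(float))
    # tangent direction is perpendicular to the gradient
    ty, tx = gx, -gy
    h, w = gray.shape
    out = np.zeros((h, w))
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    area = (2 * radius + 1) ** 2
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            py = ty[y - radius:y + radius + 1, x - radius:x + radius + 1]
            px = tx[y - radius:y + radius + 1, x - radius:x + radius + 1]
            # 2-D cross product r x t of displacement and tangent, summed
            out[y, x] = np.sum(xs * py - ys * px) / area
    return out  # large |values| suggest closed contours around (y, x)
```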
Bank Card Usage Prediction Exploiting Geolocation Information
Title | Bank Card Usage Prediction Exploiting Geolocation Information |
Authors | Martin Wistuba, Nghia Duong-Trung, Nicolas Schilling, Lars Schmidt-Thieme |
Abstract | We describe the solution of team ISMLL for both tasks of the ECML-PKDD 2016 Discovery Challenge on Bank Card Usage. Our solution rests on three pillars: gradient boosted decision trees as a strong regression and classification model, an intensive search for good hyperparameter configurations, and strong features that exploit geolocation information. This approach achieved the best performance on the public leaderboard for the first task and a decent fourth position for the second task. |
Tasks | |
Published | 2016-10-13 |
URL | http://arxiv.org/abs/1610.03996v1 |
http://arxiv.org/pdf/1610.03996v1.pdf | |
PWC | https://paperswithcode.com/paper/bank-card-usage-prediction-exploiting |
Repo | |
Framework | |
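The shape of the solution can be sketched: engineer geolocation-derived features and feed them to gradient boosted trees. The haversine helper and the specific features (distances of transactions from a home location) are illustrative assumptions, not the team's actual feature set.

```python
# Hedged sketch: geolocation features feeding gradient boosted trees.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between coordinate pairs, in kilometers."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def geo_features(tx_lat, tx_lon, home_lat, home_lon):
    """Per-customer features: mean/max/std distance of transactions from home."""
    d = haversine_km(tx_lat, tx_lon, home_lat, home_lon)
    return np.array([d.mean(), d.max(), d.std()])

model = GradientBoostingClassifier(n_estimators=300, max_depth=3)
# model.fit(feature_matrix, labels)  # one row of geo (+ other) features per customer
```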