Paper Group ANR 603
Dynamic Task Allocation for Crowdsourcing Settings
Title | Dynamic Task Allocation for Crowdsourcing Settings |
Authors | Angela Zhou, Irineo Cabreros, Karan Singh |
Abstract | We consider the problem of optimal budget allocation for crowdsourcing problems, allocating users to tasks to maximize our final confidence in the crowdsourced answers. Such an optimized worker assignment method allows us to boost the efficacy of any popular crowdsourcing estimation algorithm. We consider a mutual information interpretation of the crowdsourcing problem, which leads to a stochastic subset selection problem with a submodular objective function. We present experimental simulation results which demonstrate the effectiveness of our dynamic task allocation method for achieving higher accuracy, possibly requiring fewer labels, as well as improving upon a previous method which is sensitive to the proportion of users to questions. |
Tasks | |
Published | 2017-01-30 |
URL | http://arxiv.org/abs/1701.08795v2 |
http://arxiv.org/pdf/1701.08795v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-task-allocation-for-crowdsourcing |
Repo | |
Framework | |
Time Series Anomaly Detection; Detection of anomalous drops with limited features and sparse examples in noisy highly periodic data
Title | Time Series Anomaly Detection; Detection of anomalous drops with limited features and sparse examples in noisy highly periodic data |
Authors | Dominique T. Shipmon, Jason M. Gurevitch, Paolo M. Piselli, Stephen T. Edwards |
Abstract | Google uses continuous streams of data from industry partners in order to deliver accurate results to users. Unexpected drops in traffic can be an indication of an underlying issue and may be an early warning that remedial action may be necessary. Detecting such drops is non-trivial because streams are variable and noisy, with roughly regular spikes (in many different shapes) in traffic data. We investigated the question of whether or not we can predict anomalies in these data streams. Our goal is to utilize Machine Learning and statistical approaches to classify anomalous drops in periodic, but noisy, traffic patterns. Since we do not have a large body of labeled examples to directly apply supervised learning for anomaly classification, we approached the problem in two parts. First we used TensorFlow to train our various models including DNNs, RNNs, and LSTMs to perform regression and predict the expected value in the time series. Secondly we created anomaly detection rules that compared the actual values to predicted values. Since the problem requires finding sustained anomalies, rather than just short delays or momentary inactivity in the data, our two detection methods focused on continuous sections of activity rather than just single points. We tried multiple combinations of our models and rules and found that using the intersection of our two anomaly detection methods proved to be an effective method of detecting anomalies on almost all of our models. In the process we also found that not all data fell within our experimental assumptions, as one data stream had no periodicity, and therefore no time based model could predict it. |
Tasks | Anomaly Detection, Time Series |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03665v1 |
http://arxiv.org/pdf/1708.03665v1.pdf | |
PWC | https://paperswithcode.com/paper/time-series-anomaly-detection-detection-of |
Repo | |
Framework | |
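The second stage described in this abstract, comparing actual values with the model's predictions and flagging only sustained deviations rather than single points, can be sketched roughly as follows. The abstract does not give the actual rules, so the relative threshold `rel_threshold`, the minimum run length `min_run`, and the function name are all hypothetical.

```python
from typing import List, Sequence, Tuple


def sustained_drops(actual: Sequence[float],
                    predicted: Sequence[float],
                    rel_threshold: float = 0.5,
                    min_run: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of sustained drops.

    A point counts as a drop when the actual value is below
    predicted * (1 - rel_threshold). Only runs of at least min_run
    consecutive drop points are reported. Both parameters are
    illustrative assumptions, not values from the paper.
    """
    n = min(len(actual), len(predicted))
    runs: List[Tuple[int, int]] = []
    start = None
    for i in range(n):
        is_drop = actual[i] < predicted[i] * (1.0 - rel_threshold)
        if is_drop and start is None:
            start = i                      # a new run begins
        elif not is_drop and start is not None:
            if i - start >= min_run:       # keep only sustained runs
                runs.append((start, i))
            start = None
    if start is not None and n - start >= min_run:
        runs.append((start, n))            # run extends to the end
    return runs
```

With a constant prediction of 10 and the defaults above, a single low point is ignored while four consecutive low points are reported as one run.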
Bitwise Operations of Cellular Automaton on Gray-scale Images
Title | Bitwise Operations of Cellular Automaton on Gray-scale Images |
Authors | Karttikeya Mangalam, K S Venkatesh |
Abstract | Cellular Automata (CA) theory is a discrete model that represents the state of each of its cells from a finite set of possible values which evolve in time according to a pre-defined set of transition rules. CA have been applied to a number of image processing tasks such as Convex Hull Detection, Image Denoising, etc., but mostly under the limitation of restricting the input to binary images. In general, a gray-scale image may be converted to a number of different binary images which are finally recombined after CA operations on each of them individually. We have developed a multinomial regression based weighted summation method to recombine binary images for better performance of CA based Image Processing algorithms. The recombination algorithm is tested for the specific case of denoising Salt and Pepper Noise to test against standard benchmark algorithms such as the Median Filter for various images and noise levels. The results indicate several interesting invariances in the application of the CA, such as the particular noise realization and the choice of sub-sampling of pixels to determine recombination weights. Additionally, it appears that simpler algorithms for weight optimization which seek local minima work as effectively as those that seek global minima such as Simulated Annealing. |
Tasks | Denoising, Image Denoising |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07080v1 |
http://arxiv.org/pdf/1705.07080v1.pdf | |
PWC | https://paperswithcode.com/paper/bitwise-operations-of-cellular-automaton-on |
Repo | |
Framework | |
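The decompose-then-recombine idea in this abstract can be illustrated with bit-plane slicing, which is one possible way to turn a gray-scale image into binary images (assumed here from the word "bitwise" in the title; the abstract does not name the conversion). In the sketch below the CA step on each binary plane is omitted, and the recombination weights default to powers of two, which reproduces the input exactly. The paper instead learns the weights by multinomial regression; the function names are hypothetical.

```python
import numpy as np


def to_bit_planes(img: np.ndarray) -> list:
    """Split a uint8 image into 8 binary images; plane k holds bit k."""
    return [((img >> k) & 1).astype(np.uint8) for k in range(8)]


def recombine(planes: list, weights=None) -> np.ndarray:
    """Weighted summation of binary planes back into a gray-scale image.

    With the default weights 2**k this inverts to_bit_planes. Other
    weights (for example, learned ones) change how much each plane
    contributes to the output.
    """
    if weights is None:
        weights = [2.0 ** k for k in range(len(planes))]
    out = np.zeros(planes[0].shape, dtype=np.float64)
    for w, p in zip(weights, planes):
        out += w * p
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

A binary-image CA rule would be applied to each element of `to_bit_planes(img)` before calling `recombine`.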
Filmy Cloud Removal on Satellite Imagery with Multispectral Conditional Generative Adversarial Nets
Title | Filmy Cloud Removal on Satellite Imagery with Multispectral Conditional Generative Adversarial Nets |
Authors | Kenji Enomoto, Ken Sakurada, Weimin Wang, Hiroshi Fukui, Masashi Matsuoka, Ryosuke Nakamura, Nobuo Kawaguchi |
Abstract | In this paper, we propose a method for cloud removal from visible light RGB satellite images by extending the conditional Generative Adversarial Networks (cGANs) from RGB images to multispectral images. Satellite images have been widely utilized for various purposes, such as natural environment monitoring (pollution, forest or rivers), transportation improvement and prompt emergency response to disasters. However, the obscurity caused by clouds makes monitoring the situation on the ground with a visible light camera unreliable. Images captured at longer wavelengths are introduced to reduce the effects of clouds. Synthetic Aperture Radar (SAR) is one such example that improves visibility even when clouds exist. On the other hand, the spatial resolution decreases as the wavelength increases. Furthermore, images captured at long wavelengths differ considerably in appearance from those captured by visible light. Therefore, we propose a network that can remove clouds and generate visible light images from the multispectral images taken as inputs. This is achieved by extending the input channels of cGANs to be compatible with multispectral images. The networks are trained to output images that are close to the ground truth using the images synthesized with clouds over the ground truth as inputs. In the available dataset, the proportion of images of the forest or the sea is very high, which will introduce bias in the training dataset if uniformly sampled from the original dataset. Thus, we utilize the t-Distributed Stochastic Neighbor Embedding (t-SNE) to mitigate the problem of bias in the training dataset. Finally, we confirm the feasibility of the proposed network on a dataset of four-band images, which include three visible light bands and one near-infrared (NIR) band. |
Tasks | |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.04835v1 |
http://arxiv.org/pdf/1710.04835v1.pdf | |
PWC | https://paperswithcode.com/paper/filmy-cloud-removal-on-satellite-imagery-with |
Repo | |
Framework | |
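The abstract describes two data-preparation steps: synthesizing clouds over a ground-truth image, and widening the network input from three RGB channels to four (RGB plus NIR). A minimal NumPy sketch of both is given below. Alpha blending with a white cloud layer is an assumption made for illustration, since the abstract does not say how the clouds are synthesized, and the function names are hypothetical.

```python
import numpy as np


def synthesize_cloudy(rgb: np.ndarray, cloud_alpha: np.ndarray,
                      cloud_color: float = 1.0) -> np.ndarray:
    """Blend a cloud layer over an RGB image with values in [0, 1].

    rgb has shape (H, W, 3); cloud_alpha has shape (H, W) with 0 meaning
    clear sky and 1 meaning fully opaque cloud. Alpha blending is an
    illustrative assumption, not the paper's stated procedure.
    """
    alpha = cloud_alpha[..., None]          # broadcast over channels
    return (1.0 - alpha) * rgb + alpha * cloud_color


def make_four_band_input(cloudy_rgb: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Stack the cloudy RGB image and the NIR band into shape (H, W, 4)."""
    return np.concatenate([cloudy_rgb, nir[..., None]], axis=-1)
```

The generator would then take the four-channel array as input and be trained to reproduce the cloud-free RGB image.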
Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach
Title | Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach |
Authors | Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz |
Abstract | We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data. We consider a setting with multiple classification problems where the target classes may be tied together through logical constraints. For example, a set of classes may be mutually exclusive, meaning that a data instance can belong to at most one of them. The proposed method is based on the intuition that: (i) when classifiers agree, they are more likely to be correct, and (ii) when the classifiers make a prediction that violates the constraints, at least one classifier must be making an error. Experiments on four real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions in both estimating accuracies, and combining multiple classifier outputs. The results emphasize the utility of logical constraints in estimating accuracy, thus validating our intuition. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07086v1 |
http://arxiv.org/pdf/1705.07086v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-accuracy-from-unlabeled-data-a |
Repo | |
Framework | |
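The two intuitions in this abstract rest on quantities that can be computed without any labels: how often classifiers agree, and how often their joint predictions violate a logical constraint such as mutual exclusion. The sketch below computes only these raw signals. It is not the probabilistic logic model from the paper, and the function names are hypothetical.

```python
import numpy as np


def agreement_rate(preds_a, preds_b) -> float:
    """Fraction of instances on which two binary classifiers agree."""
    a = np.asarray(preds_a, dtype=bool)
    b = np.asarray(preds_b, dtype=bool)
    return float(np.mean(a == b))


def mutual_exclusion_violations(class_preds) -> np.ndarray:
    """Flag instances predicted positive for more than one class.

    class_preds has shape (n_instances, n_classes), where the classes
    are assumed to be mutually exclusive. A flagged instance implies
    that at least one of the classifiers involved made an error.
    """
    preds = np.asarray(class_preds, dtype=bool)
    return preds.sum(axis=1) > 1
```

Both functions need only predictions on unlabeled data, which is the setting the abstract describes.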
Stacked transfer learning for tropical cyclone intensity prediction
Title | Stacked transfer learning for tropical cyclone intensity prediction |
Authors | Ratneel Vikash Deo, Rohitash Chandra, Anuraganand Sharma |
Abstract | Tropical cyclone wind-intensity prediction is a challenging task considering drastic changes in climate patterns over the last few decades. In order to develop robust prediction models, one needs to consider the different spatial and temporal characteristics of cyclones. Transfer learning incorporates knowledge from a related source dataset to complement a target dataset, especially in cases where there is a lack of data. Stacking is a form of ensemble learning focused on improving generalization that has recently been used for transfer learning problems, where it is referred to as transfer stacking. In this paper, we employ transfer stacking as a means of studying the effects of cyclones, whereby we evaluate whether cyclones in different geographic locations can be helpful for improving generalization performance. Moreover, we use conventional neural networks for evaluating the effects of cyclone duration on prediction performance. Therefore, we develop an effective strategy that evaluates the relationships between different types of cyclones through transfer learning and conventional learning methods via neural networks. |
Tasks | Transfer Learning |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06539v1 |
http://arxiv.org/pdf/1708.06539v1.pdf | |
PWC | https://paperswithcode.com/paper/stacked-transfer-learning-for-tropical |
Repo | |
Framework | |
Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions
Title | Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions |
Authors | Hossein Soleimani, Adarsh Subbaswamy, Suchi Saria |
Abstract | Treatment effects can be estimated from observational data as the difference in potential outcomes. In this paper, we address the challenge of estimating the potential outcome when treatment-dose levels can vary continuously over time. Further, the outcome variable may not be measured at a regular frequency. Our proposed solution represents the treatment response curves using linear time-invariant dynamical systems—this provides a flexible means for modeling response over time to highly variable dose curves. Moreover, for multivariate data, the proposed method: uncovers shared structure in treatment response and the baseline across multiple markers; and, flexibly models challenging correlation structure both across and within signals over time. For this, we build upon the framework of multiple-output Gaussian Processes. On simulated and a challenging clinical dataset, we show significant gains in accuracy over state-of-the-art models. |
Tasks | Gaussian Processes |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.02038v2 |
http://arxiv.org/pdf/1704.02038v2.pdf | |
PWC | https://paperswithcode.com/paper/treatment-response-models-for-counterfactual |
Repo | |
Framework | |
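To make the idea of a linear time-invariant (LTI) response to a dose curve concrete, the sketch below simulates a first-order discrete-time system driven by a dose sequence. This is a deliberately simplified stand-in: the paper works in continuous time and embeds the response in a multiple-output Gaussian Process model, none of which is reproduced here. The parameters `a` and `b` and the function name are hypothetical.

```python
import numpy as np


def lti_response(dose, a: float = 0.8, b: float = 0.5) -> np.ndarray:
    """Simulate x[t+1] = a * x[t] + b * dose[t], starting from x[0] = 0.

    Returns the states x[1], ..., x[T]. Here a controls how quickly the
    effect decays and b scales the immediate effect of a dose; both
    values are illustrative assumptions.
    """
    x = 0.0
    out = []
    for u in dose:
        x = a * x + b * u
        out.append(x)
    return np.array(out)
```

Because the system is linear, doubling every dose doubles the response, and because it is time-invariant, delaying the dose curve delays the response by the same amount.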
Can clone detection support quality assessments of requirements specifications?
Title | Can clone detection support quality assessments of requirements specifications? |
Authors | Elmar Juergens, Florian Deissenboeck, Martin Feilkas, Benjamin Hummel, Bernhard Schaetz, Stefan Wagner, Christoph Domann, Jonathan Streit |
Abstract | Due to their pivotal role in software engineering, considerable effort is spent on the quality assurance of software requirements specifications. As they are mainly described in natural language, relatively few means of automated quality assessment exist. However, we found that clone detection, a technique widely applied to source code, is promising to assess one important quality aspect in an automated way, namely redundancy that stems from copy-and-paste operations. This paper describes a large-scale case study that applied clone detection to 28 requirements specifications with a total of 8,667 pages. We report on the amount of redundancy found in real-world specifications, discuss its nature as well as its consequences, and evaluate to what extent existing code clone detection approaches can be applied to assess the quality of requirements specifications in practice. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05472v1 |
http://arxiv.org/pdf/1711.05472v1.pdf | |
PWC | https://paperswithcode.com/paper/can-clone-detection-support-quality |
Repo | |
Framework | |
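A rough idea of what clone detection on natural-language text involves is given by the naive sketch below, which reports every word sequence of length `n` that occurs more than once. It is not the tool or algorithm used in the study. The whitespace tokenization, the window length `n`, and the function name are simplifying assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_clones(text: str, n: int = 5) -> Dict[Tuple[str, ...], List[int]]:
    """Map each repeated n-word sequence to the word offsets where it starts.

    Tokenization is plain lowercase whitespace splitting. Sequences that
    occur only once are dropped.
    """
    words = text.lower().split()
    positions: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    for i in range(len(words) - n + 1):
        positions[tuple(words[i:i + n])].append(i)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}
```

A practical detector would additionally merge overlapping matches into maximal clones and normalize the text (for example, punctuation and stop words) before matching.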
Fine-Grained Car Detection for Visual Census Estimation
Title | Fine-Grained Car Detection for Visual Census Estimation |
Authors | Timnit Gebru, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Li Fei-Fei |
Abstract | Targeted socioeconomic policies require an accurate understanding of a country’s demographic makeup. To that end, the United States spends more than 1 billion dollars a year gathering census data such as race, gender, education, occupation and unemployment rates. Compared to the traditional method of collecting surveys across many years which is costly and labor intensive, data-driven, machine learning driven approaches are cheaper and faster–with the potential ability to detect trends in close to real time. In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emission, crime rates and other city attributes from a single source of publicly available visual data. We first detect cars in 50 million images across 200 of the largest US cities and train a model to predict demographic attributes using the detected cars. To facilitate our work, we have collected the largest and most challenging fine-grained dataset reported to date consisting of over 2600 classes of cars comprised of images from Google Street View and other web sources, classified by car experts to account for even the most subtle of visual differences. We use this data to construct the largest scale fine-grained detection system reported to date. Our prediction results correlate well with ground truth income data (r=0.82), Massachusetts department of vehicle registration, and sources investigating crime rates, income segregation, per capita carbon emission, and other market research. Finally, we learn interesting relationships between cars and neighborhoods allowing us to perform the first large scale sociological analysis of cities using computer vision techniques. |
Tasks | |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02480v1 |
http://arxiv.org/pdf/1709.02480v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-car-detection-for-visual-census |
Repo | |
Framework | |
Label Distribution Learning Forests
Title | Label Distribution Learning Forests |
Authors | Wei Shen, Kai Zhao, Yilu Guo, Alan Yuille |
Abstract | Label distribution learning (LDL) is a general learning framework, which assigns to an instance a distribution over a set of labels rather than a single label or multiple labels. Current LDL methods have either restricted assumptions on the expression form of the label distribution or limitations in representation learning, e.g., to learn deep features in an end-to-end manner. This paper presents label distribution learning forests (LDLFs) - a novel label distribution learning algorithm based on differentiable decision trees, which have several advantages: 1) Decision trees have the potential to model any general form of label distributions by a mixture of leaf node predictions. 2) The learning of differentiable decision trees can be combined with representation learning. We define a distribution-based loss function for a forest, enabling all the trees to be learned jointly, and show that an update function for leaf node predictions, which guarantees a strict decrease of the loss function, can be derived by variational bounding. The effectiveness of the proposed LDLFs is verified on several LDL tasks and a computer vision application, showing significant improvements to the state-of-the-art LDL methods. |
Tasks | Representation Learning |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.06086v4 |
http://arxiv.org/pdf/1702.06086v4.pdf | |
PWC | https://paperswithcode.com/paper/label-distribution-learning-forests |
Repo | |
Framework | |
Identification of individual coherent sets associated with flow trajectories using Coherent Structure Coloring
Title | Identification of individual coherent sets associated with flow trajectories using Coherent Structure Coloring |
Authors | Kristy L. Schlueter-Kuck, John O. Dabiri |
Abstract | We present a method for identifying the coherent structures associated with individual Lagrangian flow trajectories even where only sparse particle trajectory data is available. The method, based on techniques in spectral graph theory, uses the Coherent Structure Coloring vector and associated eigenvectors to analyze the distance in higher-dimensional eigenspace between a selected reference trajectory and other tracer trajectories in the flow. By analyzing this distance metric in a hierarchical clustering, the coherent structure of which the reference particle is a member can be identified. This algorithm is proven successful in identifying coherent structures of varying complexities in canonical unsteady flows. Additionally, the method is able to assess the relative coherence of the associated structure in comparison to the surrounding flow. Although the method is demonstrated here in the context of fluid flow kinematics, the generality of the approach allows for its potential application to other unsupervised clustering problems in dynamical systems such as neuronal activity, gene expression, or social networks. |
Tasks | |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05757v1 |
http://arxiv.org/pdf/1708.05757v1.pdf | |
PWC | https://paperswithcode.com/paper/identification-of-individual-coherent-sets |
Repo | |
Framework | |
Spectral Methods for Nonparametric Models
Title | Spectral Methods for Nonparametric Models |
Authors | Hsiao-Yu Fish Tung, Chao-Yuan Wu, Manzil Zaheer, Alexander J. Smola |
Abstract | Nonparametric models are a versatile, albeit computationally expensive, tool for modeling mixture models. In this paper, we introduce spectral methods for the two most popular nonparametric models: the Indian Buffet Process (IBP) and the Hierarchical Dirichlet Process (HDP). We show that using spectral methods for the inference of nonparametric models is computationally and statistically efficient. In particular, we derive the lower-order moments of the IBP and the HDP, propose spectral algorithms for both models, and provide reconstruction guarantees for the algorithms. For the HDP, we further show that applying hierarchical models to datasets with hierarchical structure, which can be solved with the generalized spectral HDP, produces better solutions than flat models in terms of likelihood performance. |
Tasks | |
Published | 2017-03-31 |
URL | http://arxiv.org/abs/1704.00003v1 |
http://arxiv.org/pdf/1704.00003v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-methods-for-nonparametric-models |
Repo | |
Framework | |
On the incorporation of interval-valued fuzzy sets into the Bousi-Prolog system: declarative semantics, implementation and applications
Title | On the incorporation of interval-valued fuzzy sets into the Bousi-Prolog system: declarative semantics, implementation and applications |
Authors | Clemente Rubio-Manzano, Martin Pereira-Fariña |
Abstract | In this paper we analyse the benefits of incorporating interval-valued fuzzy sets into the Bousi-Prolog system. A syntax, declarative semantics and implementation for this extension are presented and formalised. We show, by using potential applications, that fuzzy logic programming frameworks enhanced with them can correctly work together with lexical resources and ontologies in order to improve their capabilities for knowledge representation and reasoning. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03147v1 |
http://arxiv.org/pdf/1711.03147v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-incorporation-of-interval-valued-fuzzy |
Repo | |
Framework | |
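In an interval-valued fuzzy set, a membership degree is a sub-interval of [0, 1] rather than a single number, which lets the degree itself carry uncertainty. The sketch below illustrates the concept in Python, not in Bousi-Prolog syntax, which the abstract does not show. Combining degrees with min and max on the interval bounds is one common choice and is an assumption here, as are the class and method names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalDegree:
    """A membership degree given as an interval [lower, upper] in [0, 1]."""
    lower: float
    upper: float

    def __post_init__(self) -> None:
        if not (0.0 <= self.lower <= self.upper <= 1.0):
            raise ValueError("expected 0 <= lower <= upper <= 1")

    def and_(self, other: "IntervalDegree") -> "IntervalDegree":
        # Conjunction: minimum applied to each bound (assumed operator).
        return IntervalDegree(min(self.lower, other.lower),
                              min(self.upper, other.upper))

    def or_(self, other: "IntervalDegree") -> "IntervalDegree":
        # Disjunction: maximum applied to each bound (assumed operator).
        return IntervalDegree(max(self.lower, other.lower),
                              max(self.upper, other.upper))
```

An ordinary fuzzy degree is the special case where `lower == upper`.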
Image Labeling Based on Graphical Models Using Wasserstein Messages and Geometric Assignment
Title | Image Labeling Based on Graphical Models Using Wasserstein Messages and Geometric Assignment |
Authors | Ruben Hühnerbein, Fabrizio Savarino, Freddie Åström, Christoph Schnörr |
Abstract | We introduce a novel approach to Maximum A Posteriori inference based on discrete graphical models. By utilizing local Wasserstein distances for coupling assignment measures across edges of the underlying graph, a given discrete objective function is smoothly approximated and restricted to the assignment manifold. A corresponding multiplicative update scheme combines in a single process (i) geometric integration of the resulting Riemannian gradient flow and (ii) rounding to integral solutions that represent valid labelings. Throughout this process, local marginalization constraints known from the established LP relaxation are satisfied, whereas the smooth geometric setting results in rapidly converging iterations that can be carried out in parallel for every edge. |
Tasks | |
Published | 2017-10-04 |
URL | http://arxiv.org/abs/1710.01493v2 |
http://arxiv.org/pdf/1710.01493v2.pdf | |
PWC | https://paperswithcode.com/paper/image-labeling-based-on-graphical-models |
Repo | |
Framework | |
Revealing structure components of the retina by deep learning networks
Title | Revealing structure components of the retina by deep learning networks |
Authors | Qi Yan, Zhaofei Yu, Feng Chen, Jian K. Liu |
Abstract | Deep convolutional neural networks (CNNs) have demonstrated impressive performance on visual object classification tasks. In addition, they are useful models for the prediction of neuronal responses recorded in the visual system. However, there is still no clear understanding of what CNNs learn in terms of visual neuronal circuits. Visualizing CNN features to obtain possible connections to neuroscience underpinnings is not easy due to the highly complex circuits from the retina to the higher visual cortex. Here we address this issue by focusing on single retinal ganglion cells with a simple model and electrophysiological recordings from salamanders. By training CNNs with white noise images to predict neural responses, we found that the convolutional filters learned in the end resemble biological components of the retinal circuit. Features represented by these filters tile the space of the conventional receptive field of retinal ganglion cells. These results suggest that CNNs could be used to reveal structure components of neuronal circuits. |
Tasks | Object Classification |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.02837v1 |
http://arxiv.org/pdf/1711.02837v1.pdf | |
PWC | https://paperswithcode.com/paper/revealing-structure-components-of-the-retina |
Repo | |
Framework | |