Paper Group ANR 793
Scalable k-Means Clustering via Lightweight Coresets. Learning best K analogies from data distribution for case-based software effort estimation. SenseNet: 3D Objects Database and Tactile Simulator. Time-Dependent Representation for Neural Event Sequence Prediction. On the Consistency of Quick Shift. Lesion detection and Grading of Diabetic Retinop …
Scalable k-Means Clustering via Lightweight Coresets
Title | Scalable k-Means Clustering via Lightweight Coresets |
Authors | Olivier Bachem, Mario Lucic, Andreas Krause |
Abstract | Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive data sets. While existing approaches generally only allow for multiplicative approximation errors, we propose a novel notion of lightweight coresets that allows for both multiplicative and additive errors. We provide a single algorithm to construct lightweight coresets for k-means clustering as well as soft and hard Bregman clustering. The algorithm is substantially faster than existing constructions, embarrassingly parallel, and the resulting coresets are smaller. We further show that the proposed approach naturally generalizes to statistical k-means clustering and that, compared to existing results, it can be used to compute smaller summaries for empirical risk minimization. In extensive experiments, we demonstrate that the proposed algorithm outperforms existing data summarization strategies in practice. |
Tasks | Data Summarization |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08248v2 |
http://arxiv.org/pdf/1702.08248v2.pdf | |
PWC | https://paperswithcode.com/paper/scalable-k-means-clustering-via-lightweight |
Repo | |
Framework | |
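The abstract does not spell out the construction, so the following is only a minimal sketch. It assumes the sampling scheme usually described for lightweight coresets: each point is drawn with probability that is half uniform and half proportional to its squared distance from the data mean, and is then weighted by the inverse of its expected sample count. The function name `lightweight_coreset` and its parameters are hypothetical, not taken from the paper's code.

```python
import random


def lightweight_coreset(points, m, seed=0):
    """Sample m weighted points from `points` (a list of equal-length tuples).

    Assumed scheme (not stated in the abstract above):
        q(x) = 1/2 * 1/n + 1/2 * d(x, mean)^2 / sum_x' d(x', mean)^2
    Each sampled point gets weight 1 / (m * q(x)).
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Mean of the full data set (one pass over the data).
    mean = [sum(p[j] for p in points) / n for j in range(dim)]

    # Squared Euclidean distance of every point to the mean (second pass).
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)

    # Mixture of uniform and distance-proportional sampling probabilities.
    if total > 0:
        q = [0.5 / n + 0.5 * d / total for d in sq_dist]
    else:
        # All points coincide with the mean: fall back to uniform sampling.
        q = [1.0 / n] * n

    # Sample with replacement according to q, then attach importance weights.
    idx = rng.choices(range(n), weights=q, k=m)
    samples = [points[i] for i in idx]
    weights = [1.0 / (m * q[i]) for i in idx]
    return samples, weights


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    coreset, w = lightweight_coreset(data, m=3)
    print(coreset, w)
```

A weighted k-means solver would then be run on `coreset` with weights `w` instead of on the full data. Because the uniform half keeps every probability at or above 1/(2n), no weight can exceed 2n/m.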
Learning best K analogies from data distribution for case-based software effort estimation
Title | Learning best K analogies from data distribution for case-based software effort estimation |
Authors | Mohammad Azzeh, Yousef Elsheikh |
Abstract | Case-Based Reasoning (CBR) has been widely used to generate good software effort estimates. The predictive performance of CBR is dataset dependent and subject to an extremely large space of configuration possibilities. Regardless of the type of adaptation technique, deciding on the optimal number of similar cases to be used before applying CBR is a key challenge. In this paper we propose a new technique based on the Bisecting k-medoids clustering algorithm to better understand the structure of a dataset and discover the optimal cases for each individual project by excluding irrelevant cases. Results obtained showed that understanding the data characteristics prior to the prediction stage can help in automatically finding the best number of cases for each test project. Performance figures of the proposed estimation method are better than those of other regular K-based CBR methods. |
Tasks | |
Published | 2017-03-11 |
URL | http://arxiv.org/abs/1703.04567v1 |
http://arxiv.org/pdf/1703.04567v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-best-k-analogies-from-data |
Repo | |
Framework | |
SenseNet: 3D Objects Database and Tactile Simulator
Title | SenseNet: 3D Objects Database and Tactile Simulator |
Authors | Jason Toy |
Abstract | The majority of artificial intelligence research, as it relates to biological senses, has been focused on vision. The recent explosion of machine learning, and in particular deep learning, can be partially attributed to the release of high quality data sets for algorithms from which to model the world. Thus, most of these datasets are comprised of images. We believe that focusing on sensorimotor systems and tactile feedback will create algorithms that better mimic human intelligence. Here we present SenseNet: a collection of tactile simulators and a large scale dataset of 3D objects for manipulation. SenseNet was created for the purpose of researching and training Artificial Intelligences (AIs) to interact with the environment via sensorimotor neural systems and tactile feedback. We aim to accelerate that same explosion in image processing, but for the domain of tactile feedback and sensorimotor research. We hope that SenseNet can offer researchers in both the machine learning and computational neuroscience communities brand new opportunities and avenues to explore. |
Tasks | |
Published | 2017-12-31 |
URL | http://arxiv.org/abs/1801.00361v1 |
http://arxiv.org/pdf/1801.00361v1.pdf | |
PWC | https://paperswithcode.com/paper/sensenet-3d-objects-database-and-tactile |
Repo | |
Framework | |
Time-Dependent Representation for Neural Event Sequence Prediction
Title | Time-Dependent Representation for Neural Event Sequence Prediction |
Authors | Yang Li, Nan Du, Samy Bengio |
Abstract | Existing sequence prediction methods are mostly concerned with time-independent sequences, in which the actual time span between events is irrelevant and the distance between events is simply the difference between their order positions in the sequence. While this time-independent view of sequences is applicable for data such as natural languages, e.g., dealing with words in a sentence, it is inappropriate and inefficient for many real world events that are observed and collected at unequally spaced points of time as they naturally arise, e.g., when a person goes to a grocery store or makes a phone call. The time span between events can carry important information about the sequence dependence of human behaviors. In this work, we propose a set of methods for using time in sequence prediction. Because neural sequence models such as RNN are more amenable for handling token-like input, we propose two methods for time-dependent event representation, based on the intuition on how time is tokenized in everyday life and previous work on embedding contextualization. We also introduce two methods for using next event duration as regularization for training a sequence prediction model. We discuss these methods based on recurrent neural nets. We evaluate these methods as well as baseline models on five datasets that resemble a variety of sequence prediction tasks. The experiments revealed that the proposed methods offer accuracy gain over baseline models in a range of settings. |
Tasks | |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00065v4 |
http://arxiv.org/pdf/1708.00065v4.pdf | |
PWC | https://paperswithcode.com/paper/time-dependent-representation-for-neural |
Repo | |
Framework | |
On the Consistency of Quick Shift
Title | On the Consistency of Quick Shift |
Authors | Heinrich Jiang |
Abstract | Quick Shift is a popular mode-seeking and clustering algorithm. We present finite sample statistical consistency guarantees for Quick Shift on mode and cluster recovery under mild distributional assumptions. We then apply our results to construct a consistent modal regression algorithm. |
Tasks | |
Published | 2017-10-29 |
URL | http://arxiv.org/abs/1710.10646v2 |
http://arxiv.org/pdf/1710.10646v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-consistency-of-quick-shift |
Repo | |
Framework | |
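For readers unfamiliar with the algorithm being analysed, here is a minimal sketch of Quick Shift as it is commonly described: estimate a density at every sample, link each sample to its nearest neighbour of strictly higher density within a distance `tau`, and treat the samples left without a link as modes. The Gaussian kernel, the parameter names `bandwidth` and `tau`, and the function name are assumptions made for illustration; the paper's analysis is not reproduced here.

```python
import math


def quick_shift(points, bandwidth, tau):
    """Return (parent, labels) for a list of equal-length tuples.

    parent[i] is the index that sample i links to (itself if it is a mode).
    labels[i] is the index of the mode reached by following the links.
    """
    n = len(points)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Unnormalised Gaussian kernel density estimate at every sample.
    density = [
        sum(math.exp(-dist(p, q) ** 2 / (2 * bandwidth ** 2)) for q in points)
        for p in points
    ]

    # Link each sample to its nearest neighbour with strictly higher density,
    # provided that neighbour lies within distance tau.
    parent = list(range(n))
    for i in range(n):
        best = None
        for j in range(n):
            if density[j] > density[i]:
                d = dist(points[i], points[j])
                if d <= tau and (best is None or d < best):
                    best = d
                    parent[i] = j

    # Links always point to higher density, so following them terminates.
    def root(i):
        while parent[i] != i:
            i = parent[i]
        return i

    labels = [root(i) for i in range(n)]
    return parent, labels


if __name__ == "__main__":
    pts = [(0.0,), (0.1,), (0.25,), (5.0,), (5.1,), (5.3,)]
    print(quick_shift(pts, bandwidth=0.5, tau=1.0)[1])
```

The threshold `tau` is what splits the link forest into clusters: without it every sample would eventually link to the single global density maximum.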
Lesion detection and Grading of Diabetic Retinopathy via Two-stages Deep Convolutional Neural Networks
Title | Lesion detection and Grading of Diabetic Retinopathy via Two-stages Deep Convolutional Neural Networks |
Authors | Yehui Yang, Tao Li, Wensi Li, Haishan Wu, Wei Fan, Wensheng Zhang |
Abstract | We propose an automatic diabetic retinopathy (DR) analysis algorithm based on two-stage deep convolutional neural networks (DCNN). Compared to existing DCNN-based DR detection methods, the proposed algorithm has the following advantages: (1) Our method can point out the location and type of lesions in the fundus images, as well as give the severity grades of DR. Moreover, since retina lesions and DR severity appear at different scales in fundus images, the integration of both local and global networks learns more complete and specific features for DR analysis. (2) By introducing an imbalanced weighting map, more attention is given to lesion patches for DR grading, which significantly improves the performance of the proposed algorithm. In this study, we label 12,206 lesion patches and re-annotate the DR grades of 23,595 fundus images from the Kaggle competition dataset. Under the guidance of clinical ophthalmologists, the experimental results show that our local lesion detection net achieves comparable performance with trained human observers, and the proposed imbalanced weighting scheme is also shown to significantly improve the capability of our DCNN-based DR grading algorithm. |
Tasks | |
Published | 2017-05-02 |
URL | http://arxiv.org/abs/1705.00771v1 |
http://arxiv.org/pdf/1705.00771v1.pdf | |
PWC | https://paperswithcode.com/paper/lesion-detection-and-grading-of-diabetic |
Repo | |
Framework | |
EndNet: Sparse AutoEncoder Network for Endmember Extraction and Hyperspectral Unmixing
Title | EndNet: Sparse AutoEncoder Network for Endmember Extraction and Hyperspectral Unmixing |
Authors | Savas Ozkan, Berk Kaya, Gozde Bozdagi Akar |
Abstract | Data acquired from multi-channel sensors is a highly valuable asset to interpret the environment for a variety of remote sensing applications. However, low spatial resolution is a critical limitation for previous sensors and the constituent materials of a scene can be mixed in different fractions due to their spatial interactions. Spectral unmixing is a technique that allows us to obtain the material spectral signatures and their fractions from hyperspectral data. In this paper, we propose a novel endmember extraction and hyperspectral unmixing scheme, so-called EndNet, that is based on a two-staged autoencoder network. This well-known structure is completely enhanced and restructured by introducing additional layers and a projection metric (i.e., spectral angle distance (SAD) instead of inner product) to achieve an optimum solution. Moreover, we present a novel loss function that is composed of a Kullback-Leibler divergence term with SAD similarity and additional penalty terms to improve the sparsity of the estimates. These modifications enable us to set the common properties of endmembers such as non-linearity and sparsity for autoencoder networks. Lastly, due to the stochastic-gradient based approach, the method is scalable for large-scale data and it can be accelerated on Graphical Processing Units (GPUs). To demonstrate the superiority of our proposed method, we conduct extensive experiments on several well-known datasets. The results confirm that the proposed method considerably improves the performance compared to the state-of-the-art techniques in literature. |
Tasks | Hyperspectral Unmixing |
Published | 2017-08-06 |
URL | http://arxiv.org/abs/1708.01894v4 |
http://arxiv.org/pdf/1708.01894v4.pdf | |
PWC | https://paperswithcode.com/paper/endnet-sparse-autoencoder-network-for |
Repo | |
Framework | |
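The abstract names spectral angle distance (SAD) as the projection metric used in place of the inner product. As a small illustration, the standard definition is the angle between two spectra, arccos(x·y / (‖x‖‖y‖)). The function below is a generic sketch of that definition with a hypothetical name; it is not taken from the EndNet implementation.

```python
import math


def spectral_angle_distance(x, y):
    """Angle in radians between two spectra given as equal-length sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("SAD is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)


if __name__ == "__main__":
    # Scaling a spectrum (e.g. brighter illumination) leaves the angle near zero.
    print(spectral_angle_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(spectral_angle_distance([1.0, 0.0], [0.0, 1.0]))
```

Unlike a raw inner product, the angle ignores the overall magnitude of each spectrum, which is one common reason for preferring it when comparing material signatures.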
A Multi-model Combination Approach for Probabilistic Wind Power Forecasting
Title | A Multi-model Combination Approach for Probabilistic Wind Power Forecasting |
Authors | You Lin, Ming Yang, Can Wan, Jianhui Wang, Yonghua Song |
Abstract | Short-term probabilistic wind power forecasting can provide critical quantified uncertainty information of wind generation for power system operation and control. Owing to the complicated characteristics of wind power prediction error, it would be difficult to develop a universal forecasting model that dominates all alternative models. Therefore, a novel multi-model combination (MMC) approach for short-term probabilistic wind generation forecasting is proposed in this paper to exploit the advantages of different forecasting models. The proposed approach can combine different forecasting models that provide different kinds of probability density functions to improve the probabilistic forecast accuracy. Three probabilistic forecasting models based on sparse Bayesian learning, kernel density estimation and beta distribution fitting are used to form the combined model. The parameters of the MMC model are solved within a Bayesian framework. Numerical tests illustrate the effectiveness of the proposed MMC approach. |
Tasks | Density Estimation |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.03613v1 |
http://arxiv.org/pdf/1702.03613v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-model-combination-approach-for |
Repo | |
Framework | |
Hyperspectral Unmixing with Endmember Variability using Semi-supervised Partial Membership Latent Dirichlet Allocation
Title | Hyperspectral Unmixing with Endmember Variability using Semi-supervised Partial Membership Latent Dirichlet Allocation |
Authors | Sheng Zou, Hao Sun, Alina Zare |
Abstract | A semi-supervised Partial Membership Latent Dirichlet Allocation approach is developed for hyperspectral unmixing and endmember estimation while accounting for spectral variability and spatial information. Partial Membership Latent Dirichlet Allocation is an effective approach for spectral unmixing while representing spectral variability and leveraging spatial information. In this work, we extend Partial Membership Latent Dirichlet Allocation to incorporate any available (imprecise) label information to help guide unmixing. Experimental results on two hyperspectral datasets show that the proposed semi-supervised PM-LDA can yield improved hyperspectral unmixing and endmember estimation results. |
Tasks | Hyperspectral Unmixing |
Published | 2017-03-17 |
URL | http://arxiv.org/abs/1703.06151v1 |
http://arxiv.org/pdf/1703.06151v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperspectral-unmixing-with-endmember |
Repo | |
Framework | |
Disentangling Motion, Foreground and Background Features in Videos
Title | Disentangling Motion, Foreground and Background Features in Videos |
Authors | Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer |
Abstract | This paper introduces an unsupervised framework to extract semantically rich features for video representation. Inspired by how the human visual system groups objects based on motion cues, we propose a deep convolutional neural network that disentangles motion, foreground and background information. The proposed architecture consists of a 3D convolutional feature encoder for blocks of 16 frames, which is trained for reconstruction tasks over the first and last frames of the sequence. A preliminary supervised experiment was conducted to verify the feasibility of the proposed method by training the model with a fraction of videos from the UCF-101 dataset, taking as ground truth the bounding boxes around the activity regions. Qualitative results indicate that the network can successfully segment foreground and background in videos as well as update the foreground appearance based on disentangled motion features. The benefits of these learned features are shown in a discriminative classification task, where initializing the network with the proposed pretraining method outperforms both random initialization and autoencoder pretraining. Our model and source code are publicly available at https://imatge-upc.github.io/unsupervised-2017-cvprw/ . |
Tasks | |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04092v2 |
http://arxiv.org/pdf/1707.04092v2.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-motion-foreground-and |
Repo | |
Framework | |
Three IQs of AI Systems and their Testing Methods
Title | Three IQs of AI Systems and their Testing Methods |
Authors | Feng Liu, Yong Shi, Ying Liu |
Abstract | The rapid development of artificial intelligence has brought the artificial intelligence threat theory as well as the problem about how to evaluate the intelligence level of intelligent products. Both need to find a quantitative method to evaluate the intelligence level of intelligence systems, including human intelligence. Based on the standard intelligence system and the extended Von Neumann architecture, this paper proposes General IQ, Service IQ and Value IQ evaluation methods for intelligence systems, depending on different evaluation purposes. Among them, the General IQ of intelligence systems is to answer the question of whether the artificial intelligence can surpass the human intelligence, which is reflected in putting the intelligence systems on an equal status and conducting the unified evaluation. The Service IQ and Value IQ of intelligence systems are used to answer the question of how the intelligent products can better serve the human, reflecting the intelligence and required cost of each intelligence system as a product in the process of serving human. |
Tasks | |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.06440v1 |
http://arxiv.org/pdf/1712.06440v1.pdf | |
PWC | https://paperswithcode.com/paper/three-iqs-of-ai-systems-and-their-testing |
Repo | |
Framework | |
Ocasta: Clustering Configuration Settings For Error Recovery
Title | Ocasta: Clustering Configuration Settings For Error Recovery |
Authors | Zhen Huang, David Lie |
Abstract | Effective machine-aided diagnosis and repair of configuration errors continues to elude computer systems designers. Most of the literature targets errors that can be attributed to a single erroneous configuration setting. However, a recent study found that a significant number of configuration errors require fixing more than one setting together. To address this limitation, Ocasta statistically clusters dependent configuration settings based on the application’s accesses to its configuration settings and utilizes the extracted clustering of configuration settings to fix configuration errors involving more than one configuration setting. Ocasta treats applications as black boxes and only relies on the ability to observe application accesses to their configuration settings. We collected traces of real application usage from 24 Linux and 5 Windows desktop computers and found that Ocasta is able to correctly identify clusters with 88.6% accuracy. To demonstrate the effectiveness of Ocasta, we evaluated it on 16 real-world configuration errors of 11 Linux and Windows applications. Ocasta is able to successfully repair all evaluated configuration errors in 11 minutes on average and only requires the user to examine an average of 3 screenshots of the output of the application to confirm that the error is repaired. A user study we conducted shows that Ocasta is easy to use by both expert and non-expert users and is more efficient than manual configuration error troubleshooting. |
Tasks | |
Published | 2017-11-02 |
URL | http://arxiv.org/abs/1711.04030v1 |
http://arxiv.org/pdf/1711.04030v1.pdf | |
PWC | https://paperswithcode.com/paper/ocasta-clustering-configuration-settings-for |
Repo | |
Framework | |
Music Signal Processing Using Vector Product Neural Networks
Title | Music Signal Processing Using Vector Product Neural Networks |
Authors | Z. C. Fan, T. S. Chan, Y. H. Yang, J. S. R. Jang |
Abstract | We propose a novel neural network model for music signal processing using vector product neurons and dimensionality transformations. Here, the inputs are first mapped from real values into three-dimensional vectors then fed into a three-dimensional vector product neural network where the inputs, outputs, and weights are all three-dimensional values. Next, the final outputs are mapped back to the reals. Two methods for dimensionality transformation are proposed, one via context windows and the other via spectral coloring. Experimental results on the iKala dataset for blind singing voice separation confirm the efficacy of our model. |
Tasks | |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09555v1 |
http://arxiv.org/pdf/1706.09555v1.pdf | |
PWC | https://paperswithcode.com/paper/music-signal-processing-using-vector-product |
Repo | |
Framework | |
Hierarchical Subtask Discovery With Non-Negative Matrix Factorization
Title | Hierarchical Subtask Discovery With Non-Negative Matrix Factorization |
Authors | Adam C. Earle, Andrew M. Saxe, Benjamin Rosman |
Abstract | Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00463v1 |
http://arxiv.org/pdf/1708.00463v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-subtask-discovery-with-non |
Repo | |
Framework | |
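The abstract poses subtask discovery as a low-rank non-negative factorization of the set of tasks. As a generic illustration of that building block only, the sketch below factorizes a non-negative matrix V into W·H with the classic multiplicative update rules for the squared-error objective. Which matrix the paper actually factorizes and which NMF solver it uses are not stated in the abstract, so the matrix `V`, the function names, and the choice of update rule are all assumptions here.

```python
import random


def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialisation keeps all later updates non-negative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]

    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]

        # W <- W * (V H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H


def squared_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


if __name__ == "__main__":
    V = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 3.0]]
    W, H = nmf(V, rank=2, iters=500)
    print(squared_error(V, W, H))
```

In this reading, the rows of `H` play the role of a small basis and the rows of `W` say how strongly each original row draws on each basis element; the non-negativity constraint is what tends to make such bases interpretable as parts.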
On the Performance of Network Parallel Training in Artificial Neural Networks
Title | On the Performance of Network Parallel Training in Artificial Neural Networks |
Authors | Ludvig Ericson, Rendani Mbuvha |
Abstract | Artificial Neural Networks (ANNs) have received increasing attention in recent years with applications that span a wide range of disciplines including vital domains such as medicine, network security and autonomous transportation. However, neural network architectures are becoming increasingly complex and with an increasing need to obtain real-time results from such models, it has become pivotal to use parallelization as a mechanism for speeding up network training and deployment. In this work we propose an implementation of Network Parallel Training through Cannon’s Algorithm for matrix multiplication. We show that increasing the number of processes speeds up training until the point where process communication costs become prohibitive; this point varies by network complexity. We also show through empirical efficiency calculations that the speedup obtained is superlinear. |
Tasks | |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05130v1 |
http://arxiv.org/pdf/1701.05130v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-performance-of-network-parallel |
Repo | |
Framework | |
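The abstract builds on Cannon's algorithm for distributed matrix multiplication. To make the communication pattern concrete, here is a single-process simulation on a p × p grid in which each simulated "process" holds one element of A and one of B, i.e. a block size of 1, chosen for brevity. After an initial skew, every step multiplies the local pair and then shifts A one position left and B one position up. This is a sketch of the textbook algorithm with a hypothetical function name, not the authors' parallel implementation.

```python
def cannon_matmul(A, B):
    """Multiply two p x p matrices by simulating Cannon's algorithm."""
    p = len(A)

    # Initial alignment: row i of A is rotated left by i positions and
    # column j of B is rotated up by j positions.
    a = [[A[i][(j + i) % p] for j in range(p)] for i in range(p)]
    b = [[B[(i + j) % p][j] for j in range(p)] for i in range(p)]
    c = [[0] * p for _ in range(p)]

    for _ in range(p):
        # Every grid cell multiplies the pair it currently holds.
        for i in range(p):
            for j in range(p):
                c[i][j] += a[i][j] * b[i][j]

        # Communication step: A moves one cell left, B moves one cell up,
        # both with wrap-around.
        a = [[a[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        b = [[b[(i + 1) % p][j] for j in range(p)] for i in range(p)]
    return c


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In a real deployment each cell would hold a sub-block and the two shifts would be neighbour-to-neighbour messages, which is where the communication cost mentioned in the abstract comes from.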