Paper Group ANR 224
Defining the Collective Intelligence Supply Chain. Learning from the Kernel and the Range Space. Gaussian Mixture Generative Adversarial Networks for Diverse Datasets, and the Unsupervised Clustering of Images. Classification of load forecasting studies by forecasting problem to select load forecasting techniques and methodologies. The Concept of C …
Defining the Collective Intelligence Supply Chain
Title | Defining the Collective Intelligence Supply Chain |
Authors | Iain Barclay, Alun Preece, Ian Taylor |
Abstract | Organisations are increasingly open to scrutiny, and need to be able to prove that they operate in a fair and ethical way. Accountability should extend to the production and use of the data and knowledge assets used in AI systems, as it would for any raw material or process used in production of physical goods. This paper considers collective intelligence, comprising data and knowledge generated by crowd-sourced workforces, which can be used as core components of AI systems. A proposal is made for the development of a supply chain model for tracking the creation and use of crowdsourced collective intelligence assets, with a blockchain based decentralised architecture identified as an appropriate means of providing validation, accountability and fairness. |
Tasks | |
Published | 2018-09-25 |
URL | http://arxiv.org/abs/1809.09444v1 |
http://arxiv.org/pdf/1809.09444v1.pdf | |
PWC | https://paperswithcode.com/paper/defining-the-collective-intelligence-supply |
Repo | |
Framework | |
Learning from the Kernel and the Range Space
Title | Learning from the Kernel and the Range Space |
Authors | Kar-Ann Toh |
Abstract | In this article, a novel approach to learning a complex function that can be written as a system of linear equations is introduced. This learning is grounded upon the observation that solving a system of linear equations by manipulation in the kernel and the range space boils down to an estimation based on the least-squares error approximation. The learning approach is applied to learn a deep feedforward network with full weight connections. The numerical experiments on network learning of synthetic and benchmark data not only show the feasibility of the proposed learning approach but also provide insights into the mechanism of data representation. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09071v1 |
http://arxiv.org/pdf/1810.09071v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-the-kernel-and-the-range-space |
Repo | |
Framework | |
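The paper frames network learning as solving linear systems via the kernel and range space, which amounts to a least-squares estimate. Below is a minimal sketch of that underlying estimate for a single linear map; the sizes and data are invented, and the paper's full feedforward-network procedure is not reproduced here.

```python
# Minimal sketch: the least-squares estimate underlying a kernel/range-space
# view of solving X @ W = T. Layer sizes and data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))          # 100 samples, 8 input features
T = rng.standard_normal((100, 3))          # 3 target outputs

# Least-squares weights via the pseudo-inverse: W = X^+ T,
# i.e. projecting T onto the range space of X.
W = np.linalg.pinv(X) @ T

print("training residual:", np.linalg.norm(X @ W - T))
# The residual T - X W lies in the orthogonal complement of range(X);
# X^T (T - X W) ≈ 0 checks the normal equations.
print("normal-equation check:", np.linalg.norm(X.T @ (T - X @ W)))
```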
Gaussian Mixture Generative Adversarial Networks for Diverse Datasets, and the Unsupervised Clustering of Images
Title | Gaussian Mixture Generative Adversarial Networks for Diverse Datasets, and the Unsupervised Clustering of Images |
Authors | Matan Ben-Yosef, Daphna Weinshall |
Abstract | Generative Adversarial Networks (GANs) have been shown to produce realistic-looking synthetic images with remarkable success, yet their performance seems less impressive when the training set is highly diverse. In order to provide a better fit to the target data distribution when the dataset includes many different classes, we propose a variant of the basic GAN model, called Gaussian Mixture GAN (GM-GAN), where the probability distribution over the latent space is a mixture of Gaussians. We also propose a supervised variant which is capable of conditional sample synthesis. In order to evaluate the model’s performance, we propose a new scoring method which separately takes into account two (typically conflicting) measures: diversity vs. quality of the generated data. Through a series of empirical experiments, using both synthetic and real-world datasets, we quantitatively show that GM-GANs outperform baselines, both when evaluated using the commonly used Inception Score, and when evaluated using our own alternative scoring method. In addition, we qualitatively demonstrate how the unsupervised variant of GM-GAN tends to map latent vectors sampled from different Gaussians in the latent space to samples of different classes in the data space. We show how this phenomenon can be exploited for the task of unsupervised clustering, and provide quantitative evaluation showing the superiority of our method for the unsupervised clustering of image datasets. Finally, we demonstrate a feature which further sets our model apart from other GAN models: the option to control the quality-diversity trade-off by altering, post-training, the probability distribution of the latent space. This allows one to sample higher-quality, lower-diversity samples, or vice versa, according to one’s needs. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10356v1 |
http://arxiv.org/pdf/1808.10356v1.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-mixture-generative-adversarial |
Repo | |
Framework | |
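The defining change in GM-GAN is the latent prior: a mixture of Gaussians instead of a single Gaussian. The sketch below shows such a latent sampler with a post-hoc "temperature" that shrinks each component, in the spirit of the paper's post-training control of the quality-diversity trade-off; the number of components, their means, and the scaling are arbitrary choices here, not the paper's settings.

```python
# Illustrative Gaussian-mixture latent prior of the kind GM-GAN uses in place
# of a single Gaussian. All constants are assumptions for the sketch.
import numpy as np

rng = np.random.default_rng(0)
K, latent_dim = 10, 64
means = rng.standard_normal((K, latent_dim)) * 3.0   # well-separated component means
sigma = 1.0                                           # shared per-component std

def sample_latent(n, temperature=1.0):
    """Draw n latent vectors; temperature < 1 shrinks each component,
    trading diversity for (presumed) sample quality."""
    comps = rng.integers(0, K, size=n)
    eps = rng.standard_normal((n, latent_dim)) * sigma * temperature
    return means[comps] + eps, comps

z, comps = sample_latent(16, temperature=0.7)
# z would be fed to a generator; samples sharing a component index tend to map
# to the same data-space cluster, which is what enables unsupervised clustering.
print(z.shape, np.bincount(comps, minlength=K))
```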
Classification of load forecasting studies by forecasting problem to select load forecasting techniques and methodologies
Title | Classification of load forecasting studies by forecasting problem to select load forecasting techniques and methodologies |
Authors | Jonathan Dumas, Bertrand Cornélusse |
Abstract | The key contribution of this paper is to propose a two-dimensional classification of load forecasting studies to decide which forecasting tools to use in which case. This classification aims to provide a synthetic view of the relevant forecasting techniques and methodologies by forecasting problem. In addition, the key principles of the main techniques and methodologies used are summarized along with the reviews of these papers. The classification process relies on two pairs of parameters that define a forecasting problem. Each article is classified with key information about the dataset used and the forecasting tools implemented: the forecasting techniques (probabilistic or deterministic) and methodologies, the data cleansing techniques, and the error metrics. The articles reviewed in this paper were selected in two steps. First, a set of load forecasting studies was built from relevant load forecasting reviews and forecasting competitions. Second, the most relevant studies of this set were selected based on the following criteria: the quality of the description of the forecasting techniques and methodologies implemented, the description of the results, and the contributions. This paper can be read in two passes. The first pass identifies the forecasting problem of interest and selects the corresponding class in one of the four classification tables. Each table references all the articles classified for a given forecasting horizon, providing a synthetic view of the forecasting tools used by articles addressing similar forecasting problems. A second level, composed of four further tables, summarizes key information about the forecasting tools and the results of these studies. The second pass consists of reading the key principles of the main techniques and methodologies of interest and the reviews of the articles. |
Tasks | Load Forecasting |
Published | 2018-12-21 |
URL | https://arxiv.org/abs/1901.05052v2 |
https://arxiv.org/pdf/1901.05052v2.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-load-forecasting-studies-by |
Repo | |
Framework | |
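The classification records, among other things, whether a study's forecasts are deterministic or probabilistic and which error metrics it reports. As a concrete illustration of that distinction, the sketch below computes two standard metrics, MAPE for point forecasts and the pinball loss for quantile forecasts; these are common choices, not necessarily the metrics of any particular study in the review, and the numbers are made up.

```python
# Two standard load-forecasting error metrics: MAPE for deterministic (point)
# forecasts and the pinball loss for probabilistic (quantile) forecasts.
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error of a point forecast."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def pinball(y_true, y_quantile, tau):
    """Pinball (quantile) loss of a forecast of the tau-quantile."""
    y_true, y_quantile = np.asarray(y_true, float), np.asarray(y_quantile, float)
    diff = y_true - y_quantile
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

load = [820.0, 790.0, 845.0, 910.0]      # observed load (MW), illustrative
point = [800.0, 805.0, 850.0, 895.0]     # deterministic forecast
q90 = [860.0, 840.0, 880.0, 940.0]       # forecast of the 0.9 quantile
print("MAPE: %.2f%%" % mape(load, point))
print("pinball(0.9): %.2f" % pinball(load, q90, 0.9))
```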
The Concept of Criticality in Reinforcement Learning
Title | The Concept of Criticality in Reinforcement Learning |
Authors | Yitzhak Spielberg, Amos Azaria |
Abstract | Reinforcement learning methods carry a well-known bias-variance trade-off in n-step algorithms for optimal control. Unfortunately, this has rarely been addressed in current research. This trade-off principle holds independent of the choice of algorithm, such as n-step SARSA, n-step Expected SARSA or n-step Tree Backup. A small n results in a large bias, while a large n leads to large variance. The literature offers no straightforward recipe for the best choice of this value. While all current n-step algorithms use a fixed value of n over the state space, we extend the framework of n-step updates by allowing each state to have its own specific n. We propose a solution to this problem within the context of human-aided reinforcement learning. Our approach is based on the observation that a human can learn more efficiently if she receives input regarding the criticality of a given state and thus the amount of attention she needs to invest in learning in that state. This observation is related to the idea that each state of the MDP has a certain measure of criticality, which indicates how much the choice of the action in that state influences the return. In our algorithm, the RL agent utilizes the criticality measure, a function provided by a human trainer, to locally choose the best step number n for the update of the Q function. |
Tasks | |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.07254v1 |
http://arxiv.org/pdf/1810.07254v1.pdf | |
PWC | https://paperswithcode.com/paper/the-concept-of-criticality-in-reinforcement |
Repo | |
Framework | |
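The core idea is an n-step update whose step number is chosen per state from a trainer-provided criticality measure. The sketch below shows one plausible reading of that mechanism; the mapping from criticality to n, the toy trajectory, and the constants are assumptions for illustration, not the paper's exact rule.

```python
# Sketch: an n-step target whose n is chosen per state from a human-provided
# criticality value. The criticality-to-n mapping is an assumed example.
import numpy as np

N_MAX, GAMMA = 8, 0.99

def choose_n(criticality):
    """One plausible mapping (assumed here): higher criticality -> smaller n."""
    return max(1, int(round(N_MAX * (1.0 - criticality))))

def n_step_target(rewards, q_bootstrap, n):
    """n-step SARSA-style target: n discounted rewards plus a discounted
    bootstrap from Q(s_{t+n}, a_{t+n})."""
    n = min(n, len(rewards))
    g = sum(GAMMA**k * rewards[k] for k in range(n))
    return g + GAMMA**n * q_bootstrap

# Toy usage: a state the trainer marks as highly critical gets a short backup.
criticality = 0.9
n = choose_n(criticality)
target = n_step_target(rewards=[1.0, 0.0, 0.5, 0.2], q_bootstrap=2.3, n=n)
print(n, target)
```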
Uncalibrated Non-Rigid Factorisation by Independent Subspace Analysis
Title | Uncalibrated Non-Rigid Factorisation by Independent Subspace Analysis |
Authors | Sami Sebastian Brandt, Hanno Ackermann, Stella Grasshof |
Abstract | We propose a general, prior-free approach for the uncalibrated non-rigid structure-from-motion problem for modelling and analysis of non-rigid objects such as human faces. The word general refers to an approach that recovers the non-rigid affine structure and motion from 2D point correspondences by assuming that (1) the non-rigid shapes are generated by a linear combination of rigid 3D basis shapes, (2) the non-rigid shapes are affine in nature, i.e., they can be modelled as deviations from the mean, rigid shape, and (3) the basis shapes are statistically independent. In contrast to the majority of existing works, no prior information is assumed for the structure and motion apart from the assumption that the underlying basis shapes are statistically independent. The independent 3D shape bases are recovered by independent subspace analysis (ISA). Likewise, in contrast to most previous approaches, no calibration information is assumed for the affine cameras; the reconstruction is solved up to a global affine ambiguity, which makes our approach simple yet efficient. In the experiments, we evaluated the method on several standard data sets, including a real face expression data set of 7200 faces with 2D point correspondences and unknown 3D structure and motion, for which we obtained promising results. |
Tasks | Calibration |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09132v1 |
http://arxiv.org/pdf/1811.09132v1.pdf | |
PWC | https://paperswithcode.com/paper/uncalibrated-non-rigid-factorisation-by |
Repo | |
Framework | |
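A schematic of the two main steps the abstract describes is sketched below: an affine, rank-3K factorisation of centred 2D tracks, followed by an independent-component step to make the basis shapes statistically independent. The synthetic data, the choice of K, and the use of sklearn's FastICA as a stand-in for a full independent subspace analysis are all assumptions, not the paper's algorithm.

```python
# Schematic: affine factorisation of 2D tracks + independent-component step.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
F, P, K = 60, 40, 3                     # frames, points, number of basis shapes

W = rng.standard_normal((2 * F, P))     # stand-in for tracked 2D coordinates
W = W - W.mean(axis=1, keepdims=True)   # remove per-frame translation

# Rank-3K affine factorisation: W ≈ M @ B with motion M (2F x 3K), shapes B (3K x P).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 3 * K
M, B = U[:, :r] * s[:r], Vt[:r, :]

# Independent-component step: rotate the shape basis so its rows are as
# statistically independent as possible (FastICA here as a stand-in for ISA).
ica = FastICA(n_components=r, random_state=0, max_iter=1000)
B_indep = ica.fit_transform(B.T).T      # (3K x P) independent basis shapes
print(M.shape, B_indep.shape)
```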
Contextual Hourglass Network for Semantic Segmentation of High Resolution Aerial Imagery
Title | Contextual Hourglass Network for Semantic Segmentation of High Resolution Aerial Imagery |
Authors | Panfeng Li, Youzuo Lin, Emily Schultz-Fellenz |
Abstract | Semantic segmentation for aerial imagery is a challenging and important problem in remotely sensed imagery analysis. In recent years, with the success of deep learning, various convolutional neural network (CNN) based models have been developed. However, due to the varying sizes of the objects and imbalanced class labels, it can be challenging to obtain accurate pixel-wise semantic segmentation results. To address these challenges, we develop a novel semantic segmentation method, which we call the Contextual Hourglass Network. In our method, in order to improve the robustness of the prediction, we design a new contextual hourglass module which incorporates an attention mechanism on processed low-resolution feature maps to exploit the contextual semantics. We further exploit the stacked encoder-decoder structure by connecting multiple contextual hourglass modules end to end. This architecture can effectively extract rich multi-scale features and add more feedback loops for better learning of contextual semantics through intermediate supervision. To demonstrate the efficacy of our semantic segmentation method, we test it on the Potsdam and Vaihingen datasets. In comparisons to other baseline methods, our method yields the best overall performance. |
Tasks | Semantic Segmentation |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12813v2 |
http://arxiv.org/pdf/1810.12813v2.pdf | |
PWC | https://paperswithcode.com/paper/contextual-hourglass-network-for-semantic |
Repo | |
Framework | |
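A minimal PyTorch reading of the abstract is sketched below: an encoder-decoder ("hourglass") whose low-resolution bottleneck features pass through an attention block, with two such modules stacked end to end and a prediction head at each stage for intermediate supervision. The channel counts, the SE-style channel attention, and the stage design are assumptions made for this sketch, not the paper's exact architecture.

```python
# Minimal sketch of a stacked "contextual hourglass" with attention at low resolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style gate on the low-resolution feature map (assumed form)."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())
    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))            # global context per channel
        return x * w[:, :, None, None]

class ContextualHourglass(nn.Module):
    def __init__(self, ch, n_classes):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.attn = ChannelAttention(ch)
        self.up = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(ch, n_classes, 1)     # intermediate supervision head
    def forward(self, x):
        low = self.attn(self.down(x))               # contextual semantics at low resolution
        up = F.interpolate(self.up(low), size=x.shape[2:], mode="bilinear",
                           align_corners=False)
        out = x + up                                 # skip connection feeds the next stage
        return out, self.head(out)

stem = nn.Conv2d(3, 32, 3, padding=1)
stages = nn.ModuleList([ContextualHourglass(32, 6) for _ in range(2)])  # stacked end to end

x = torch.randn(1, 3, 128, 128)
feat, logits = stem(x), []
for stage in stages:
    feat, pred = stage(feat)
    logits.append(pred)                              # one prediction per stage for supervision
print([p.shape for p in logits])
```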
Automatic multi-objective based feature selection for classification
Title | Automatic multi-objective based feature selection for classification |
Authors | Zhiguo Zhou, Shulong Li, Genggeng Qin, Michael Folkert, Steve Jiang, Jing Wang |
Abstract | Objective: Accurately classifying the malignancy of lesions detected in a screening scan is critical for reducing false positives. Radiomics holds great potential to differentiate malignant from benign tumors by extracting and analyzing a large number of quantitative image features. Since not all radiomic features contribute to an effective classification model, selecting an optimal feature subset is critical. Methods: This work proposes a new multi-objective based feature selection (MO-FS) algorithm that considers sensitivity and specificity simultaneously as the objective functions during feature selection. For MO-FS, we developed a modified entropy-based termination criterion (METC) that stops the algorithm automatically rather than relying on a preset number of generations. We also designed a solution selection methodology for multi-objective learning that uses the evidential reasoning approach (SMOLER) to automatically select the optimal solution from the Pareto-optimal set. Furthermore, we developed an adaptive mutation operation to generate the mutation probability in MO-FS automatically. Results: We evaluated MO-FS for classifying lung nodule malignancy in low-dose CT and breast lesion malignancy in digital breast tomosynthesis. Conclusion: The experimental results demonstrated that the feature set selected by MO-FS achieved better classification performance than features selected by other commonly used methods. Significance: The proposed method is a more general and effective radiomic feature selection strategy. |
Tasks | Feature Selection |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03236v4 |
http://arxiv.org/pdf/1807.03236v4.pdf | |
PWC | https://paperswithcode.com/paper/automatic-multi-objective-based-feature |
Repo | |
Framework | |
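At the heart of MO-FS is the idea of scoring candidate feature subsets by sensitivity and specificity simultaneously and keeping the Pareto-optimal ones. The sketch below illustrates only that Pareto concept on synthetic data; MO-FS itself is an evolutionary algorithm with METC stopping, SMOLER solution selection, and adaptive mutation, none of which are reproduced here, and random candidate subsets stand in for the evolved population.

```python
# Sketch of the Pareto idea behind multi-objective feature selection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)

def sens_spec(mask):
    """Cross-validated sensitivity and specificity of a feature subset."""
    pred = cross_val_predict(LogisticRegression(max_iter=1000), X[:, mask], y, cv=5)
    tp = np.sum((pred == 1) & (y == 1)); fn = np.sum((pred == 0) & (y == 1))
    tn = np.sum((pred == 0) & (y == 0)); fp = np.sum((pred == 1) & (y == 0))
    return tp / (tp + fn), tn / (tn + fp)

candidates = [m for m in (rng.random(X.shape[1]) < 0.3 for _ in range(30)) if m.any()]
scores = [sens_spec(m) for m in candidates]

# Pareto filter: keep subsets not dominated in both sensitivity and specificity.
pareto = [i for i, s in enumerate(scores)
          if not any(o[0] >= s[0] and o[1] >= s[1] and o != s for o in scores)]
print([(i, scores[i]) for i in pareto])
```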
Revealing Fine Structures of the Retinal Receptive Field by Deep Learning Networks
Title | Revealing Fine Structures of the Retinal Receptive Field by Deep Learning Networks |
Authors | Qi Yan, Yajing Zheng, Shanshan Jia, Yichen Zhang, Zhaofei Yu, Feng Chen, Yonghong Tian, Tiejun Huang, Jian K. Liu |
Abstract | Deep convolutional neural networks (CNNs) have demonstrated impressive performance on many visual tasks. Recently, they became useful models for the visual system in neuroscience. However, it is still not clear what is learned by CNNs in terms of neuronal circuits. When a deep CNN with many layers is used for the visual system, it is not easy to compare its structural components with possible neuroscience underpinnings due to the highly complex circuits from the retina to higher visual cortex. Here we address this issue by focusing on single retinal ganglion cells with biophysical models and recording data from animals. By training CNNs with white noise images to predict neuronal responses, we found that fine structures of the retinal receptive field can be revealed. Specifically, the learned convolutional filters resemble biological components of the retinal circuit. This suggests that a CNN learning from one single retinal cell reveals a minimal neural network carried out in this cell. Furthermore, when CNNs learned from different cells are transferred between cells, there is a diversity of transfer learning performance, which indicates that CNNs are cell-specific. Moreover, when CNNs are transferred between different types of input images, here white noise vs. natural images, transfer learning shows a good performance, which implies that CNNs indeed capture the full computational ability of a single retinal cell for different inputs. Taken together, these results suggest that CNNs could be used to reveal structural components of neuronal circuits, and provide a powerful model for neural system identification. |
Tasks | Transfer Learning |
Published | 2018-11-06 |
URL | https://arxiv.org/abs/1811.02290v2 |
https://arxiv.org/pdf/1811.02290v2.pdf | |
PWC | https://paperswithcode.com/paper/revealing-fine-structures-of-the-retinal |
Repo | |
Framework | |
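The fitting setup described, training a small CNN on white-noise stimuli to predict a single cell's response and then inspecting the learned filters, can be sketched as below. The simulated cell, the architecture, the Poisson loss, and the training length are all assumptions made for the toy example, not the paper's pipeline.

```python
# Toy sketch: fit a small CNN to white-noise responses of one simulated cell,
# then read out its first-layer filters as candidate receptive-field structure.
import torch
import torch.nn as nn

torch.manual_seed(0)
stim = torch.randn(512, 1, 16, 16)                       # white-noise stimuli
true_rf = torch.zeros(1, 1, 16, 16); true_rf[0, 0, 6:10, 6:10] = 1.0
rate = torch.relu((stim * true_rf).sum(dim=(1, 2, 3)))   # simulated cell: rectified linear RF
spikes = torch.poisson(rate)

model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv2d(4, 1, kernel_size=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 16, 1), nn.Softplus(),   # non-negative predicted rate
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    pred = model(stim).squeeze(1)
    loss = (pred - spikes * torch.log(pred + 1e-6)).mean()  # Poisson negative log-likelihood
    opt.zero_grad(); loss.backward(); opt.step()

filters = model[0].weight.detach()       # first-layer filters ~ receptive-field components
print(loss.item(), filters.shape)
```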
Toward Multimodal Model-Agnostic Meta-Learning
Title | Toward Multimodal Model-Agnostic Meta-Learning |
Authors | Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim |
Abstract | Gradient-based meta-learners such as MAML are able to learn a meta-prior from similar tasks to adapt to novel tasks from the same distribution with few gradient updates. One important limitation of such frameworks is that they seek a common initialization shared across the entire task distribution, substantially limiting the diversity of the task distributions that they are able to learn from. In this paper, we augment MAML with the capability to identify tasks sampled from a multimodal task distribution and adapt quickly through gradient updates. Specifically, we propose a multimodal MAML algorithm that is able to modulate its meta-learned prior according to the identified task, allowing faster adaptation. We evaluate the proposed model on a diverse set of problems including regression, few-shot image classification, and reinforcement learning. The results demonstrate the effectiveness of our model in modulating the meta-learned prior in response to the characteristics of tasks sampled from a multimodal distribution. |
Tasks | Few-Shot Image Classification, Image Classification, Meta-Learning |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07172v1 |
http://arxiv.org/pdf/1812.07172v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-multimodal-model-agnostic-meta |
Repo | |
Framework | |
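A compact, first-order sketch of the idea follows: a task encoder reads the support set, produces FiLM-style modulation of a meta-learned prior, and MAML-style inner gradient steps then adapt the modulated network on that task. The bimodal toy task distribution, the network sizes, the modulation form, and the first-order simplification are illustrative assumptions, not the paper's exact algorithm.

```python
# First-order sketch of task-conditioned modulation + MAML-style inner adaptation.
import torch

torch.manual_seed(0)
H, INNER_LR, INNER_STEPS = 40, 0.01, 3

# Meta-learned prior (a 1-hidden-layer regressor) and the modulation encoder.
w1 = (0.1 * torch.randn(1, H)).requires_grad_(); b1 = torch.zeros(H, requires_grad=True)
w2 = (0.1 * torch.randn(H, 1)).requires_grad_(); b2 = torch.zeros(1, requires_grad=True)
enc = torch.nn.Linear(2, 2 * H)                       # support statistics -> (gamma, beta)
meta_opt = torch.optim.Adam([w1, b1, w2, b2, *enc.parameters()], lr=1e-3)

def forward(x, p, gamma, beta):
    h = torch.tanh(x @ p[0] + p[1]) * gamma + beta    # FiLM-style modulated hidden layer
    return h @ p[2] + p[3]

def sample_task():
    """Bimodal task distribution (assumed stand-in): sinusoids or lines."""
    x = torch.rand(20, 1) * 8 - 4
    if torch.rand(1) < 0.5:
        y = torch.rand(1) * 4 * torch.sin(x + torch.rand(1) * 3)
    else:
        y = torch.rand(1) * 2 * x + torch.rand(1)
    return x[:10], y[:10], x[10:], y[10:]             # support / query split

for step in range(1000):
    xs, ys, xq, yq = sample_task()
    gb = enc(torch.cat([xs, ys], dim=1).mean(0))      # identify the task mode
    gamma, beta = 1 + gb[:H], gb[H:]
    params = [w1, b1, w2, b2]
    for _ in range(INNER_STEPS):                      # inner adaptation (first-order)
        loss = ((forward(xs, params, gamma, beta) - ys) ** 2).mean()
        grads = torch.autograd.grad(loss, params)
        params = [p - INNER_LR * g.detach() for p, g in zip(params, grads)]
    meta_loss = ((forward(xq, params, gamma, beta) - yq) ** 2).mean()
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
print(float(meta_loss))
```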
Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons
Title | Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons |
Authors | Nima Anari, Constantinos Daskalakis, Wolfgang Maass, Christos H. Papadimitriou, Amin Saberi, Santosh Vempala |
Abstract | We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons. Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal’s memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their $\ell$-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11896v1 |
http://arxiv.org/pdf/1810.11896v1.pdf | |
PWC | https://paperswithcode.com/paper/smoothed-analysis-of-discrete-tensor |
Repo | |
Framework | |
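The combinatorial question posed in the abstract, reconstructing a family's Venn diagram from the sizes of its l-wise intersections, reduces to a linear system over the region sizes. The toy instance below (three arbitrary example sets, all non-empty intersections measured) makes that system square and invertible via Möbius inversion, so the diagram is recovered exactly; the perturbation analysis that handles fewer, noisier measurements is the paper's contribution and is not reproduced here.

```python
# Worked toy instance: recover Venn-region sizes from l-wise intersection sizes.
import itertools
import numpy as np

sets = [set(range(0, 60)), set(range(40, 90)), set(range(20, 70, 2))]
n, universe = len(sets), set().union(*sets)

# Unknowns: sizes of the 2^n - 1 non-empty Venn regions (membership patterns).
patterns = [p for p in itertools.product([0, 1], repeat=n) if any(p)]
true_sizes = np.array([
    len([x for x in universe if all((x in sets[i]) == bool(p[i]) for i in range(n))])
    for p in patterns])

# Measurements: |intersection of the sets indexed by S| for every non-empty S.
subsets = [s for r in range(1, n + 1) for s in itertools.combinations(range(n), r)]
b = np.array([len(set.intersection(*[sets[i] for i in S])) for S in subsets], float)

# An intersection over S counts every region whose membership pattern contains S.
M = np.array([[1.0 if all(p[i] for i in S) else 0.0 for p in patterns] for S in subsets])

recovered = np.linalg.solve(M, b)
print(np.allclose(recovered, true_sizes))   # True: the Venn diagram is reconstructed
```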
Towards Long-Term Memory for Social Robots: Proposing a New Challenge for the RoboCup@Home League
Title | Towards Long-Term Memory for Social Robots: Proposing a New Challenge for the RoboCup@Home League |
Authors | Matías Pavez, Javier Ruiz del Solar, Victoria Amo, Felix Meyer zu Driehausen |
Abstract | Long-term memory is essential to feel like a continuous being, and to be able to interact and communicate coherently. Social robots need long-term memories in order to establish long-term relationships with humans and other robots, rather than acting just for the moment. In this paper, this challenge is highlighted, open questions are identified, the need to address this challenge in the RoboCup@Home League with new tests is motivated, and a new test is proposed. |
Tasks | |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10758v1 |
http://arxiv.org/pdf/1811.10758v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-long-term-memory-for-social-robots |
Repo | |
Framework | |
Learning Item-Interaction Embeddings for User Recommendations
Title | Learning Item-Interaction Embeddings for User Recommendations |
Authors | Xiaoting Zhao, Raphael Louca, Diane Hu, Liangjie Hong |
Abstract | Industry-scale recommendation systems have become a cornerstone of the e-commerce shopping experience. For Etsy, an online marketplace with over 50 million handmade and vintage items, users come to rely on personalized recommendations to surface relevant items from its massive inventory. One hallmark of Etsy’s shopping experience is the multitude of ways in which a user can interact with an item they are interested in: they can view it, favorite it, add it to a collection, add it to cart, purchase it, etc. We hypothesize that the different ways in which a user interacts with an item indicate different kinds of intent. Consequently, a user’s recommendations should be based not only on the items from their past activity, but also on the way in which they interacted with those items. In this paper, we propose a novel method for learning interaction-based item embeddings that encode the co-occurrence patterns of not only the item itself, but also the interaction type. The learned embeddings give us a convenient way of approximating the likelihood that one item-interaction pair would co-occur with another by way of a simple inner product. Because of its computational efficiency, our model lends itself naturally as a candidate set selection method, and we evaluate it as such in an industry-scale recommendation system that serves live traffic on Etsy.com. Our experiments reveal that taking interaction type into account shows promising results in improving the accuracy of modeling user shopping behavior. |
Tasks | Recommendation Systems |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04407v1 |
http://arxiv.org/pdf/1812.04407v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-item-interaction-embeddings-for-user |
Repo | |
Framework | |
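The representation described, embeddings of (item, interaction-type) pairs whose co-occurrence likelihood is approximated by an inner product, can be sketched with an off-the-shelf skip-gram trainer. The sessions and token scheme below are invented, and gensim's Word2Vec (4.x API) merely stands in for the paper's training setup and Etsy-scale data.

```python
# Sketch: skip-gram embeddings over item-interaction tokens from user sessions.
import numpy as np
from gensim.models import Word2Vec

# Each "sentence" is one user's sequence of item-interaction pairs (toy data).
sessions = [
    ["ring1_view", "ring1_favorite", "ring2_view", "ring1_purchase"],
    ["mug7_view", "mug7_cart", "mug7_purchase", "ring2_view"],
    ["ring2_view", "ring2_favorite", "ring1_view"],
    ["mug7_view", "mug9_view", "mug9_cart"],
] * 50   # repeat so the toy corpus has enough co-occurrence statistics

model = Word2Vec(sentences=sessions, vector_size=16, window=3,
                 min_count=1, sg=1, epochs=30, seed=0)

def score(pair_a, pair_b):
    """Inner-product affinity between two item-interaction pairs."""
    return float(np.dot(model.wv[pair_a], model.wv[pair_b]))

# Candidate scoring: given that a user favourited ring1, which pairs co-occur?
print(score("ring1_favorite", "ring1_purchase"), score("ring1_favorite", "mug9_cart"))
```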
A Preliminary Study on Hyperparameter Configuration for Human Activity Recognition
Title | A Preliminary Study on Hyperparameter Configuration for Human Activity Recognition |
Authors | Kemilly Dearo Garcia, Tiago Carvalho, João Mendes-Moreira, João M. P. Cardoso, André C. P. L. F. de Carvalho |
Abstract | Human activity recognition (HAR) is a classification task that aims to classify human activities or predict human behavior by means of features extracted from sensor data. Typical HAR systems use wearable sensors and/or handheld and mobile devices with built-in sensing capabilities. Due to the widespread use of smartphones and the inclusion of various sensors in all contemporary smartphones (e.g., accelerometers and gyroscopes), they are commonly used for extracting and collecting data from sensors and even for implementing HAR systems. When using mobile devices, e.g., smartphones, HAR systems need to deal with several constraints regarding battery, computation and memory. These constraints create the need for a system capable of managing its resources while maintaining acceptable levels of classification accuracy. Moreover, several factors can influence activity recognition, such as classification models, sensor availability and the size of the data window for feature extraction, making stable accuracy difficult to achieve. In this paper, we present a semi-supervised classifier and a study regarding the influence of hyperparameter configuration on classification accuracy, depending on the user and the activities performed by each user. This study focuses on sensing data provided by the PAMAP2 dataset. Experimental results show that it is possible to maintain classification accuracy by adjusting hyperparameters, like window size and window overlap factor, depending on the user and activity performed. These experiments motivate the development of a system able to automatically adapt hyperparameter settings for the activity performed by each user. |
Tasks | Activity Recognition, Human Activity Recognition |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10956v1 |
http://arxiv.org/pdf/1810.10956v1.pdf | |
PWC | https://paperswithcode.com/paper/a-preliminary-study-on-hyperparameter |
Repo | |
Framework | |
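The two hyperparameters the study varies, window size and overlap factor, govern how a raw sensor stream is segmented into feature vectors. The sketch below shows that segmentation and a simple mean/std feature extractor; the synthetic accelerometer signal and the feature choice are placeholders for PAMAP2-style data, not the study's pipeline.

```python
# Sketch: sliding-window segmentation with configurable window size and overlap.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal((10_000, 3))          # e.g. a 3-axis accelerometer stream

def windows(data, window_size, overlap):
    """Yield fixed-length windows; overlap in [0, 1) controls the stride."""
    step = max(1, int(window_size * (1.0 - overlap)))
    for start in range(0, len(data) - window_size + 1, step):
        yield data[start:start + window_size]

def extract_features(data, window_size=256, overlap=0.5):
    """Per-window mean and standard deviation of each axis."""
    return np.array([np.concatenate([w.mean(axis=0), w.std(axis=0)])
                     for w in windows(data, window_size, overlap)])

# Different settings change both the number of training examples and,
# per the study, the attainable classification accuracy per user/activity.
for ws, ov in [(128, 0.0), (256, 0.5), (512, 0.75)]:
    print(ws, ov, extract_features(signal, ws, ov).shape)
```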
Interpreting Models by Allowing to Ask
Title | Interpreting Models by Allowing to Ask |
Authors | Sungmin Kang, David Keetae Park, Jaehyuk Chang, Jaegul Choo |
Abstract | Questions convey information about the questioner, namely what one does not know. In this paper, we propose a novel approach that allows a learning agent to ask about what it considers tricky to predict in the course of producing a final output. By analyzing when and what it asks, we can make our model more transparent and interpretable. We first develop this idea into a general framework of deep neural networks that can ask questions, which we call asking networks. A specific architecture and training process for an asking network is proposed for the task of colorization, an exemplar one-to-many task and thus one where asking questions helps the model perform accurately. Our results show that the model learns to generate meaningful questions, asks difficult questions first, and utilizes the provided hints more efficiently than baseline models. We conclude that the proposed asking framework makes the learning agent reveal its weaknesses, which poses a promising new direction in developing interpretable and interactive models. |
Tasks | Colorization |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05106v1 |
http://arxiv.org/pdf/1811.05106v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-models-by-allowing-to-ask |
Repo | |
Framework | |