Paper Group ANR 168
On Nonlinear Dimensionality Reduction, Linear Smoothing and Autoencoding. Brain Connectivity Impairments and Categorization Disabilities in Autism: A Theoretical Approach via Artificial Neural Networks. Autoencoder Based Architecture For Fast & Real Time Audio Style Transfer. Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Alg …
On Nonlinear Dimensionality Reduction, Linear Smoothing and Autoencoding
Title | On Nonlinear Dimensionality Reduction, Linear Smoothing and Autoencoding |
Authors | Daniel Ting, Michael I. Jordan |
Abstract | We develop theory for nonlinear dimensionality reduction (NLDR). A number of NLDR methods have been developed, but there is limited understanding of how these methods work and the relationships between them. There is limited basis for using existing NLDR theory for deriving new algorithms. We provide a novel framework for analysis of NLDR via a connection to the statistical theory of linear smoothers. This allows us to both understand existing methods and derive new ones. We use this connection to smoothing to show that asymptotically, existing NLDR methods correspond to discrete approximations of the solutions of sets of differential equations given a boundary condition. In particular, we can characterize many existing methods in terms of just three limiting differential operators and boundary conditions. Our theory also provides a way to assert that one method is preferable to another; indeed, we show Local Tangent Space Alignment is superior within a class of methods that assume a global coordinate chart defines an isometric embedding of the manifold. |
Tasks | Dimensionality Reduction |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02432v1 |
http://arxiv.org/pdf/1803.02432v1.pdf | |
PWC | https://paperswithcode.com/paper/on-nonlinear-dimensionality-reduction-linear |
Repo | |
Framework | |
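The abstract above ties NLDR to linear smoothers, i.e. estimators whose fitted values are a linear map of the observations, $\hat{y} = S y$. As a minimal illustration of that general notion only (not the paper's construction), the sketch below builds a Nadaraya-Watson smoother matrix with a Gaussian kernel. The function name `smoother_matrix` and the `bandwidth` parameter are hypothetical choices made for this example.

```python
import numpy as np

def smoother_matrix(x, bandwidth=0.5):
    """Return the Nadaraya-Watson smoother matrix S for 1-D inputs x.

    Illustrative assumption: Gaussian kernel with a fixed bandwidth.
    Each row of S is normalized to sum to 1, so S @ y is a weighted
    average of the observed responses y.
    """
    diffs = x[:, None] - x[None, :]            # pairwise differences, shape (n, n)
    weights = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    return weights / weights.sum(axis=1, keepdims=True)

# Usage: smooth noisy observations of a sine curve.
x = np.linspace(0.0, 3.0, 20)
y = np.sin(x) + 0.1 * np.random.default_rng(0).normal(size=x.size)
y_hat = smoother_matrix(x) @ y                 # fitted values are linear in y
```

Because every row of `S` sums to one, a constant response vector is reproduced exactly.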
Brain Connectivity Impairments and Categorization Disabilities in Autism: A Theoretical Approach via Artificial Neural Networks
Title | Brain Connectivity Impairments and Categorization Disabilities in Autism: A Theoretical Approach via Artificial Neural Networks |
Authors | Daniele Q. M. Madureira, Vera Lucia P. S. Caminha, Rogerio Salvini |
Abstract | A developmental disorder that severely damages communicative and social functions, Autism Spectrum Disorder (ASD) also presents aspects related to mental rigidity, repetitive behavior, and difficulty in abstract reasoning. Moreover, imbalances between excitatory and inhibitory brain states, in addition to cortical connectivity disruptions, are at the source of autistic behavior. Our main goal is to unveil how these local excitatory imbalances and/or disruptions of long-range brain connections are linked to the above-mentioned cognitive features. We developed a theoretical model based on Self-Organizing Maps (SOM), where a three-level artificial neural network qualitatively incorporates these kinds of alterations observed in the brains of patients with ASD. Computational simulations of our model indicate that high excitatory states or long-distance under-connectivity are at the origin of cognitive alterations such as difficulty in categorization and mental rigidity. More specifically, the enlargement of excitatory synaptic reach areas during cortical map development leads to low categorization (over-selectivity) and poor concept formation. Moreover, both the over-strengthening of local excitatory synapses and long-distance under-connectivity, although through distinct mechanisms, contribute to impaired categorization (under-selectivity) and mental rigidity. Our results indicate how, together, local and global brain connectivity alterations give rise to impaired cortical structures in distinct ways and in distinct cortical areas. These alterations would disrupt the encoding of sensory stimuli, the representation of concepts and, thus, the process of categorization, thereby imposing serious limits on mental flexibility and on the capacity for generalization in autistic reasoning. |
Tasks | |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.07020v1 |
http://arxiv.org/pdf/1811.07020v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-connectivity-impairments-and |
Repo | |
Framework | |
Autoencoder Based Architecture For Fast & Real Time Audio Style Transfer
Title | Autoencoder Based Architecture For Fast & Real Time Audio Style Transfer |
Authors | Dhruv Ramani, Samarjit Karmakar, Anirban Panda, Asad Ahmed, Pratham Tangri |
Abstract | Recently, there has been great interest in the field of audio style transfer, where stylized audio is generated by imposing the style of a reference audio on the content of a target audio. We improve on current approaches, which use neural networks to extract the content and the style of the audio signal, and propose a new autoencoder-based architecture for the task. This network generates stylized audio for a content audio in a single forward pass. The proposed architecture proves advantageous in both the quality of the audio produced and the time taken to train the network. We evaluate the network on speech signals to confirm the validity of our proposal. |
Tasks | Style Transfer |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07159v2 |
http://arxiv.org/pdf/1812.07159v2.pdf | |
PWC | https://paperswithcode.com/paper/autoencoder-based-architecture-for-fast-real |
Repo | |
Framework | |
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms
Title | Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms |
Authors | Ashok Vardhan Makkuva, Sewoong Oh, Sreeram Kannan, Pramod Viswanath |
Abstract | Mixture-of-Experts (MoE) is a widely popular model for ensemble learning and is a basic building block of highly successful modern neural networks as well as a component in Gated Recurrent Units (GRU) and Attention networks. However, present algorithms for learning MoE, including the EM algorithm and gradient descent, are known to get stuck in local optima. From a theoretical viewpoint, finding an efficient and provably consistent algorithm to learn the parameters has remained a long-standing open problem for more than two decades. In this paper, we introduce the first algorithm that learns the true parameters of a MoE model for a wide class of non-linearities with global consistency guarantees. While existing algorithms jointly or iteratively estimate the expert parameters and the gating parameters in the MoE, we propose a novel algorithm that breaks the deadlock and can directly estimate the expert parameters by sensing its echo in a carefully designed cross-moment tensor between the inputs and the output. Once the experts are known, the recovery of gating parameters still requires an EM algorithm; however, we show that the EM algorithm for this simplified problem, unlike the joint EM algorithm, converges to the true parameters. We empirically validate our algorithm on both synthetic and real data sets in a variety of settings, and show superior performance to standard baselines. |
Tasks | |
Published | 2018-02-21 |
URL | https://arxiv.org/abs/1802.07417v3 |
https://arxiv.org/pdf/1802.07417v3.pdf | |
PWC | https://paperswithcode.com/paper/breaking-the-gridlock-in-mixture-of-experts |
Repo | |
Framework | |
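To make the roles of the "expert parameters" and "gating parameters" in the abstract above concrete, here is a minimal sketch of sampling from a two-expert MoE. It assumes linear experts, a logistic gate, and Gaussian noise; these choices, the function name `sample_moe`, and all parameter names are illustrative assumptions rather than details taken from the paper, and the sketch does not implement the paper's tensor-based estimator.

```python
import numpy as np

def sample_moe(x, experts, gate, noise_std=0.1, seed=0):
    """Draw outputs from a two-expert mixture-of-experts model.

    x:       (n, d) inputs
    experts: (2, d) regression vectors, one per expert (assumed linear)
    gate:    (d,)   gating vector; expert 0 is chosen with
             probability sigmoid(x @ gate)
    Returns the noisy outputs and the index of the expert used per sample.
    """
    rng = np.random.default_rng(seed)
    p_first = 1.0 / (1.0 + np.exp(-(x @ gate)))          # gating probabilities
    choice = np.where(rng.random(len(x)) < p_first, 0, 1)
    means = np.einsum("nd,nd->n", x, experts[choice])    # per-sample expert output
    return means + noise_std * rng.normal(size=len(x)), choice

# Usage: 1000 samples in 3 dimensions.
x = np.random.default_rng(1).normal(size=(1000, 3))
y, z = sample_moe(x, experts=np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]]),
                  gate=np.array([0.0, 0.0, 2.0]))
```

Learning then means recovering `experts` and `gate` from `(x, y)` alone, without observing the expert index `z`.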
Question-Guided Hybrid Convolution for Visual Question Answering
Title | Question-Guided Hybrid Convolution for Visual Question Answering |
Authors | Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang |
Abstract | In this paper, we propose a novel Question-Guided Hybrid Convolution (QGHC) network for Visual Question Answering (VQA). Most state-of-the-art VQA methods fuse the high-level textual and visual features from the neural network and abandon the visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to convolve with visual features for capturing the textual and visual relationship in the early stage. The question-guided convolution can tightly couple the textual and visual information but also introduces more parameters when learning kernels. We apply the group convolution, which consists of question-independent kernels and question-dependent kernels, to reduce the parameter size and alleviate over-fitting. The hybrid convolution can generate discriminative multi-modal features with fewer parameters. The proposed approach is also complementary to existing bilinear pooling fusion and attention based VQA methods. By integrating with them, our method could further boost the performance. Extensive experiments on public VQA datasets validate the effectiveness of QGHC. |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-08-08 |
URL | http://arxiv.org/abs/1808.02632v1 |
http://arxiv.org/pdf/1808.02632v1.pdf | |
PWC | https://paperswithcode.com/paper/question-guided-hybrid-convolution-for-visual |
Repo | |
Framework | |
Convex Programming Based Spectral Clustering
Title | Convex Programming Based Spectral Clustering |
Authors | Tomohiko Mizutani |
Abstract | Clustering is a fundamental task in data analysis, and spectral clustering has been recognized as a promising approach to it. Given a graph describing the relationship between data, spectral clustering explores the underlying cluster structure in two stages. The first stage embeds the nodes of the graph into real space, and the second stage groups the embedded nodes into several clusters. The use of the $k$-means method in the grouping stage is currently standard practice. We present a spectral clustering algorithm that uses convex programming in the grouping stage, and study how well it works. The concept behind the algorithm design lies in the following observation. The nodes with the largest degree in each cluster may be found by computing an enclosing ellipsoid for embedded nodes in real space, and the clusters may be identified by using those nodes. We show that the observations are valid, and the algorithm returns clusters to provide the conductance of the graph, if the gap assumption, introduced by Peng et al. at COLT 2015, is satisfied. We also give an experimental assessment of the algorithm’s performance. |
Tasks | |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04246v1 |
http://arxiv.org/pdf/1805.04246v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-programming-based-spectral-clustering |
Repo | |
Framework | |
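The two-stage pipeline described in the abstract above (embed the graph nodes, then group them) can be sketched as follows. This shows the standard $k$-means baseline that the abstract calls current practice, not the paper's convex-programming grouping stage. The normalized Laplacian, the farthest-point initialization, and the function names are assumptions made for this example; the sketch also assumes a connected graph with no isolated nodes.

```python
import numpy as np

def spectral_embedding(adjacency, k):
    """Stage 1: embed nodes into R^k with the normalized graph Laplacian."""
    degrees = adjacency.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)        # assumes every degree is > 0
    laplacian = (np.eye(len(adjacency))
                 - d_inv_sqrt[:, None] * adjacency * d_inv_sqrt[None, :])
    _, vecs = np.linalg.eigh(laplacian)        # eigenvalues in ascending order
    return vecs[:, :k]                         # eigenvectors of the k smallest

def kmeans(points, k, n_iter=50):
    """Stage 2: group embedded nodes with Lloyd's k-means iterations."""
    # Deterministic farthest-point initialization (illustrative choice).
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Usage: two 4-node cliques joined by a single edge.
adj = np.zeros((8, 8))
adj[:4, :4] = 1.0
adj[4:, 4:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[3, 4] = adj[4, 3] = 1.0
labels = kmeans(spectral_embedding(adj, 2), 2)
```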
Generalization Error Bounds for Noisy, Iterative Algorithms
Title | Generalization Error Bounds for Noisy, Iterative Algorithms |
Authors | Ankit Pensia, Varun Jog, Po-Ling Loh |
Abstract | In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the generalization error of empirical risk minimization based on the mutual information $I(S;W)$ between the algorithm input $S$ and the algorithm output $W$, when the loss function is sub-Gaussian. We leverage these results to derive generalization error bounds for a broad class of iterative algorithms that are characterized by bounded, noisy updates with Markovian structure. Our bounds are very general and are applicable to numerous settings of interest, including stochastic gradient Langevin dynamics (SGLD) and variants of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm. Furthermore, our error bounds hold for any output function computed over the path of iterates, including the last iterate of the algorithm or the average of subsets of iterates, and also allow for non-uniform sampling of data in successive updates of the algorithm. |
Tasks | |
Published | 2018-01-12 |
URL | http://arxiv.org/abs/1801.04295v1 |
http://arxiv.org/pdf/1801.04295v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-error-bounds-for-noisy |
Repo | |
Framework | |
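For reference, the mutual-information bound of Xu and Raginsky (2017) that the abstract above builds on is commonly stated as follows. The notation here ($\mu$ for the data distribution, $n$ for the sample size, $\sigma$ for the sub-Gaussian parameter) is supplied for this note and is not taken from the abstract. Suppose $S = (Z_1, \dots, Z_n)$ is drawn i.i.d. from $\mu$ and the loss $\ell(w, Z)$ is $\sigma$-sub-Gaussian under $Z \sim \mu$ for every $w$. Then

$$
\bigl|\mathrm{gen}(\mu, P_{W \mid S})\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S;W)} .
$$

The bound tightens as the output $W$ carries less information about the training sample $S$, which is why noisy updates that limit $I(S;W)$ are natural candidates for this style of analysis.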
Computational Theories of Curiosity-Driven Learning
Title | Computational Theories of Curiosity-Driven Learning |
Authors | Pierre-Yves Oudeyer |
Abstract | What are the functions of curiosity? What are the mechanisms of curiosity-driven learning? We approach these questions about the living using concepts and tools from machine learning and developmental robotics. We argue that curiosity-driven learning enables organisms to make discoveries to solve complex problems with rare or deceptive rewards. By fostering exploration and discovery of a diversity of behavioural skills, and ignoring these rewards, curiosity can be efficient to bootstrap learning when there is no information, or deceptive information, about local improvement towards these problems. We also explain the key role of curiosity for efficient learning of world models. We review both normative and heuristic computational frameworks used to understand the mechanisms of curiosity in humans, conceptualizing the child as a sense-making organism. These frameworks enable us to discuss the bi-directional causal links between curiosity and learning, and to provide new hypotheses about the fundamental role of curiosity in self-organizing developmental structures through curriculum learning. We present various developmental robotics experiments that study these mechanisms in action, both supporting these hypotheses to better understand curiosity in humans and opening new research avenues in machine learning and artificial intelligence. Finally, we discuss challenges for the design of experimental paradigms for studying curiosity in psychology and cognitive neuroscience. Keywords: Curiosity, intrinsic motivation, lifelong learning, predictions, world model, rewards, free-energy principle, learning progress, machine learning, AI, developmental robotics, development, curriculum learning, self-organization. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10546v2 |
http://arxiv.org/pdf/1802.10546v2.pdf | |
PWC | https://paperswithcode.com/paper/computational-theories-of-curiosity-driven |
Repo | |
Framework | |
Nonparametric learning from Bayesian models with randomized objective functions
Title | Nonparametric learning from Bayesian models with randomized objective functions |
Authors | S. P. Lyddon, S. G. Walker, C. C. Holmes |
Abstract | Bayesian learning is built on an assumption that the model space contains a true reflection of the data generating mechanism. This assumption is problematic, particularly in complex data environments. Here we present a Bayesian nonparametric approach to learning that makes use of statistical models, but does not assume that the model is true. Our approach has provably better properties than using a parametric model and admits a Monte Carlo sampling scheme that can afford massive scalability on modern computer architectures. The model-based aspect of learning is particularly attractive for regularizing nonparametric inference when the sample size is small, and also for correcting approximate approaches such as variational Bayes (VB). We demonstrate the approach on a number of examples including VB classifiers and Bayesian random forests. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11544v2 |
http://arxiv.org/pdf/1806.11544v2.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-learning-from-bayesian-models |
Repo | |
Framework | |
Visitors to urban greenspace have higher sentiment and lower negativity on Twitter
Title | Visitors to urban greenspace have higher sentiment and lower negativity on Twitter |
Authors | Aaron J. Schwartz, Peter Sheridan Dodds, Jarlath P. M. O’Neil-Dunne, Christopher M. Danforth, Taylor H. Ricketts |
Abstract | With more people living in cities, we are witnessing a decline in exposure to nature. A growing body of research has demonstrated an association between nature contact and improved mood. Here, we used Twitter and the Hedonometer, a word analysis tool, to investigate how sentiment, or the estimated happiness of the words people write, varied before, during, and after visits to San Francisco’s urban park system. We found that sentiment was substantially higher during park visits and remained elevated for several hours following the visit. Leveraging differences in vegetative cover across park types, we explored how different types of outdoor public spaces may contribute to subjective well-being. Tweets during visits to Regional Parks, which are greener and have greater vegetative cover, exhibited larger increases in sentiment than tweets during visits to Civic Plazas and Squares. Finally, we analyzed word frequencies to explore several mechanisms theorized to link nature exposure with mental and cognitive benefits. Negation words such as ‘no’, ‘not’, and ‘don’t’ decreased in frequency during visits to urban parks. These results can be used by urban planners and public health officials to better target nature contact recommendations for growing urban populations. |
Tasks | |
Published | 2018-07-20 |
URL | https://arxiv.org/abs/1807.07982v2 |
https://arxiv.org/pdf/1807.07982v2.pdf | |
PWC | https://paperswithcode.com/paper/exposure-to-urban-parks-improves-affect-and |
Repo | |
Framework | |
Divergence Network: Graphical calculation method of divergence functions
Title | Divergence Network: Graphical calculation method of divergence functions |
Authors | Tomohiro Nishiyama |
Abstract | In this paper, we introduce directed networks called ‘divergence networks’ in order to perform graphical calculation of divergence functions. By using the divergence networks, we can easily understand the geometric meaning of calculation results and grasp relations among divergence functions intuitively. |
Tasks | |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12794v2 |
http://arxiv.org/pdf/1810.12794v2.pdf | |
PWC | https://paperswithcode.com/paper/divergence-network-graphical-calculation |
Repo | |
Framework | |
Neural Network-Based Equations for Predicting PGA and PGV in Texas, Oklahoma, and Kansas
Title | Neural Network-Based Equations for Predicting PGA and PGV in Texas, Oklahoma, and Kansas |
Authors | Farid Khosravikia, Yasaman Zeinali, Zoltan Nagy, Patricia Clayton, Ellen M. Rathje |
Abstract | Parts of Texas, Oklahoma, and Kansas have experienced increased rates of seismicity in recent years, providing new datasets of earthquake recordings to develop ground motion prediction models for this particular region of Central and Eastern North America (CENA). This paper outlines a framework for using Artificial Neural Networks (ANNs) to develop attenuation models from the ground motion recordings in this region. While attenuation models exist for the CENA, concerns over the increased rate of seismicity in this region necessitate investigation of ground motion prediction models particular to these states. To do so, an ANN-based framework is proposed to predict peak ground acceleration (PGA) and peak ground velocity (PGV) given magnitude, earthquake source-to-site distance, and shear wave velocity. In this framework, approximately 4,500 ground motions with magnitude greater than 3.0 recorded in these three states (Texas, Oklahoma, and Kansas) since 2005 are considered. Results from this study suggest that existing ground motion prediction models developed for CENA do not accurately predict the ground motion intensity measures for earthquakes in this region, especially for those with low source-to-site distances or on very soft soil conditions. The proposed ANN models provide much more accurate prediction of the ground motion intensity measures at all distances and magnitudes. The proposed ANN models are also converted to relatively simple mathematical equations so that engineers can easily use them to predict the ground motion intensity measures for future events. Finally, through a sensitivity analysis, the contributions of the predictive parameters to the prediction of the considered intensity measures are investigated. |
Tasks | Motion Prediction |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01052v1 |
http://arxiv.org/pdf/1806.01052v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-based-equations-for-predicting |
Repo | |
Framework | |
Meta-Learning for Multi-objective Reinforcement Learning
Title | Meta-Learning for Multi-objective Reinforcement Learning |
Authors | Xi Chen, Ali Ghadirzadeh, Mårten Björkman, Patric Jensfelt |
Abstract | Multi-objective reinforcement learning (MORL) is the generalization of standard reinforcement learning (RL) approaches to solve sequential decision making problems that consist of several, possibly conflicting, objectives. Generally, in such formulations, there is no single optimal policy that optimizes all the objectives simultaneously; instead, a number of policies have to be found, each optimizing a particular preference over the objectives. In other words, MORL is framed as a meta-learning problem, with the task distribution given by a distribution over the preferences. We demonstrate that such a formulation results in a better approximation of the Pareto optimal solutions in terms of both the optimality and the computational efficiency. We evaluated our method on obtaining Pareto optimal policies using a number of continuous control problems with high degrees of freedom. |
Tasks | Continuous Control, Decision Making, Meta-Learning |
Published | 2018-11-08 |
URL | https://arxiv.org/abs/1811.03376v2 |
https://arxiv.org/pdf/1811.03376v2.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-for-multi-objective |
Repo | |
Framework | |
Towards Symbolic Reinforcement Learning with Common Sense
Title | Towards Symbolic Reinforcement Learning with Common Sense |
Authors | Artur d’Avila Garcez, Aimore Resende Riquetti Dutra, Eduardo Alonso |
Abstract | Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite its success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities into deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL+CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL+CS learns consistently faster than Q-learning and DSRL, while also achieving higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL+CS achieves nearly 100% average accuracy compared to DSRL’s 70% and DQN’s 50% accuracy. To the best of our knowledge, this is the first case of near perfect zero-shot transfer learning using Reinforcement Learning. |
Tasks | Common Sense Reasoning, Q-Learning, Transfer Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08597v1 |
http://arxiv.org/pdf/1804.08597v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-symbolic-reinforcement-learning-with |
Repo | |
Framework | |
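The abstract above uses Q-learning as both a building block and a baseline. As background only, a single tabular Q-learning update can be sketched as below. This is the textbook rule, not the SRL+CS reward assignment or Q-value aggregation; the function name, the table layout (states by actions), and the default `alpha` and `gamma` values are assumptions for this example.

```python
import numpy as np

def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the table.

    q is an array of shape (n_states, n_actions). The target bootstraps
    from the greedy value of the next state.
    """
    td_target = reward + gamma * q[next_state].max()
    q[state, action] += alpha * (td_target - q[state, action])
    return q

# Usage: a 2-state, 2-action table initialized to zero.
q = np.zeros((2, 2))
q = q_learning_update(q, state=0, action=1, reward=1.0, next_state=1)
```

Starting from an all-zero table, the update moves `q[0, 1]` a fraction `alpha` of the way toward the observed reward.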
Word-Level Loss Extensions for Neural Temporal Relation Classification
Title | Word-Level Loss Extensions for Neural Temporal Relation Classification |
Authors | Artuur Leeuwenberg, Marie-Francine Moens |
Abstract | Unsupervised pre-trained word embeddings are used effectively for many tasks in natural language processing to leverage unlabeled textual data. Often these embeddings are either used as initializations or as fixed word representations for task-specific classification models. In this work, we extend our classification model’s task loss with an unsupervised auxiliary loss on the word-embedding level of the model. This is to ensure that the learned word representations contain both task-specific features, learned from the supervised loss component, and more general features learned from the unsupervised loss component. We evaluate our approach on the task of temporal relation extraction, in particular, narrative containment relation extraction from clinical records, and show that continued training of the embeddings on the unsupervised objective together with the task objective gives better task-specific embeddings, and results in an improvement over the state of the art on the THYME dataset, using only a general-domain part-of-speech tagger as linguistic resource. |
Tasks | Relation Classification, Relation Extraction, Word Embeddings |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02374v1 |
http://arxiv.org/pdf/1808.02374v1.pdf | |
PWC | https://paperswithcode.com/paper/word-level-loss-extensions-for-neural |
Repo | |
Framework | |