Paper Group ANR 953
Finite Biased Teaching with Infinite Concept Classes
Title | Finite Biased Teaching with Infinite Concept Classes |
Authors | Jose Hernandez-Orallo, Jan Arne Telle |
Abstract | We investigate the teaching of infinite concept classes through the effect of the learning bias (which is used by the learner to prefer some concepts over others and by the teacher to devise the teaching examples) and the sampling bias (which determines how the concepts are sampled from the class). We analyse two important classes: Turing machines and finite-state machines. We derive bounds for the biased teaching dimension when the learning bias is derived from a complexity measure (Kolmogorov complexity and minimal number of states respectively) and analyse the sampling distributions that lead to finite expected biased teaching dimensions. We highlight the existing trade-off between the bound and the representativeness of the sample, and its implications for the understanding of what teaching rich concepts to machines entails. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07121v1 |
PDF | http://arxiv.org/pdf/1804.07121v1.pdf |
PWC | https://paperswithcode.com/paper/finite-biased-teaching-with-infinite-concept |
Repo | |
Framework | |
GP-RVM: Genetic Programing-based Symbolic Regression Using Relevance Vector Machine
Title | GP-RVM: Genetic Programing-based Symbolic Regression Using Relevance Vector Machine |
Authors | Hossein Izadi Rad, Ji Feng, Hitoshi Iba |
Abstract | This paper proposes a hybrid basis function construction method (GP-RVM) for the Symbolic Regression (SR) problem, which combines an extended version of Genetic Programming called Kaizen Programming and Relevance Vector Machine (RVM) to evolve an optimal set of basis functions. Different from traditional evolutionary algorithms where a single individual is a complete solution, our method proposes a solution based on a linear combination of basis functions built from individuals during the evolving process. RVM, which is a sparse Bayesian kernel method, selects suitable functions to constitute the basis. RVM determines the posterior weight of a function by evaluating its quality and sparsity. The solution produced by GP-RVM is a sparse Bayesian linear model of the coefficients of many non-linear functions. Our hybrid approach is focused on nonlinear white-box models selecting the right combination of functions to build robust predictions without prior knowledge about data. Experimental results show that GP-RVM outperforms conventional methods, which suggests that it is an efficient and accurate technique for solving SR. The computational complexity of GP-RVM scales as $O(M^{3})$, where $M$ is the number of functions in the basis set and is typically much smaller than the number $N$ of training patterns. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02502v3 |
PDF | http://arxiv.org/pdf/1806.02502v3.pdf |
PWC | https://paperswithcode.com/paper/gp-rvm-genetic-programing-based-symbolic |
Repo | |
Framework | |
Deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-Free Approach
Title | Deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-Free Approach |
Authors | Lingxiao He, Jian Liang, Haiqing Li, Zhenan Sun |
Abstract | Partial person re-identification (re-id) is a challenging problem, where only several partial observations (images) of people are available for matching. However, few studies have provided flexible solutions to identifying a person in an image containing an arbitrary part of the body. In this paper, we propose a fast and accurate matching method to address this problem. The proposed method leverages a Fully Convolutional Network (FCN) to generate fixed-size spatial feature maps such that pixel-level features are consistent. To match a pair of person images of different sizes, a novel method called Deep Spatial feature Reconstruction (DSR) is further developed to avoid explicit alignment. Specifically, DSR exploits the reconstruction error from popular dictionary learning models to calculate the similarity between different spatial feature maps. In that way, we expect that the proposed FCN can decrease the similarity of coupled images from different persons and increase that from the same person. Experimental results on two partial person datasets demonstrate the efficiency and effectiveness of the proposed method in comparison with several state-of-the-art partial person re-id approaches. Additionally, DSR achieves competitive results on a benchmark person dataset Market1501 with 83.58% Rank-1 accuracy. |
Tasks | Dictionary Learning, Person Re-Identification |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.00881v3 |
PDF | http://arxiv.org/pdf/1801.00881v3.pdf |
PWC | https://paperswithcode.com/paper/deep-spatial-feature-reconstruction-for |
Repo | |
Framework | |
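The abstract above scores a pair of feature maps by how well one can be reconstructed from the other. A minimal sketch of that idea follows. It swaps in a ridge-regularised least-squares reconstruction for the dictionary learning model, and the function name `reconstruction_distance`, the regulariser `lam`, and the random feature matrices are hypothetical choices made only for illustration, not the authors' implementation.

```python
import numpy as np

def reconstruction_distance(X, Y, lam=1e-3):
    """Mean squared error of reconstructing the columns of X from the columns of Y.

    X: (d, n) query spatial features, one column per spatial block (assumed layout).
    Y: (d, m) gallery spatial features.
    Ridge regression stands in for the dictionary learning model here.
    """
    gram = Y.T @ Y + lam * np.eye(Y.shape[1])
    W = np.linalg.solve(gram, Y.T @ X)        # (m, n) reconstruction coefficients
    residual = X - Y @ W
    return float(np.sum(residual ** 2) / X.shape[1])

rng = np.random.default_rng(0)
gallery = rng.normal(size=(8, 5))             # 5 blocks from a full image
partial_same = gallery[:, :3]                 # 3 blocks taken from the same image
partial_other = rng.normal(size=(8, 3))       # 3 blocks from an unrelated image

err_same = reconstruction_distance(partial_same, gallery)
err_other = reconstruction_distance(partial_other, gallery)
```

Because the query may hold fewer blocks than the gallery, no spatial alignment between the two maps is needed: a lower reconstruction error is read as higher similarity.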
Is Q-learning Provably Efficient?
Title | Is Q-learning Provably Efficient? |
Authors | Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I. Jordan |
Abstract | Model-free reinforcement learning (RL) algorithms, such as Q-learning, directly parameterize and update value functions or policies without explicitly modeling the environment. They are typically simpler, more flexible to use, and thus more prevalent in modern deep RL than model-based approaches. However, empirical work has suggested that model-free algorithms may require more samples to learn [Deisenroth and Rasmussen 2011, Schulman et al. 2015]. The theoretical question of “whether model-free algorithms can be made sample efficient” is one of the most fundamental questions in RL, and remains unsolved even in the basic scenario with finitely many states and actions. We prove that, in an episodic MDP setting, Q-learning with UCB exploration achieves regret $\tilde{O}(\sqrt{H^3 SAT})$, where $S$ and $A$ are the numbers of states and actions, $H$ is the number of steps per episode, and $T$ is the total number of steps. This sample efficiency matches the optimal regret that can be achieved by any model-based approach, up to a single $\sqrt{H}$ factor. To the best of our knowledge, this is the first analysis in the model-free setting that establishes $\sqrt{T}$ regret without requiring access to a “simulator.” |
Tasks | Q-Learning |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03765v1 |
PDF | http://arxiv.org/pdf/1807.03765v1.pdf |
PWC | https://paperswithcode.com/paper/is-q-learning-provably-efficient |
Repo | |
Framework | |
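The abstract above states the regret bound but not the update rule. The sketch below shows one way tabular episodic Q-learning with a count-based UCB bonus can be written. The learning rate `(H + 1) / (H + t)`, the bonus `c * sqrt(H**3 * iota / t)`, the constants `c` and `p`, and the toy arrays `P` and `R` are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def q_learning_ucb(P, R, H, episodes, c=1.0, p=0.05, seed=0):
    """Tabular episodic Q-learning with an optimism bonus (illustrative sketch).

    P: (S, A, S) transition probabilities, R: (S, A) rewards in [0, 1].
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    T = H * episodes
    iota = np.log(S * A * T / p)              # log factor used in the bonus (assumed form)
    Q = np.full((H, S, A), float(H))          # optimistic initialisation
    V = np.zeros((H + 1, S))
    V[:H] = H                                 # V[H] stays 0: no reward after the last step
    N = np.zeros((H, S, A), dtype=int)
    total_reward = 0.0
    for _ in range(episodes):
        s = 0                                 # fixed start state (assumption)
        for h in range(H):
            a = int(np.argmax(Q[h, s]))       # greedy w.r.t. the optimistic Q
            r = R[s, a]
            s_next = int(rng.choice(S, p=P[s, a]))
            N[h, s, a] += 1
            t = N[h, s, a]
            alpha = (H + 1) / (H + t)
            bonus = c * np.sqrt(H ** 3 * iota / t)
            Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * (r + V[h + 1, s_next] + bonus)
            V[h, s] = min(H, Q[h, s].max())   # clip the value estimate at H
            total_reward += r
            s = s_next
    return Q, V, N, total_reward

# Hypothetical 2-state, 2-action MDP: action 1 tends to move to state 1, which pays reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.9, 0.1], [0.2, 0.8]]])
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
Q, V, N, total_reward = q_learning_ucb(P, R, H=3, episodes=200)
```

The agent never builds a model of `P`; exploration comes only from the bonus, which shrinks as the visit count `t` grows.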
Smooth Inter-layer Propagation of Stabilized Neural Networks for Classification
Title | Smooth Inter-layer Propagation of Stabilized Neural Networks for Classification |
Authors | Jingfeng Zhang, Laura Wynter |
Abstract | Recent work has studied the reasons for the remarkable performance of deep neural networks in image classification. We examine batch normalization on the one hand and the dynamical systems view of residual networks on the other hand. Our goal is to understand the notions of stability and smoothness of the inter-layer propagation of ResNets so as to explain when they contribute to significantly enhanced performance. We postulate that such stability is of importance for the trained ResNet to transfer. |
Tasks | Image Classification |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10315v2 |
PDF | http://arxiv.org/pdf/1809.10315v2.pdf |
PWC | https://paperswithcode.com/paper/smooth-inter-layer-propagation-of-stabilized |
Repo | |
Framework | |
How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG
Title | How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG |
Authors | Paul Trichelair, Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung |
Abstract | Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challenge (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty. |
Tasks | Common Sense Reasoning |
Published | 2018-11-05 |
URL | https://arxiv.org/abs/1811.01778v2 |
PDF | https://arxiv.org/pdf/1811.01778v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-evaluation-of-common-sense-reasoning |
Repo | |
Framework | |
Automatic speech recognition for launch control center communication using recurrent neural networks with data augmentation and custom language model
Title | Automatic speech recognition for launch control center communication using recurrent neural networks with data augmentation and custom language model |
Authors | Kyongsik Yun, Joseph Osborne, Madison Lee, Thomas Lu, Edward Chow |
Abstract | Transcribing voice communications in NASA’s launch control center is important for information utilization. However, automatic speech recognition in this environment is particularly challenging due to the lack of training data, unfamiliar words in acronyms, multiple different speakers and accents, and conversational characteristics of speaking. We used bidirectional deep recurrent neural networks to train and test speech recognition performance. We showed that data augmentation and custom language models can improve speech recognition accuracy. Transcribing communications from the launch control center will help the machine analyze information and accelerate knowledge generation. |
Tasks | Data Augmentation, Language Modelling, Speech Recognition |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09552v1 |
PDF | http://arxiv.org/pdf/1804.09552v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-speech-recognition-for-launch |
Repo | |
Framework | |
DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation
Title | DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation |
Authors | Rim Assouel, Mohamed Ahmed, Marwin H Segler, Amir Saffari, Yoshua Bengio |
Abstract | Generating novel molecules with optimal properties is a crucial step in many industries such as drug discovery. Recently, deep generative models have shown a promising way of performing de-novo molecular design. Although graph generative models are currently available, they either have a graph size dependency in their number of parameters, limiting their use to only very small graphs, or are formulated as a sequence of discrete actions needed to construct a graph, making the output graph non-differentiable w.r.t. the model parameters and therefore preventing their use in scenarios such as conditional graph generation. In this work we propose a model for conditional graph generation that is computationally efficient and enables direct optimisation of the graph. We demonstrate favourable performance of our model on prototype-based molecular graph conditional generation tasks. |
Tasks | Drug Discovery, Graph Generation |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09766v1 |
PDF | http://arxiv.org/pdf/1811.09766v1.pdf |
PWC | https://paperswithcode.com/paper/defactor-differentiable-edge-factorization |
Repo | |
Framework | |
Predictive Collective Variable Discovery with Deep Bayesian Models
Title | Predictive Collective Variable Discovery with Deep Bayesian Models |
Authors | Markus Schöberl, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis |
Abstract | Extending spatio-temporal scale limitations of models for complex atomistic systems considered in biochemistry and materials science necessitates the development of enhanced sampling methods. The potential acceleration in exploring the configurational space by enhanced sampling methods depends on the choice of collective variables (CVs). In this work, we formulate the discovery of CVs as a Bayesian inference problem and consider the CVs as hidden generators of the full-atomistic trajectory. The ability to generate samples of the fine-scale atomistic configurations using limited training data allows us to compute estimates of observables as well as our probabilistic confidence on them. The methodology is based on emerging methodological advances in machine learning and variational inference. The discovered CVs are related to physicochemical properties which are essential for understanding mechanisms especially in unexplored complex systems. We provide a quantitative assessment of the CVs in terms of their predictive ability for alanine dipeptide (ALA-2) and ALA-15 peptide. |
Tasks | Bayesian Inference |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06913v2 |
PDF | http://arxiv.org/pdf/1809.06913v2.pdf |
PWC | https://paperswithcode.com/paper/predictive-collective-variable-discovery-with |
Repo | |
Framework | |
Data-Driven Clustering via Parameterized Lloyd’s Families
Title | Data-Driven Clustering via Parameterized Lloyd’s Families |
Authors | Maria-Florina Balcan, Travis Dick, Colin White |
Abstract | Algorithms for clustering points in metric spaces are a long-studied area of research. Clustering has seen a multitude of work both theoretically, in understanding the approximation guarantees possible for many objective functions such as k-median and k-means clustering, and experimentally, in finding the fastest algorithms and seeding procedures for Lloyd’s algorithm. The performance of a given clustering algorithm depends on the specific application at hand, and this may not be known up front. For example, a “typical instance” may vary depending on the application, and different clustering heuristics perform differently depending on the instance. In this paper, we define an infinite family of algorithms generalizing Lloyd’s algorithm, with one parameter controlling the initialization procedure, and another parameter controlling the local search procedure. This family of algorithms includes the celebrated k-means++ algorithm, as well as the classic farthest-first traversal algorithm. We design efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and learn a near-optimal clustering algorithm from the class. We show that the best parameters vary significantly across datasets such as MNIST, CIFAR, and mixtures of Gaussians. Our learned algorithms never perform worse than k-means++, and on some datasets we see significant improvements. |
Tasks | |
Published | 2018-09-19 |
URL | https://arxiv.org/abs/1809.06987v3 |
PDF | https://arxiv.org/pdf/1809.06987v3.pdf |
PWC | https://paperswithcode.com/paper/data-driven-clustering-via-parameterized |
Repo | |
Framework | |
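The abstract above describes a single initialization parameter whose family contains both k-means++ and farthest-first traversal. One way such a family can be sketched is to sample each new seed with probability proportional to its distance to the nearest chosen seed raised to a power `alpha`. The exponent form, the function name `d_alpha_seeding`, and the toy data are assumptions for illustration; the abstract does not spell out the parameterization.

```python
import numpy as np

def d_alpha_seeding(X, k, alpha, seed=0):
    """Pick k seed indices; each new seed is drawn with probability ~ d(x, seeds)**alpha.

    alpha = 0 gives uniform seeding, alpha = 2 gives k-means++ style seeding,
    and alpha = inf gives farthest-first traversal (assumed parameterization).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centers = [int(rng.integers(n))]
    d = np.linalg.norm(X - X[centers[0]], axis=1)   # distance to nearest chosen seed
    for _ in range(k - 1):
        if np.isinf(alpha):
            nxt = int(np.argmax(d))                 # deterministic farthest point
        else:
            w = d ** alpha
            if w.sum() == 0:                        # all points coincide with a seed
                w = np.ones(n)
            nxt = int(rng.choice(n, p=w / w.sum()))
        centers.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return centers

# Three well-separated pairs of points; indices 0-1, 2-3, 4-5 form the clusters.
X = np.array([[0.0, 0.0], [0.1, 0.0],
              [10.0, 0.0], [10.1, 0.0],
              [0.0, 10.0], [0.1, 10.0]])
ff_centers = d_alpha_seeding(X, k=3, alpha=np.inf)
pp_centers = d_alpha_seeding(X, k=3, alpha=2.0)
```

The chosen seeds would then be passed to a Lloyd-style local search; learning the best `alpha` for an application amounts to comparing clustering cost across sampled instances.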
Efficient nonmyopic active search with applications in drug and materials discovery
Title | Efficient nonmyopic active search with applications in drug and materials discovery |
Authors | Shali Jiang, Gustavo Malkomes, Benjamin Moseley, Roman Garnett |
Abstract | Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In this paper, we approach this problem in a Bayesian decision framework. We first derive the Bayesian optimal policy under a natural utility, and establish a theoretical hardness of active search, proving that the optimal policy cannot be approximated for any constant ratio. We also study the batch setting for the first time, where a batch of $b>1$ points can be queried at each iteration. We give an asymptotic lower bound, linear in batch size, on the adaptivity gap: how much we could lose if we query $b$ points at a time for $t$ iterations, instead of one point at a time for $bt$ iterations. We then introduce a novel approach to nonmyopic approximations of the optimal policy that admits efficient computation. Our proposed policy can automatically trade off exploration and exploitation, without relying on any tuning parameters. We also generalize our policy to the batch setting, and propose two approaches to tackle the combinatorial search challenge. We evaluate our proposed policies on a large database of drug discovery and materials science. Results demonstrate the superior performance of our proposed policy in both sequential and batch settings; the nonmyopic behavior is also illustrated in various aspects. |
Tasks | Drug Discovery |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08871v2 |
PDF | http://arxiv.org/pdf/1811.08871v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-nonmyopic-active-search-with |
Repo | |
Framework | |
DCASE 2018 Challenge: Solution for Task 5
Title | DCASE 2018 Challenge: Solution for Task 5 |
Authors | Jeremy Chew, Yingxiang Sun, Lahiru Jayasinghe, Chau Yuen |
Abstract | To address Task 5 in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 challenge, we propose an ensemble learning system. The proposed system consists of three different models, based on convolutional neural networks and long short-term memory recurrent neural networks. With extracted features such as spectrograms and mel-frequency cepstral coefficients from different channels, the proposed system can classify different domestic activities effectively. Experimental results obtained from the provided development dataset show that an F1-score of 92.19% can be achieved. Compared with the baseline system, our proposed system improves the F1-score by 7.69%. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04618v1 |
PDF | http://arxiv.org/pdf/1812.04618v1.pdf |
PWC | https://paperswithcode.com/paper/dcase-2018-challenge-solution-for-task-5 |
Repo | |
Framework | |
Action Learning for 3D Point Cloud Based Organ Segmentation
Title | Action Learning for 3D Point Cloud Based Organ Segmentation |
Authors | Xia Zhong, Mario Amrehn, Nishant Ravikumar, Shuqing Chen, Norbert Strobel, Annette Birkhold, Markus Kowarschik, Rebecca Fahrig, Andreas Maier |
Abstract | We propose a novel point cloud based 3D organ segmentation pipeline utilizing deep Q-learning. In order to preserve shape properties, the learning process is guided using a statistical shape model. The trained agent directly predicts piece-wise linear transformations for all vertices in each iteration. This mapping between the ideal transformation for an object outline estimation is learned based on image features. To this end, we introduce aperture features that extract gray values by sampling the 3D volume within the cone centered around the associated vertex and its normal vector. Our approach is also capable of estimating a hierarchical pyramid of non-rigid deformations for multi-resolution meshes. In the application phase, we use a marginal approach to gradually estimate affine as well as non-rigid transformations. We performed extensive evaluations to highlight the robust performance of our approach on a variety of challenge data as well as clinical data. Our method has a run time ranging from 0.3 to 2.7 seconds to segment each organ. In addition, we show that the proposed method can be applied to different organs, X-ray based modalities, and scanning protocols without the need for transfer learning. As we learn actions, even unseen reference meshes can be processed as demonstrated in an example with the Visible Human. From this we conclude that our method is robust, and we believe that it can be successfully applied to many more applications, in particular in the interventional imaging space. |
Tasks | Q-Learning, Transfer Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05724v1 |
PDF | http://arxiv.org/pdf/1806.05724v1.pdf |
PWC | https://paperswithcode.com/paper/action-learning-for-3d-point-cloud-based |
Repo | |
Framework | |
DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences
Title | DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences |
Authors | Ingoo Lee, Jongsoo Keum, Hojung Nam |
Abstract | Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors are shown to be not informative enough to predict accurate DTIs. Thus, in this study, we employ a convolutional neural network (CNN) on raw protein sequences to capture local residue patterns participating in DTIs. With CNN on protein sequences, our model performs better than previous protein descriptor-based models. In addition, our model performs better than the previous deep learning model for massive prediction of DTIs. By examining the pooled convolution results, we found that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. |
Tasks | Drug Discovery |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02114v1 |
PDF | http://arxiv.org/pdf/1811.02114v1.pdf |
PWC | https://paperswithcode.com/paper/deepconv-dti-prediction-of-drug-target |
Repo | |
Framework | |
Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors
Title | Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors |
Authors | Pilailuck Panphattarasap, Andrew Calway |
Abstract | We describe a novel approach to image based localisation in urban environments using semantic matching between images and a 2-D map. It contrasts with the vast majority of existing approaches which use image to image database matching. We use highly compact binary descriptors to represent semantic features at locations, significantly increasing scalability compared with existing methods and having the potential for greater invariance to variable imaging conditions. The approach is also more akin to human map reading, making it more suited to human-system interaction. The binary descriptors indicate the presence or absence of semantic features relating to buildings and road junctions in discrete viewing directions. We use CNN classifiers to detect the features in images and match descriptor estimates with a database of location tagged descriptors derived from the 2-D map. In isolation, the descriptors are not sufficiently discriminative, but when concatenated sequentially along a route, their combination becomes highly distinctive and allows localisation even when using imperfect classifiers. Performance is further improved by taking into account left or right turns over a route. Experimental results obtained using Google StreetView and OpenStreetMap data show that the approach has considerable potential, achieving localisation accuracy of around 85% using routes corresponding to approximately 200 meters. |
Tasks | |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00788v1 |
PDF | http://arxiv.org/pdf/1803.00788v1.pdf |
PWC | https://paperswithcode.com/paper/automated-map-reading-image-based |
Repo | |
Framework | |
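The abstract above notes that single binary descriptors are weak but become distinctive once concatenated along a route. A minimal sketch of that matching step follows. The 4-bit descriptor length, the use of Hamming distance, the function name `localise`, and the toy routes are assumptions for illustration and do not reproduce the authors' pipeline.

```python
import numpy as np

def localise(query_route, map_routes):
    """Return the index of the map route closest to the query in Hamming distance.

    query_route: list of per-location binary descriptors estimated from images.
    map_routes: list of candidate routes, each a list of descriptors from the 2-D map.
    """
    query = np.concatenate(query_route)
    dists = [int(np.count_nonzero(query != np.concatenate(route)))
             for route in map_routes]
    return int(np.argmin(dists)), dists

# Hypothetical 4-bit descriptors (e.g. junction ahead/behind, building gap left/right).
route_a = [np.array([1, 0, 0, 1]), np.array([0, 0, 1, 1]), np.array([1, 1, 0, 0])]
route_b = [np.array([1, 0, 0, 1]), np.array([1, 1, 1, 0]), np.array([0, 0, 0, 1])]
route_c = [np.array([0, 1, 1, 0]), np.array([0, 0, 1, 1]), np.array([0, 1, 0, 1])]

# Query: route_a observed through an imperfect classifier (one bit flipped).
query = [np.array([1, 0, 0, 1]), np.array([0, 1, 1, 1]), np.array([1, 1, 0, 0])]
best, dists = localise(query, [route_a, route_b, route_c])
```

The first location alone cannot separate `route_a` from `route_b`, since both start with the same descriptor; the concatenated sequence can, even with one misclassified bit.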