May 6, 2019

3106 words 15 mins read

Paper Group ANR 324


The SP Theory of Intelligence as a Foundation for the Development of a General, Human-Level Thinking Machine

Title The SP Theory of Intelligence as a Foundation for the Development of a General, Human-Level Thinking Machine
Authors J Gerard Wolff
Abstract This paper summarises how the “SP theory of intelligence” and its realisation in the “SP computer model” simplifies and integrates concepts across artificial intelligence and related areas, and thus provides a promising foundation for the development of a general, human-level thinking machine, in accordance with the main goal of research in artificial general intelligence. The key to this simplification and integration is the powerful concept of “multiple alignment”, borrowed and adapted from bioinformatics. This concept has the potential to be the “double helix” of intelligence, with as much significance for human-level intelligence as DNA has for the biological sciences. Strengths of the SP system include: versatility in the representation of diverse kinds of knowledge; versatility in aspects of intelligence (including strengths in unsupervised learning; the processing of natural language; pattern recognition at multiple levels of abstraction that is robust in the face of errors in data; several kinds of reasoning, such as one-step ‘deductive’ reasoning, chains of reasoning, abductive reasoning, reasoning with probabilistic networks and trees, reasoning with ‘rules’, nonmonotonic reasoning and reasoning with default values, and Bayesian reasoning with ‘explaining away’; planning; problem solving; and more); seamless integration of diverse kinds of knowledge and diverse aspects of intelligence in any combination; and potential for application in several areas (including helping to solve nine problems with big data; helping to develop human-level intelligence in autonomous robots; serving as a database with intelligence and with versatility in the representation and integration of several forms of knowledge; serving as a vehicle for medical knowledge and as an aid to medical diagnosis; and several more).
Tasks Medical Diagnosis
Published 2016-12-22
URL http://arxiv.org/abs/1612.07555v1
PDF http://arxiv.org/pdf/1612.07555v1.pdf
PWC https://paperswithcode.com/paper/the-sp-theory-of-intelligence-as-a-foundation
Repo
Framework

Do They All Look the Same? Deciphering Chinese, Japanese and Koreans by Fine-Grained Deep Learning

Title Do They All Look the Same? Deciphering Chinese, Japanese and Koreans by Fine-Grained Deep Learning
Authors Yu Wang, Haofu Liao, Yang Feng, Xiangyang Xu, Jiebo Luo
Abstract We study to what extent Chinese, Japanese and Korean faces can be classified and which facial attributes offer the most important cues. First, we propose a novel way of obtaining large numbers of facial images with nationality labels. Then we train state-of-the-art neural networks with these labeled images. We are able to achieve an accuracy of 75.03% in the classification task, with chance being 33.33% and human accuracy 38.89%. Further, we train multiple facial attribute classifiers to identify the most distinctive features for each group. We find that Chinese, Japanese and Koreans do exhibit substantial differences in certain attributes, such as bangs, smiling, and bushy eyebrows. Along the way, we uncover several gender-related cross-country patterns as well. Our work, which complements existing APIs such as Microsoft Cognitive Services and Face++, could find potential applications in tourism, e-commerce, social media marketing, criminal justice and even counter-terrorism.
Tasks
Published 2016-10-06
URL http://arxiv.org/abs/1610.01854v2
PDF http://arxiv.org/pdf/1610.01854v2.pdf
PWC https://paperswithcode.com/paper/do-they-all-look-the-same-deciphering-chinese
Repo
Framework

Convergence of Contrastive Divergence with Annealed Learning Rate in Exponential Family

Title Convergence of Contrastive Divergence with Annealed Learning Rate in Exponential Family
Authors Bai Jiang, Tung-yu Wu, Wing H. Wong
Abstract In our recent paper, we showed that in the exponential family, contrastive divergence (CD) with a fixed learning rate gives asymptotically consistent estimates \cite{wu2016convergence}. In this paper, we establish the consistency and convergence rate of CD with an annealed learning rate $\eta_t$. Specifically, suppose CD-$m$ generates the sequence of parameters $\{\theta_t\}_{t \ge 0}$ using an i.i.d. data sample $\mathbf{X}_1^n \sim p_{\theta^*}$ of size $n$; then $\delta_n(\mathbf{X}_1^n) = \limsup_{t \to \infty} \Vert \sum_{s=t_0}^t \eta_s \theta_s / \sum_{s=t_0}^t \eta_s - \theta^* \Vert$ converges in probability to 0 at a rate of $1/\sqrt[3]{n}$. The number $m$ of MCMC transitions in CD affects only the coefficient factor of the convergence rate. Our proof is not a simple extension of the one in \cite{wu2016convergence}, which depends critically on the fact that $\{\theta_t\}_{t \ge 0}$ is a homogeneous Markov chain conditional on the observed sample $\mathbf{X}_1^n$. Under an annealed learning rate, the homogeneous Markov property is not available, and we have to develop an alternative approach based on super-martingales. Experimental results of CD on a fully visible $2\times 2$ Boltzmann machine are provided to demonstrate our theoretical results.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06220v1
PDF http://arxiv.org/pdf/1605.06220v1.pdf
PWC https://paperswithcode.com/paper/convergence-of-contrastive-divergence-with
Repo
Framework
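
To make the averaged-iterate quantity concrete, here is a minimal NumPy sketch (not the authors' code) of CD-1 with an annealed learning rate on a two-spin toy model, simpler than the paper's $2\times 2$ machine; it tracks the $\eta$-weighted average $\sum_s \eta_s \theta_s / \sum_s \eta_s$ whose convergence the paper analyses. The rate schedule and sample size are illustrative choices.

```python
# A minimal sketch (not the authors' code): CD-1 with annealed learning rate
# on a two-spin toy model p(s1, s2) ~ exp(w * s1 * s2), tracking the
# eta-weighted average of the iterates -- the quantity whose distance to
# the true parameter the paper bounds.
import numpy as np

rng = np.random.default_rng(0)
w_true, n = 0.5, 2000

# i.i.d. sample from p_{w_true}: s1 uniform on {-1,+1}, s2 | s1 logistic.
s1 = rng.choice([-1.0, 1.0], size=n)
s2 = np.where(rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * w_true * s1)), 1.0, -1.0)
data_corr = np.mean(s1 * s2)  # sufficient statistic E[s1 * s2]

def cd_gradient(w, m=1):
    """CD-m gradient: data correlation minus the correlation after m Gibbs
    sweeps of a chain restarted at the data sample."""
    b1, b2 = s1.copy(), s2.copy()
    for _ in range(m):
        b1 = np.where(rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * w * b2)), 1.0, -1.0)
        b2 = np.where(rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * w * b1)), 1.0, -1.0)
    return data_corr - np.mean(b1 * b2)

w, num, den = 0.0, 0.0, 0.0
for t in range(1, 10_001):
    eta = t ** -0.6               # annealed learning rate eta_t = t^(-0.6)
    w += eta * cd_gradient(w)
    num += eta * w                # running eta-weighted average (t0 = 1)
    den += eta
print(f"weighted-average estimate {num / den:.3f}, true w {w_true}")
```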

Ontology Driven Disease Incidence Detection on Twitter

Title Ontology Driven Disease Incidence Detection on Twitter
Authors Mark Abraham Magumba, Peter Nabende
Abstract In this work we address the issue of generic automated disease-incidence monitoring on Twitter. We employ an ontology of disease-related concepts and use it to obtain a conceptual representation of tweets. Unlike previous keyword-based systems and topic-modeling approaches, our ontological approach allows us to apply more stringent criteria for determining which messages are relevant, such as spatial and temporal characteristics, whilst giving a stronger guarantee that the resulting models will perform well on new data that may be lexically divergent. We achieve this by training learners on concepts rather than individual words. For training we use a dataset containing mentions of influenza and Listeria, and we use the learned models to classify datasets containing mentions of an arbitrary selection of other diseases. We show that our ontological approach achieves good performance on this task using a variety of natural language processing techniques. We also show that word vectors can be learned directly from our concepts to achieve even better results.
Tasks
Published 2016-11-21
URL http://arxiv.org/abs/1611.06671v1
PDF http://arxiv.org/pdf/1611.06671v1.pdf
PWC https://paperswithcode.com/paper/ontology-driven-disease-incidence-detection
Repo
Framework
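
As a toy illustration of training on concepts rather than words, the sketch below maps tweet tokens to ontology concepts before vectorization. The term-to-concept table, labels and example tweets are invented stand-ins, not the authors' ontology or data.

```python
# A toy sketch (illustrative stand-ins, not the authors' resources) of
# classifying tweets over ontology concepts rather than words, so lexically
# new disease names still map onto concept features seen during training.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical term -> concept map distilled from a disease ontology.
CONCEPTS = {
    "flu": "DISEASE", "influenza": "DISEASE", "listeria": "DISEASE",
    "fever": "SYMPTOM", "cough": "SYMPTOM", "nausea": "SYMPTOM",
    "caught": "AFFLICTION_EVENT", "recovering": "AFFLICTION_EVENT",
}

def to_concepts(tweet):
    # Unmapped tokens become "O" and are ignored by the vectorizer.
    return " ".join(CONCEPTS.get(w, "O") for w in tweet.lower().split())

train = ["I caught the flu and have a fever", "flu season playlist is out"]
labels = [1, 0]  # 1 = reports an actual disease incidence

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit([to_concepts(t) for t in train], labels)

# A lexically divergent disease still hits the same concept features.
print(clf.predict([to_concepts("caught listeria and nausea all week")]))
```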

Fast Kronecker product kernel methods via generalized vec trick

Title Fast Kronecker product kernel methods via generalized vec trick
Authors Antti Airola, Tapio Pahikkala
Abstract The Kronecker product kernel provides the standard approach in the kernel methods literature for learning from graph data, where edges are labeled and both start and end vertices have their own feature representations. These methods generalize to new edges whose start and end vertices do not appear in the training data, a setting known as zero-shot or zero-data learning. Such a setting occurs in numerous applications, including drug-target interaction prediction, collaborative filtering and information retrieval. Efficient training algorithms based on the so-called vec trick, which makes use of the special structure of the Kronecker product, are known for the case where the training data form a complete bipartite graph. In this work we generalize these results to non-complete training graphs. This allows us to derive a general framework for training Kronecker product kernel methods; as specific examples, we implement Kronecker ridge regression and support vector machine algorithms. Experimental results demonstrate that the proposed approach leads to accurate models, while allowing order-of-magnitude improvements in training and prediction time.
Tasks Information Retrieval
Published 2016-01-07
URL http://arxiv.org/abs/1601.01507v3
PDF http://arxiv.org/pdf/1601.01507v3.pdf
PWC https://paperswithcode.com/paper/fast-kronecker-product-kernel-methods-via
Repo
Framework
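
The classical vec trick the paper generalizes can be verified in a few lines of NumPy: the identity $(\mathbf{B}^T \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{X}) = \mathrm{vec}(\mathbf{A}\mathbf{X}\mathbf{B})$ avoids ever forming the Kronecker product. The subsampling step at the end is only a naive stand-in for the paper's generalized trick for non-complete graphs, not its algorithm.

```python
# A minimal sketch of the classical vec trick the paper generalizes:
# (B^T kron A) vec(X) = vec(A X B), turning a huge Kronecker matrix-vector
# product into two small matrix products.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))   # start-vertex kernel block
B = rng.standard_normal((5, 2))   # end-vertex kernel block
X = rng.standard_normal((4, 5))

def vec(M):
    return M.reshape(-1, order="F")  # column-major vectorization

slow = np.kron(B.T, A) @ vec(X)      # explicit Kronecker product: O((mn)^2)
fast = vec(A @ X @ B)                # vec trick: no Kronecker matrix at all
assert np.allclose(slow, fast)

# For a non-complete graph only some (row, col) label pairs exist; the
# paper's generalized trick evaluates just those entries. A naive stand-in:
rows = np.array([0, 2, 2])           # observed start-vertex indices
cols = np.array([1, 0, 1])           # observed end-vertex indices
Y = A @ X @ B                        # (3, 2): predictions for all edges
print(Y[rows, cols])                 # keep only the labeled edges
```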

A Rapid Pattern-Recognition Method for Driving Types Using Clustering-Based Support Vector Machines

Title A Rapid Pattern-Recognition Method for Driving Types Using Clustering-Based Support Vector Machines
Authors Wenshuo Wang, Junqiang Xi
Abstract A rapid pattern-recognition approach to characterize drivers’ curve-negotiating behavior is proposed. To shorten the recognition time and improve the recognition of driving styles, a k-means clustering-based support vector machine (kMC-SVM) method is developed and used for classifying drivers into two types: aggressive and moderate. First, vehicle speed and throttle opening are treated as the feature parameters that reflect driving style. Second, to discriminate driver curve-negotiating behaviors and reduce the number of support vectors, the k-means clustering method is used to extract and gather the two types of driving data, shortening the recognition time. Then, based on the clustering results, a support vector machine is used to generate the hyperplane for judging and predicting to which type the human driver belongs. Lastly, to verify the validity of the kMC-SVM method, a cross-validation experiment is designed and conducted. The results show that kMC-SVM is an effective method for classifying driving styles in a short time, compared with the standard SVM method.
Tasks
Published 2016-05-22
URL http://arxiv.org/abs/1605.06742v1
PDF http://arxiv.org/pdf/1605.06742v1.pdf
PWC https://paperswithcode.com/paper/a-rapid-pattern-recognition-method-for
Repo
Framework
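
A rough reconstruction of the kMC-SVM recipe: compress each class with k-means, then fit the SVM on the cluster centers so far fewer candidate support vectors remain. The synthetic speed/throttle data, the cluster count and the kernel settings are assumptions, not the authors' choices.

```python
# A rough sketch (a reconstruction, not the authors' code) of kMC-SVM:
# k-means compresses each driving style, then an SVM is fit on the centers.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Toy stand-ins for the two feature channels (vehicle speed, throttle opening).
aggressive = rng.normal([80, 0.7], 0.1 * np.array([80, 0.7]), size=(500, 2))
moderate = rng.normal([50, 0.3], 0.1 * np.array([50, 0.3]), size=(500, 2))

k = 20  # clusters per driving style (illustrative)
centers, labels = [], []
for cls, data in enumerate([moderate, aggressive]):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
    centers.append(km.cluster_centers_)
    labels.append(np.full(k, cls))

X = np.vstack(centers)
y = np.concatenate(labels)
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)  # hyperplane on 40 points
print(clf.predict([[78, 0.65]]))  # expect 1 (aggressive)
```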

AP16-OL7: A Multilingual Database for Oriental Languages and A Language Recognition Baseline

Title AP16-OL7: A Multilingual Database for Oriental Languages and A Language Recognition Baseline
Authors Dong Wang, Lantian Li, Difei Tang, Qing Chen
Abstract We present the AP16-OL7 database, which was released as the training and test data for the oriental language recognition (OLR) challenge at APSIPA 2016. Based on the database, a baseline system was constructed using the i-vector model. We report the baseline results evaluated on the various metrics defined by the AP16-OLR evaluation plan and demonstrate that AP16-OL7 is a reasonable data resource for multilingual research.
Tasks
Published 2016-09-27
URL http://arxiv.org/abs/1609.08445v1
PDF http://arxiv.org/pdf/1609.08445v1.pdf
PWC https://paperswithcode.com/paper/ap16-ol7-a-multilingual-database-for-oriental
Repo
Framework

Con-Patch: When a Patch Meets its Context

Title Con-Patch: When a Patch Meets its Context
Authors Yaniv Romano, Michael Elad
Abstract Measuring the similarity between patches in images is a fundamental building block in various tasks. Naturally, the patch size has a major impact on the matching quality and on the consequent application performance. Under the assumption that our patch database is sufficiently sampled, using large patches (e.g., 21-by-21) should be preferred over small ones (e.g., 7-by-7). However, this “dense-sampling” assumption is rarely true; in most cases large patches cannot find relevant nearby examples. This phenomenon is a consequence of the curse of dimensionality, which states that the database size should grow exponentially with the patch size to ensure proper matches. This explains the favored choice of a small patch size in most applications. Is there a way to keep the simplicity of working with small patches while getting some of the benefits that large patches provide? In this work we offer such an approach. We propose to concatenate the regular content of a conventional (small) patch with a compact representation of its (large) surroundings - its context. Therefore, with a minor increase in dimension (e.g., 10 additional values in the patch representation), we implicitly/softly describe the information of a large patch. The additional descriptors are computed from the self-similarity behavior of the patch’s surroundings. We show that this approach achieves better matches, compared to the use of conventional-size patches, without the need to increase the database size. The effectiveness of the proposed method is tested on three distinct problems: (i) external natural image denoising, (ii) depth image super-resolution, and (iii) motion-compensated frame-rate up-conversion.
Tasks Denoising, Image Denoising, Image Super-Resolution, Super-Resolution
Published 2016-03-22
URL http://arxiv.org/abs/1603.06812v3
PDF http://arxiv.org/pdf/1603.06812v3.pdf
PWC https://paperswithcode.com/paper/con-patch-when-a-patch-meets-its-context
Repo
Framework
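
A simplified sketch of the con-patch idea: a small patch is concatenated with a few soft self-similarity scores computed against its larger surroundings. The probe sampling, the Gaussian weighting and the parameter values below are illustrative assumptions, not the paper's exact descriptor.

```python
# A simplified sketch (assumption-laden, not the paper's exact descriptor):
# augment a small patch with a handful of self-similarity scores against
# patches sampled from its larger surroundings, softly encoding the context.
import numpy as np

def con_patch(image, y, x, p=7, context=21, n_probes=10, h=10.0):
    """Return [patch pixels, n_probes context self-similarity values].
    Assumes the patch and its full context window lie inside the image."""
    r, c = p // 2, context // 2
    patch = image[y - r:y + r + 1, x - r:x + r + 1].ravel()
    rng = np.random.default_rng(0)
    sims = []
    for _ in range(n_probes):
        # Random probe patch fully contained in the context window.
        dy, dx = rng.integers(-c + r, c - r + 1, size=2)
        probe = image[y + dy - r:y + dy + r + 1, x + dx - r:x + dx + r + 1]
        dist2 = np.sum((patch - probe.ravel()) ** 2)
        sims.append(np.exp(-dist2 / h ** 2))   # soft self-similarity score
    return np.concatenate([patch, sims])

img = np.random.default_rng(3).random((64, 64))
desc = con_patch(img, 32, 32)
print(desc.shape)  # (7*7 + 10,) = (59,) versus 21*21 = 441 for a large patch
```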

New Trends in Neutrosophic Theory and Applications

Title New Trends in Neutrosophic Theory and Applications
Authors Florentin Smarandache, Surapati Pramanik
Abstract Neutrosophic theory and its applications have been expanding in all directions at an astonishing rate, especially since the introduction of the journal Neutrosophic Sets and Systems. New theories, techniques, and algorithms have been rapidly developed. One of the most striking trends in neutrosophic theory is the hybridization of the neutrosophic set with other potential sets, such as the rough set, bipolar set, soft set, and hesitant fuzzy set. Different hybrid structures, such as the rough neutrosophic set, single-valued neutrosophic rough set, bipolar neutrosophic set, and single-valued neutrosophic hesitant fuzzy set, have been proposed in the literature within a short period of time. The neutrosophic set has become an important tool in areas including data mining, decision making, e-learning, engineering, medicine, and social science. The book New Trends in Neutrosophic Theories and Applications focuses on theories, methods, and algorithms for decision making, as well as applications involving neutrosophic information. Topics include data mining, decision making, e-learning, graph theory, medical diagnosis, probability theory, and topology.
Tasks Decision Making, Medical Diagnosis
Published 2016-11-23
URL http://arxiv.org/abs/1611.08555v1
PDF http://arxiv.org/pdf/1611.08555v1.pdf
PWC https://paperswithcode.com/paper/new-trends-in-neutrosophic-theory-and
Repo
Framework

RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

Title RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting
Authors Andrzej Pacuk, Piotr Sankowski, Karol Węgrzycki, Adam Witkowski, Piotr Wygocki
Abstract We present Mim-Solution’s approach to the RecSys Challenge 2016, which ranked 2nd. The goal of the competition was to prepare job recommendations for the users of the website Xing.com. Our two-phase algorithm consists of candidate selection followed by candidate ranking. We ranked the candidates by the predicted probability that the user will positively interact with the job offer, using Gradient Boosting Decision Trees as the regression tool.
Tasks
Published 2016-12-03
URL http://arxiv.org/abs/1612.00959v1
PDF http://arxiv.org/pdf/1612.00959v1.pdf
PWC https://paperswithcode.com/paper/recsys-challenge-2016-job-recommendations
Repo
Framework
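
A schematic of the two-phase pipeline described above: only the structure (cheap candidate preselection, then gradient-boosting ranking by predicted interaction probability) follows the abstract; the features, the preselection heuristic and the model settings are placeholders.

```python
# A schematic sketch (not Mim-Solution's code) of the two-phase pipeline:
# cheap candidate preselection, then ranking candidates by a gradient
# boosting model's predicted probability of a positive interaction.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(4)
n_train = 2000
X_train = rng.random((n_train, 8))                  # stand-in features
y_train = (rng.random(n_train) < 0.1).astype(int)   # positive interaction?

ranker = GradientBoostingClassifier(n_estimators=100).fit(X_train, y_train)

def recommend(user_vec, offers, top_k=5):
    # Phase 1: preselect candidates with a cheap heuristic (a dot product
    # against the user vector stands in for the real preselection filters).
    scores = offers @ user_vec
    candidates = offers[np.argsort(scores)[-50:]]
    # Phase 2: rank candidates by predicted interaction probability.
    proba = ranker.predict_proba(candidates)[:, 1]
    return candidates[np.argsort(proba)[::-1][:top_k]]

offers = rng.random((1000, 8))
print(recommend(rng.random(8), offers).shape)  # (5, 8): top-5 job offers
```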

Statistical mechanics of the inverse Ising problem and the optimal objective function

Title Statistical mechanics of the inverse Ising problem and the optimal objective function
Authors Johannes Berg
Abstract The inverse Ising problem seeks to reconstruct the parameters of an Ising Hamiltonian on the basis of spin configurations sampled from the Boltzmann measure. Over the last decade, many applications of the inverse Ising problem have arisen, driven by the advent of large-scale data across different scientific disciplines. Recently, strategies to solve the inverse Ising problem based on convex optimisation have proven to be very successful. These approaches maximise particular objective functions with respect to the model parameters. Examples are the pseudolikelihood method and interaction screening. In this paper, we establish a link between approaches to the inverse Ising problem based on convex optimisation and the statistical physics of disordered systems. We characterise the performance of an arbitrary objective function and calculate the objective function which optimally reconstructs the model parameters. We evaluate the optimal objective function within a replica-symmetric ansatz and compare the results of the optimal objective function with other reconstruction methods. Apart from giving a theoretical underpinning to solving the inverse Ising problem by convex optimisation, the optimal objective function outperforms state-of-the-art methods, albeit by a small margin.
Tasks
Published 2016-11-14
URL http://arxiv.org/abs/1611.04281v4
PDF http://arxiv.org/pdf/1611.04281v4.pdf
PWC https://paperswithcode.com/paper/statistical-mechanics-of-the-inverse-ising
Repo
Framework
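
For context, the pseudolikelihood method mentioned in the abstract reduces to one logistic regression per spin, since each spin's conditional distribution is logistic in the others. Below is a minimal sketch of that standard baseline, not the paper's optimal objective function; the coupling scale and sample counts are illustrative.

```python
# A compact sketch of pseudolikelihood reconstruction (the standard baseline
# the paper compares against, not its optimal objective): each spin's
# conditional law is logistic in the other spins, so maximizing the
# pseudolikelihood is logistic regression per spin.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
N, n_samples = 5, 10_000
J = np.triu(rng.normal(0, 0.3, (N, N)), 1)
J = J + J.T                                   # symmetric couplings, zero diag

# Gibbs-sample spin configurations from the Boltzmann measure
# (no burn-in or thinning; fine for a sketch).
s = rng.choice([-1.0, 1.0], size=N)
samples = np.empty((n_samples, N))
for t in range(n_samples):
    for i in range(N):
        p = 1.0 / (1.0 + np.exp(-2.0 * J[i] @ s))  # P(s_i = +1 | rest)
        s[i] = 1.0 if rng.random() < p else -1.0
    samples[t] = s

# Logistic regression of spin 0 on the remaining spins recovers J[0, 1:]
# up to a factor of 2 in this +/-1 spin convention.
lr = LogisticRegression(C=1e4).fit(samples[:, 1:], samples[:, 0])
print(np.round(lr.coef_[0] / 2.0, 2), "vs true", np.round(J[0, 1:], 2))
```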

Structured Dropout for Weak Label and Multi-Instance Learning and Its Application to Score-Informed Source Separation

Title Structured Dropout for Weak Label and Multi-Instance Learning and Its Application to Score-Informed Source Separation
Authors Sebastian Ewert, Mark B. Sandler
Abstract Many success stories involving deep neural networks are instances of supervised learning, where available labels power gradient-based learning methods. Creating such labels, however, can be expensive and thus there is increasing interest in weak labels which only provide coarse information, with uncertainty regarding time, location or value. Using such labels often leads to considerable challenges for the learning process. Current methods for weak-label training often employ standard supervised approaches that additionally reassign or prune labels during the learning process. The information gain, however, is often limited as only the importance of labels where the network already yields reasonable results is boosted. We propose treating weak-label training as an unsupervised problem and use the labels to guide the representation learning to induce structure. To this end, we propose two autoencoder extensions: class activity penalties and structured dropout. We demonstrate the capabilities of our approach in the context of score-informed source separation of music.
Tasks Representation Learning
Published 2016-09-15
URL http://arxiv.org/abs/1609.04557v2
PDF http://arxiv.org/pdf/1609.04557v2.pdf
PWC https://paperswithcode.com/paper/structured-dropout-for-weak-label-and-multi
Repo
Framework
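
One plausible reading of structured dropout (an interpretation, not the authors' exact model): partition the hidden units into per-class groups and zero out whole groups, so the weak labels shape which units may encode each class. All sizes and the masking scheme below are assumptions.

```python
# An illustrative sketch (one reading of the idea, not the authors' model):
# hidden units are partitioned into groups, one per weak-label class, and
# structured dropout zeroes whole groups -- those whose class is inactive
# in the current example, plus randomly chosen ones during training.
import numpy as np

rng = np.random.default_rng(6)
n_in, n_classes, units_per_class = 20, 4, 8
W_enc = rng.normal(0, 0.1, (n_classes * units_per_class, n_in))
W_dec = rng.normal(0, 0.1, (n_in, n_classes * units_per_class))

def forward(x, active_classes, p_drop=0.5, train=True):
    h = np.maximum(0.0, W_enc @ x)                  # ReLU encoder
    # Weak label mask: units of inactive classes are always zeroed.
    mask = np.repeat(active_classes, units_per_class).astype(float)
    if train:
        keep = rng.random(n_classes) >= p_drop      # drop whole groups
        mask *= np.repeat(keep, units_per_class)
    return W_dec @ (h * mask)                       # masked reconstruction

x = rng.random(n_in)
weak_label = np.array([1, 0, 1, 0])  # classes 0 and 2 active in this frame
print(forward(x, weak_label).shape)  # (20,): reconstruction of the input
```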

Much Ado About Time: Exhaustive Annotation of Temporal Data

Title Much Ado About Time: Exhaustive Annotation of Temporal Data
Authors Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta
Abstract Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input image takes a negligible amount of time to perceive. In contrast, we investigate and determine the most cost-effective way of obtaining high-quality multi-label annotations for temporal data such as videos. Watching even a short 30-second video clip requires a significant time investment from a crowd worker; thus, requesting multiple annotations following a single viewing is an important cost-saving strategy. But how many questions should we ask per video? We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments). We demonstrate that while workers may not correctly answer all questions, the cost-benefit analysis nevertheless favors consensus from multiple such cheap-yet-imperfect iterations over more complex alternatives. When compared with a one-question-per-video baseline, our method achieves a 10% improvement in recall (76.7% ours versus 66.7% baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline). We demonstrate the effectiveness of our method by collecting multi-label annotations of 157 human activities on 1,815 videos.
Tasks
Published 2016-07-25
URL http://arxiv.org/abs/1607.07429v2
PDF http://arxiv.org/pdf/1607.07429v2.pdf
PWC https://paperswithcode.com/paper/much-ado-about-time-exhaustive-annotation-of
Repo
Framework

Modeling Industrial ADMET Data with Multitask Networks

Title Modeling Industrial ADMET Data with Multitask Networks
Authors Steven Kearnes, Brian Goldman, Vijay Pande
Abstract Deep learning methods such as multitask neural networks have recently been applied to ligand-based virtual screening and other drug discovery applications. Using a set of industrial ADMET datasets, we compare neural networks to standard baseline models and analyze multitask learning effects with both random cross-validation and a more relevant temporal validation scheme. We confirm that multitask learning can provide modest benefits over single-task models and show that smaller datasets tend to benefit more than larger datasets from multitask learning. Additionally, we find that adding massive amounts of side information is not guaranteed to improve performance relative to simpler multitask learning. Our results emphasize that multitask effects are highly dataset-dependent, suggesting the use of dataset-specific models to maximize overall performance.
Tasks Drug Discovery
Published 2016-06-28
URL http://arxiv.org/abs/1606.08793v3
PDF http://arxiv.org/pdf/1606.08793v3.pdf
PWC https://paperswithcode.com/paper/modeling-industrial-admet-data-with-multitask
Repo
Framework
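
A generic multitask architecture of the kind the abstract describes, sketched in PyTorch: a shared trunk over molecular fingerprint features with one output head per ADMET task. The sizes and layer choices are assumptions, not the paper's configuration.

```python
# A generic sketch (not the paper's architecture) of a multitask network:
# a shared trunk feeding one output head per ADMET task, trained jointly
# so that the tasks share a learned representation.
import torch
import torch.nn as nn

class MultitaskNet(nn.Module):
    def __init__(self, n_features=1024, n_tasks=10, hidden=512):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One small head per task; all heads share the trunk.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, 1) for _ in range(n_tasks)]
        )

    def forward(self, x):
        z = self.trunk(x)
        return torch.cat([head(z) for head in self.heads], dim=1)

model = MultitaskNet()
x = torch.randn(32, 1024)   # batch of molecular fingerprint vectors
logits = model(x)           # (32, 10): one column per task; in practice the
print(logits.shape)         # per-task loss is masked to measured compounds
```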

Learning Genomic Representations to Predict Clinical Outcomes in Cancer

Title Learning Genomic Representations to Predict Clinical Outcomes in Cancer
Authors Safoora Yousefi, Congzheng Song, Nelson Nauata, Lee Cooper
Abstract Genomics is rapidly transforming medical practice and basic biomedical research, providing insights into disease mechanisms and improving therapeutic strategies, particularly in cancer. The ability to predict the future course of a patient’s disease from high-dimensional genomic profiling will be essential in realizing the promise of genomic medicine, but presents significant challenges for state-of-the-art survival analysis methods. In this abstract we present an investigation into learning genomic representations with neural networks to predict patient survival in cancer. We demonstrate the advantages of this approach over existing survival analysis methods using brain tumor data.
Tasks Survival Analysis
Published 2016-09-27
URL http://arxiv.org/abs/1609.08663v1
PDF http://arxiv.org/pdf/1609.08663v1.pdf
PWC https://paperswithcode.com/paper/learning-genomic-representations-to-predict
Repo
Framework
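
The abstract does not specify how the network is coupled to survival outcomes; a common choice, shown here purely as an assumption, is to minimise the negative Cox partial log-likelihood of the network's risk scores over the observed events.

```python
# A hedged sketch of one common way to wire a network to survival data
# (an assumption; the abstract does not specify the authors' loss):
# minimize the negative Cox partial log-likelihood of the risk scores.
import torch
import torch.nn as nn

def cox_loss(risk, time, event):
    """Negative Cox partial log-likelihood (ties ignored for simplicity).
    risk: (n,) network outputs; time: (n,) follow-up; event: (n,) 1=death."""
    order = torch.argsort(time, descending=True)  # risk sets become prefixes
    risk, event = risk[order], event[order]
    log_cumsum = torch.logcumsumexp(risk, dim=0)  # log-sum over each risk set
    return -((risk - log_cumsum) * event).sum() / event.sum()

net = nn.Sequential(nn.Linear(200, 64), nn.ReLU(), nn.Linear(64, 1))
x = torch.randn(128, 200)                 # genomic feature vectors
time = torch.rand(128)                    # follow-up times
event = (torch.rand(128) < 0.6).float()   # observed-death indicator

loss = cox_loss(net(x).squeeze(1), time, event)
loss.backward()                           # gradients for joint training
print(float(loss))
```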