Paper Group ANR 587
Uncertainty in Multitask Transfer Learning
Title | Uncertainty in Multitask Transfer Learning |
Authors | Alexandre Lacoste, Boris Oreshkin, Wonchang Chung, Thomas Boquet, Negar Rostamzadeh, David Krueger |
Abstract | Using variational Bayes neural networks, we develop an algorithm capable of accumulating knowledge into a prior from multiple different tasks. The result is a rich and meaningful prior capable of few-shot learning on new tasks. The posterior can go beyond the mean field approximation and yields good uncertainty on the performed experiments. Analysis on toy tasks shows that it can learn from significantly different tasks while finding similarities among them. Experiments on Mini-Imagenet yield a new state of the art with 74.5% accuracy on 5-shot learning. Finally, we provide experiments showing that other existing methods can fail to perform well in different benchmarks. |
Tasks | Few-Shot Learning, Transfer Learning |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07528v3, http://arxiv.org/pdf/1806.07528v3.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-in-multitask-transfer-learning |
Repo | |
Framework | |
Multi-view Point Cloud Registration with Adaptive Convergence Threshold and its Application on 3D Model Retrieval
Title | Multi-view Point Cloud Registration with Adaptive Convergence Threshold and its Application on 3D Model Retrieval |
Authors | Yaochen Li, Ying Liu, Rui Sun, Rui Guo, Li Zhu, Yong Qi |
Abstract | Multi-view point cloud registration is a hot topic in the communities of multimedia technology and artificial intelligence (AI). In this paper, we propose a framework to reconstruct 3D models using a multi-view point cloud registration algorithm with an adaptive convergence threshold, and subsequently apply it to 3D model retrieval. The iterative closest point (ICP) algorithm is combined with the motion average algorithm for the registration of multi-view point clouds. After the registration process, we design applications for 3D model retrieval. The geometric saliency map is computed based on the vertex curvature. The test facial triangle is then generated based on the saliency map and compared with the standard facial triangle. The face and non-face models are then discriminated. The experiments and comparisons demonstrate the effectiveness of the proposed framework. |
Tasks | Point Cloud Registration |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10026v2, http://arxiv.org/pdf/1811.10026v2.pdf |
PWC | https://paperswithcode.com/paper/multi-view-point-cloud-registration-with |
Repo | |
Framework | |
Deep Knowledge Tracing and Dynamic Student Classification for Knowledge Tracing
Title | Deep Knowledge Tracing and Dynamic Student Classification for Knowledge Tracing |
Authors | Sein Minn, Yi Yu, Michel C. Desmarais, Feida Zhu, Jill Jenn Vie |
Abstract | In Intelligent Tutoring Systems (ITS), tracing the student’s knowledge state during learning has been studied for several decades in order to provide more supportive learning instructions. In this paper, we propose a novel model for knowledge tracing that i) captures students’ learning ability and dynamically assigns students into distinct groups with similar ability at regular time intervals, and ii) combines this information with a Recurrent Neural Network architecture known as Deep Knowledge Tracing. Experimental results confirm that the proposed model is significantly better at predicting student performance than well-known state-of-the-art techniques for student modelling. |
Tasks | Knowledge Tracing |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08713v1, http://arxiv.org/pdf/1809.08713v1.pdf |
PWC | https://paperswithcode.com/paper/deep-knowledge-tracing-and-dynamic-student |
Repo | |
Framework | |
Q-learning with Nearest Neighbors
Title | Q-learning with Nearest Neighbors |
Authors | Devavrat Shah, Qiaomin Xie |
Abstract | We consider model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernel, when only a single sample path under an arbitrary policy of the system is available. We consider the Nearest Neighbor Q-Learning (NNQL) algorithm to learn the optimal Q function using the nearest neighbor regression method. As the main contribution, we provide tight finite sample analysis of the convergence rate. In particular, for MDPs with a $d$-dimensional state space and discount factor $\gamma \in (0,1)$, given an arbitrary sample path with “covering time” $L$, we establish that the algorithm is guaranteed to output an $\varepsilon$-accurate estimate of the optimal Q-function using $\tilde{O}\big(L/(\varepsilon^3(1-\gamma)^7)\big)$ samples. For instance, for a well-behaved MDP, the covering time of the sample path under the purely random policy scales as $\tilde{O}\big(1/\varepsilon^d\big)$, so the sample complexity scales as $\tilde{O}\big(1/\varepsilon^{d+3}\big)$. Indeed, we establish a lower bound that argues that the dependence of $\tilde{\Omega}\big(1/\varepsilon^{d+2}\big)$ is necessary. |
Tasks | Q-Learning |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.03900v2, http://arxiv.org/pdf/1802.03900v2.pdf |
PWC | https://paperswithcode.com/paper/q-learning-with-nearest-neighbors |
Repo | |
Framework | |
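The idea in the abstract above, Q-learning on a continuous state space with a nearest-neighbor lookup, can be illustrated with a minimal sketch. This is not the authors' NNQL algorithm: it is a simplified online variant, and the toy MDP (`toy_step`), the evenly spaced `centers`, the `1/count` learning rate, and the value `gamma=0.5` are all assumptions made for illustration.

```python
import random

def nearest(centers, s):
    """Index of the center closest to state s (1-D nearest-neighbor lookup)."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - s))

def toy_step(s, a, rng):
    """Hypothetical 1-D MDP on [0, 1]: action 1 moves right, action 0 moves left."""
    drift = 0.1 if a == 1 else -0.1
    s_next = min(1.0, max(0.0, s + drift + rng.uniform(-0.02, 0.02)))
    reward = 1.0 if s_next >= 0.9 else 0.0
    return s_next, reward

def nn_q_learning(step, centers, n_actions, n_steps, gamma=0.5, seed=0):
    """Learn Q on a finite set of centers from one sample path under a random policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in centers]
    counts = [[0] * n_actions for _ in centers]
    s = rng.random()
    for _ in range(n_steps):
        a = rng.randrange(n_actions)          # purely random behaviour policy
        s_next, r = step(s, a, rng)
        i, j = nearest(centers, s), nearest(centers, s_next)
        counts[i][a] += 1
        lr = 1.0 / counts[i][a]               # decaying step size (assumption)
        target = r + gamma * max(q[j])        # bootstrap from the nearest center
        q[i][a] += lr * (target - q[i][a])
        s = s_next
    return q

centers = [(i + 0.5) / 10 for i in range(10)]
q = nn_q_learning(toy_step, centers, n_actions=2, n_steps=20000)
```

In this toy setting the learned values near the rewarding region should prefer moving right, e.g. `q[8][1] > q[8][0]` for the center at 0.85.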
New Results on Multi-Step Traffic Flow Prediction
Title | New Results on Multi-Step Traffic Flow Prediction |
Authors | Arief Koesdwiady, Fakhri Karray |
Abstract | In its simplest form, the traffic flow prediction problem is restricted to predicting a single time-step into the future. Multi-step traffic flow prediction extends this set-up to the case where predicting multiple time-steps into the future based on some finite history is of interest. This problem is significantly more difficult than its single-step variant and is known to suffer from degradation in predictions as the time step increases. In this paper, two approaches to improve multi-step traffic flow prediction performance in recursive and multi-output settings are introduced. In particular, a model is introduced that allows recursive prediction approaches to take into account the temporal context, in terms of the time-step index, when making predictions. In addition, a conditional generative adversarial network-based data augmentation method is proposed to improve prediction performance in the multi-output setting. The experiments on a real-world traffic flow dataset show that the two methods improve on multi-step traffic flow prediction in recursive and multi-output settings, respectively. |
Tasks | Data Augmentation |
Published | 2018-03-04 |
URL | http://arxiv.org/abs/1803.01365v2, http://arxiv.org/pdf/1803.01365v2.pdf |
PWC | https://paperswithcode.com/paper/new-results-on-multi-step-traffic-flow |
Repo | |
Framework | |
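The two settings named in the abstract above, recursive and multi-output prediction, differ only in how the forecast horizon is covered. The sketch below shows that difference; the function names and the toy predictors are hypothetical, and passing the step index `k` to the one-step model is only a loose stand-in for the "temporal context" the abstract mentions, not the paper's model.

```python
from typing import Callable, List, Sequence

def recursive_forecast(
    one_step: Callable[[List[float], int], float],
    history: Sequence[float],
    horizon: int,
) -> List[float]:
    """Apply a one-step model repeatedly, feeding each prediction back as input."""
    window = list(history)
    preds = []
    for k in range(horizon):
        y = one_step(window, k)        # k is the time-step index within the horizon
        preds.append(y)
        window = window[1:] + [y]      # slide the window over the new prediction
    return preds

def multi_output_forecast(
    multi_step: Callable[[List[float]], Sequence[float]],
    history: Sequence[float],
    horizon: int,
) -> List[float]:
    """Predict the whole horizon in a single call to a multi-output model."""
    preds = list(multi_step(list(history)))
    if len(preds) != horizon:
        raise ValueError("model must return exactly `horizon` values")
    return preds

# Toy predictors (assumptions): linear extrapolation from the last two points.
def extrapolate(window: List[float], k: int) -> float:
    return window[-1] + (window[-1] - window[-2])

def extrapolate_all(window: List[float]) -> List[float]:
    slope = window[-1] - window[-2]
    return [window[-1] + slope * (i + 1) for i in range(3)]

recursive = recursive_forecast(extrapolate, [1.0, 2.0, 3.0], horizon=3)
direct = multi_output_forecast(extrapolate_all, [1.0, 2.0, 3.0], horizon=3)
```

In the recursive setting, errors in early predictions feed into later ones, which is one reason prediction quality tends to degrade as the horizon grows.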
Extended Affinity Propagation: Global Discovery and Local Insights
Title | Extended Affinity Propagation: Global Discovery and Local Insights |
Authors | Rayyan Ahmad Khan, Rana Ali Amjad, Martin Kleinsteuber |
Abstract | We propose a new clustering algorithm, Extended Affinity Propagation, based on pairwise similarities. Extended Affinity Propagation is developed by modifying Affinity Propagation such that the desirable features of Affinity Propagation, e.g., exemplars, reasonable computational complexity and no need to specify number of clusters, are preserved while the shortcomings, e.g., the lack of global structure discovery, that limit the applicability of Affinity Propagation are overcome. Extended Affinity Propagation succeeds not only in achieving this goal but can also provide various additional insights into the internal structure of the individual clusters, e.g., refined confidence values, relative cluster densities and local cluster strength in different regions of a cluster, which are valuable for an analyst. We briefly discuss how these insights can help in easily tuning the hyperparameters. We also illustrate these desirable features and the performance of Extended Affinity Propagation on various synthetic and real world datasets. |
Tasks | |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04459v2, http://arxiv.org/pdf/1803.04459v2.pdf |
PWC | https://paperswithcode.com/paper/clustering-with-simultaneous-local-and-global |
Repo | |
Framework | |
A GPU-based WFST Decoder with Exact Lattice Generation
Title | A GPU-based WFST Decoder with Exact Lattice Generation |
Authors | Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur |
Abstract | We describe initial work on an extension of the Kaldi toolkit that supports weighted finite-state transducer (WFST) decoding on Graphics Processing Units (GPUs). We implement token recombination as an atomic GPU operation in order to fully parallelize the Viterbi beam search, and propose a dynamic load balancing strategy for more efficient token passing scheduling among GPU threads. We also redesign the exact lattice generation and lattice pruning algorithms for better utilization of the GPUs. Experiments on the Switchboard corpus show that the proposed method achieves identical 1-best results and lattice quality in recognition and confidence measure tasks, while running 3 to 15 times faster than the single process Kaldi decoder. The above results are reported on different GPU architectures. Additionally we obtain a 46-fold speedup with sequence parallelism and multi-process service (MPS) in GPU. |
Tasks | |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.03243v3, http://arxiv.org/pdf/1804.03243v3.pdf |
PWC | https://paperswithcode.com/paper/a-gpu-based-wfst-decoder-with-exact-lattice |
Repo | |
Framework | |
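Token recombination in Viterbi beam search, which the abstract above describes as an atomic GPU operation, amounts to keeping only the cheapest token among those arriving at the same state. The sequential sketch below shows that logic for one frame; it is not Kaldi code, the data layout (`tokens`, `arcs`, `acoustic_cost`) is hypothetical, and it tracks only the 1-best path rather than generating a lattice.

```python
def expand_frame(tokens, arcs, acoustic_cost, beam):
    """Advance all tokens by one frame with recombination and beam pruning.

    tokens:        dict state -> (cost, previous_state)
    arcs:          dict state -> list of (next_state, label, graph_cost)
    acoustic_cost: dict label -> acoustic cost for the current frame
    beam:          tokens costing more than best + beam are dropped
    """
    new_tokens = {}
    for state, (cost, _) in tokens.items():
        for next_state, label, graph_cost in arcs.get(state, []):
            c = cost + graph_cost + acoustic_cost[label]
            # Recombination: keep only the lowest-cost token per destination state.
            # On a GPU this comparison would need an atomic min to be thread-safe.
            if next_state not in new_tokens or c < new_tokens[next_state][0]:
                new_tokens[next_state] = (c, state)
    if not new_tokens:
        return {}
    best = min(c for c, _ in new_tokens.values())
    return {s: t for s, t in new_tokens.items() if t[0] <= best + beam}

# Toy example: two tokens reach state 2; the cheaper one survives, state 3 is pruned.
tokens = {0: (0.0, None), 1: (1.0, None)}
arcs = {0: [(2, "a", 1.0), (3, "b", 5.0)], 1: [(2, "a", 0.5)]}
result = expand_frame(tokens, arcs, {"a": 0.0, "b": 0.0}, beam=2.0)
```

Exact lattice generation would additionally retain the non-winning arcs into each state, which this 1-best sketch discards.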
SMHD: A Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions
Title | SMHD: A Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions |
Authors | Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney, Nazli Goharian |
Abstract | Mental health is a significant and growing public health concern. As language usage can be leveraged to obtain crucial insights into mental health conditions, there is a need for large-scale, labeled, mental health-related datasets of users who have been diagnosed with one or more of such conditions. In this paper, we investigate the creation of high-precision patterns to identify self-reported diagnoses of nine different mental health conditions, and obtain high-quality labeled data without the need for manual labelling. We introduce the SMHD (Self-reported Mental Health Diagnoses) dataset and make it available. SMHD is a novel large dataset of social media posts from users with one or multiple mental health conditions along with matched control users. We examine distinctions in users’ language, as measured by linguistic and psychological variables. We further explore text classification methods to identify individuals with mental conditions through their language. |
Tasks | Text Classification |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05258v2, http://arxiv.org/pdf/1806.05258v2.pdf |
PWC | https://paperswithcode.com/paper/smhd-a-large-scale-resource-for-exploring |
Repo | |
Framework | |
Superpixel based Class-Semantic Texton Occurrences for Natural Roadside Vegetation Segmentation
Title | Superpixel based Class-Semantic Texton Occurrences for Natural Roadside Vegetation Segmentation |
Authors | Ligang Zhang, Brijesh Verma |
Abstract | Vegetation segmentation from roadside data is a field that has received relatively little attention in present studies, but has great potential in a wide range of real-world applications, such as road safety assessment and vegetation condition monitoring. In this paper, we present a novel approach that generates class-semantic color-texture textons and aggregates superpixel based texton occurrences for vegetation segmentation in natural roadside images. Pixel-level class-semantic textons are first learnt by generating two individual sets of bag-of-word visual dictionaries from color and filter-bank texture features separately for each object class using manually cropped training data. A testing image is first oversegmented into a set of homogeneous superpixels. The color and texture features of all pixels in each superpixel are extracted and further mapped to one of the learnt textons using the nearest distance metric, resulting in a color and a texture texton occurrence matrix. The color and texture texton occurrences are aggregated using a linear mixing method over each superpixel and the segmentation is finally achieved using a simple yet effective majority voting strategy. Evaluations on two public image datasets from videos collected by the Department of Transport and Main Roads (DTMR), Queensland, Australia, and a public roadside grass dataset show high accuracy of the proposed approach. We also demonstrate the effectiveness of the approach for vegetation segmentation in real-world scenarios. |
Tasks | |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08781v1, http://arxiv.org/pdf/1802.08781v1.pdf |
PWC | https://paperswithcode.com/paper/superpixel-based-class-semantic-texton |
Repo | |
Framework | |
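The last two steps of the abstract above, linear mixing of color and texture texton occurrences followed by majority voting within a superpixel, can be sketched as follows. This is one possible reading of the abstract, not the authors' implementation: the inputs are assumed to be, for each pixel, the class of its nearest color texton and of its nearest texture texton, and `color_weight` is a hypothetical mixing parameter.

```python
from collections import Counter

def label_superpixel(color_classes, texture_classes, color_weight=0.5):
    """Assign one class to a superpixel from per-pixel texton classes.

    color_classes / texture_classes: the class of the nearest color / texture
    texton for every pixel in the superpixel (assumed to be precomputed).
    """
    color_counts = Counter(color_classes)
    texture_counts = Counter(texture_classes)
    classes = sorted(set(color_counts) | set(texture_counts))
    # Linear mixing of the two occurrence counts for each class.
    scores = {
        c: color_weight * color_counts[c] + (1.0 - color_weight) * texture_counts[c]
        for c in classes
    }
    # Majority vote: the class with the highest mixed score wins
    # (ties resolve to the alphabetically first class).
    winner = max(classes, key=lambda c: scores[c])
    return winner, scores

winner, scores = label_superpixel(
    ["veg", "veg", "road"],   # classes from color textons
    ["road", "veg", "veg"],   # classes from texture textons
)
```

With equal weights, the example superpixel scores 2.0 for `veg` and 1.0 for `road`, so it is labelled `veg`.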
Curvilinear Structure Enhancement by Multiscale Top-Hat Tensor in 2D/3D Images
Title | Curvilinear Structure Enhancement by Multiscale Top-Hat Tensor in 2D/3D Images |
Authors | Shuaa S. Alharbi, Cigdem Sazak, Carl J. Nelson, Boguslaw Obara |
Abstract | A wide range of biomedical applications requires enhancement, detection, quantification and modelling of curvilinear structures in 2D and 3D images. Curvilinear structure enhancement is a crucial step for further analysis, but many of the enhancement approaches still suffer from contrast variations and noise. This can be addressed using a multiscale approach that produces a better quality enhancement for low contrast and noisy images compared with a single-scale approach in a wide range of biomedical images. Here, we propose the Multiscale Top-Hat Tensor (MTHT) approach, which combines multiscale morphological filtering with a local tensor representation of curvilinear structures in 2D and 3D images. The proposed approach is validated on synthetic and real data and is also compared to the state-of-the-art approaches. Our results show that the proposed approach achieves high-quality curvilinear structure enhancement in synthetic examples and in a wide range of 2D and 3D images. |
Tasks | |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08678v2, http://arxiv.org/pdf/1809.08678v2.pdf |
PWC | https://paperswithcode.com/paper/curvilinear-structure-enhancement-by |
Repo | |
Framework | |
Coupled Dictionary Learning for Multi-contrast MRI Reconstruction
Title | Coupled Dictionary Learning for Multi-contrast MRI Reconstruction |
Authors | Pingfan Song, Lior Weizman, Joao F. C. Mota, Yonina C. Eldar, Miguel R. D. Rodrigues |
Abstract | Medical imaging tasks often involve multiple contrasts, such as T1- and T2-weighted magnetic resonance imaging (MRI) data. These contrasts capture information associated with the same underlying anatomy and thus exhibit similarities. In this paper, we propose a Coupled Dictionary Learning based multi-contrast MRI reconstruction (CDLMRI) approach to leverage an available guidance contrast to restore the target contrast. Our approach consists of three stages: coupled dictionary learning, coupled sparse denoising, and $k$-space consistency enforcing. The first stage learns a group of dictionaries that capture correlations among multiple contrasts. By capitalizing on the learned adaptive dictionaries, the second stage performs joint sparse coding to denoise the corrupted target image with the aid of a guidance contrast. The third stage enforces consistency between the denoised image and the measurements in the $k$-space domain. Numerical experiments on the retrospective under-sampling of clinical MR images demonstrate that incorporating additional guidance contrast via our design improves MRI reconstruction, compared to state-of-the-art approaches. |
Tasks | Denoising, Dictionary Learning |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09930v1, http://arxiv.org/pdf/1806.09930v1.pdf |
PWC | https://paperswithcode.com/paper/coupled-dictionary-learning-for-multi |
Repo | |
Framework | |
Error Forward-Propagation: Reusing Feedforward Connections to Propagate Errors in Deep Learning
Title | Error Forward-Propagation: Reusing Feedforward Connections to Propagate Errors in Deep Learning |
Authors | Adam A. Kohan, Edward A. Rietman, Hava T. Siegelmann |
Abstract | We introduce Error Forward-Propagation, a biologically plausible mechanism to propagate error feedback forward through the network. Architectural constraints on connectivity are virtually eliminated for error feedback in the brain; systematic backward connectivity is not used or needed to deliver error feedback. Feedback as a means of assigning credit to neurons earlier in the forward pathway for their contribution to the final output is thought to be used in learning in the brain. How the brain solves the credit assignment problem is unclear. In machine learning, error backpropagation is a highly successful mechanism for credit assignment in deep multilayered networks. Backpropagation requires symmetric reciprocal connectivity for every neuron. From a biological perspective, there is no evidence of such an architectural constraint, which makes backpropagation implausible for learning in the brain. This architectural constraint is reduced with the use of random feedback weights. Models using random feedback weights require backward connectivity patterns for every neuron, but avoid symmetric weights and reciprocal connections. In this paper, we practically remove this architectural constraint, requiring only a backward loop connection for effective error feedback. We propose reusing the forward connections to deliver the error feedback by feeding the outputs into the input receiving layer. This mechanism, Error Forward-Propagation, is a plausible basis for how error feedback occurs deep in the brain independent of and yet in support of the functionality underlying intricate network architectures. We show experimentally that recurrent neural networks with two and three hidden layers can be trained using Error Forward-Propagation on the MNIST and Fashion MNIST datasets, achieving $1.90\%$ and $11\%$ generalization errors respectively. |
Tasks | |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03357v1, http://arxiv.org/pdf/1808.03357v1.pdf |
PWC | https://paperswithcode.com/paper/error-forward-propagation-reusing-feedforward |
Repo | |
Framework | |
Troubling Trends in Machine Learning Scholarship
Title | Troubling Trends in Machine Learning Scholarship |
Authors | Zachary C. Lipton, Jacob Steinhardt |
Abstract | Collectively, machine learning (ML) researchers are engaged in the creation and dissemination of knowledge about data-driven algorithms. In a given paper, researchers might aspire to any subset of the following goals, among others: to theoretically characterize what is learnable, to obtain understanding through empirically rigorous experiments, or to build a working system that has high predictive accuracy. While determining which knowledge warrants inquiry may be subjective, once the topic is fixed, papers are most valuable to the community when they act in service of the reader, creating foundational knowledge and communicating as clearly as possible. Recent progress in machine learning comes despite frequent departures from these ideals. In this paper, we focus on the following four patterns that appear to us to be trending in ML scholarship: (i) failure to distinguish between explanation and speculation; (ii) failure to identify the sources of empirical gains, e.g., emphasizing unnecessary modifications to neural architectures when gains actually stem from hyper-parameter tuning; (iii) mathiness: the use of mathematics that obfuscates or impresses rather than clarifies, e.g., by confusing technical and non-technical concepts; and (iv) misuse of language, e.g., by choosing terms of art with colloquial connotations or by overloading established technical terms. While the causes behind these patterns are uncertain, possibilities include the rapid expansion of the community, the consequent thinness of the reviewer pool, and the often-misaligned incentives between scholarship and short-term measures of success (e.g., bibliometrics, attention, and entrepreneurial opportunity). While each pattern offers a corresponding remedy (don’t do it), we also discuss some speculative suggestions for how the community might combat these trends. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03341v2, http://arxiv.org/pdf/1807.03341v2.pdf |
PWC | https://paperswithcode.com/paper/troubling-trends-in-machine-learning |
Repo | |
Framework | |
Towards an Evolvable Cancer Treatment Simulator
Title | Towards an Evolvable Cancer Treatment Simulator |
Authors | Richard J. Preen, Larry Bull, Andrew Adamatzky |
Abstract | The use of high-fidelity computational simulations promises to enable high-throughput hypothesis testing and optimisation of cancer therapies. However, increasing realism comes at the cost of increasing computational requirements. This article explores the use of surrogate-assisted evolutionary algorithms to optimise the targeted delivery of a therapeutic compound to cancerous tumour cells with the multicellular simulator, PhysiCell. The use of both Gaussian process models and multi-layer perceptron neural network surrogate models is investigated. We find that evolutionary algorithms are able to effectively explore the parameter space of biophysical properties within the agent-based simulations, minimising the resulting number of cancerous cells after a period of simulated treatment. Both model-assisted algorithms are found to outperform a standard evolutionary algorithm, demonstrating their ability to perform a more effective search within the very small evaluation budget. This represents the first use of efficient evolutionary algorithms within a high-throughput multicellular computing approach to find therapeutic design optima that maximise tumour regression. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.08252v3, https://arxiv.org/pdf/1812.08252v3.pdf |
PWC | https://paperswithcode.com/paper/towards-an-evolvable-cancer-treatment |
Repo | |
Framework | |
Computational Power and the Social Impact of Artificial Intelligence
Title | Computational Power and the Social Impact of Artificial Intelligence |
Authors | Tim Hwang |
Abstract | Machine learning is a computational process. To that end, it is inextricably tied to computational power - the tangible material of chips and semiconductors that the algorithms of machine intelligence operate on. Most obviously, computational power and computing architectures shape the speed of training and inference in machine learning, and therefore influence the rate of progress in the technology. But, these relationships are more nuanced than that: hardware shapes the methods used by researchers and engineers in the design and development of machine learning models. Characteristics such as the power consumption of chips also define where and how machine learning can be used in the real world. Despite this, many analyses of the social impact of the current wave of progress in AI have not substantively brought the dimension of hardware into their accounts. While a common trope in both the popular press and scholarly literature is to highlight the massive increase in computational power that has enabled the recent breakthroughs in machine learning, the analysis frequently goes no further than this observation around magnitude. This paper aims to dig more deeply into the relationship between computational power and the development of machine learning. Specifically, it examines how changes in computing architectures, machine learning methodologies, and supply chains might influence the future of AI. In doing so, it seeks to trace a set of specific relationships between this underlying hardware layer and the broader social impacts and risks around AI. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08971v1, http://arxiv.org/pdf/1803.08971v1.pdf |
PWC | https://paperswithcode.com/paper/computational-power-and-the-social-impact-of |
Repo | |
Framework | |