Paper Group ANR 674
Distribution-Based Categorization of Classifier Transfer Learning
Title | Distribution-Based Categorization of Classifier Transfer Learning |
Authors | Ricardo Gamelas Sousa, Luís A. Alexandre, Jorge M. Santos, Luís M. Silva, Joaquim Marques de Sá |
Abstract | Transfer Learning (TL) aims to transfer knowledge acquired in one problem, the source problem, onto another problem, the target problem, dispensing with the bottom-up construction of the target model. Due to its relevance, TL has gained significant interest in the Machine Learning community since it paves the way to devise intelligent learning models that can easily be tailored to many different applications. As is natural in a fast-evolving area, a wide variety of TL methods, settings and nomenclature have been proposed so far. However, many works report different names for the same concepts. This mixture of concepts and terminology obscures the TL field, hindering its proper consideration. In this paper we present a review of the literature on the majority of classification TL methods, and also a distribution-based categorization of TL with a common nomenclature suitable to classification problems. Under this perspective three main TL categories are presented, discussed and illustrated with examples. |
Tasks | Transfer Learning |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02159v1 |
http://arxiv.org/pdf/1712.02159v1.pdf | |
PWC | https://paperswithcode.com/paper/distribution-based-categorization-of |
Repo | |
Framework | |
Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations
Title | Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations |
Authors | Yanan Sun, Gary G. Yen, Zhang Yi |
Abstract | Deep Learning (DL) aims at learning meaningful representations. A meaningful representation is one that gives rise to a significant performance improvement in the associated Machine Learning (ML) tasks when it replaces the raw data as the input. However, optimal architecture design and model parameter estimation in DL algorithms are widely considered to be intractable. Evolutionary algorithms are well suited to complex and non-convex problems because they are gradient-free and insensitive to local optima. In this paper, we propose a computationally economical algorithm for evolving unsupervised deep neural networks to efficiently learn meaningful representations, which is very suitable in the current Big Data era, where sufficient labeled data for training is often expensive to acquire. In the proposed algorithm, finding an appropriate architecture and the initialized parameter values for an ML task at hand is modeled by one computationally efficient gene encoding approach, which is employed to effectively model the task with a large number of parameters. In addition, a local search strategy is incorporated to facilitate the exploitation search for further improving the performance. Furthermore, a small proportion of labeled data is utilized during the evolutionary search to guarantee that the learnt representations are meaningful. The performance of the proposed algorithm has been thoroughly investigated over classification tasks. Specifically, a classification error rate of 1.15% on MNIST is reached by the proposed algorithm consistently, which is a very promising result against state-of-the-art unsupervised DL algorithms. |
Tasks | |
Published | 2017-12-13 |
URL | http://arxiv.org/abs/1712.05043v2 |
http://arxiv.org/pdf/1712.05043v2.pdf | |
PWC | https://paperswithcode.com/paper/evolving-unsupervised-deep-neural-networks |
Repo | |
Framework | |
On Prediction and Tolerance Intervals for Dynamic Treatment Regimes
Title | On Prediction and Tolerance Intervals for Dynamic Treatment Regimes |
Authors | Daniel J. Lizotte, Arezoo Tahmasebi |
Abstract | We develop and evaluate tolerance interval methods for dynamic treatment regimes (DTRs) that can provide more detailed prognostic information to patients who will follow an estimated optimal regime. Although the problem of constructing confidence intervals for DTRs has been extensively studied, prediction and tolerance intervals have received little attention. We begin by reviewing in detail different interval estimation and prediction methods and then adapting them to the DTR setting. We illustrate some of the challenges associated with tolerance interval estimation stemming from the fact that we do not typically have data that were generated from the estimated optimal regime. We give an extensive empirical evaluation of the methods and discuss several practical aspects of method choice, and we present an example application using data from a clinical trial. Finally, we discuss future directions within this important emerging area of DTR research. |
Tasks | |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07453v1 |
http://arxiv.org/pdf/1704.07453v1.pdf | |
PWC | https://paperswithcode.com/paper/on-prediction-and-tolerance-intervals-for |
Repo | |
Framework | |
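The difference between a prediction interval and a tolerance interval, which the abstract above relies on, can be illustrated with a small sketch. It uses the standard distribution-free result for the interval [min, max] of n i.i.d. continuous observations; this is a textbook order-statistic construction chosen for illustration, not necessarily one of the methods evaluated in the paper, and the function names are hypothetical.

```python
def prediction_coverage(n: int) -> float:
    """Probability that one future draw lands in [min, max] of n i.i.d.
    continuous observations (a prediction-interval statement)."""
    if n < 2:
        raise ValueError("need at least two observations")
    return (n - 1) / (n + 1)


def tolerance_confidence(n: int, p: float) -> float:
    """Confidence that [min, max] of n i.i.d. continuous observations
    contains at least a fraction p of the whole population
    (a tolerance-interval statement)."""
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # The covered fraction follows a Beta(n - 1, 2) distribution;
    # this is its upper tail probability at p.
    return 1.0 - n * p ** (n - 1) + (n - 1) * p ** n


def smallest_n_for_tolerance(p: float, confidence: float) -> int:
    """Smallest sample size for which [min, max] is a (p, confidence)
    tolerance interval."""
    n = 2
    while tolerance_confidence(n, p) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(prediction_coverage(20))
    print(tolerance_confidence(20, 0.90))
    print(smallest_n_for_tolerance(0.95, 0.95))
```

A prediction interval makes a claim about a single future outcome, whereas a tolerance interval makes a claim, held with a stated confidence, about the proportion of all future outcomes it covers, so the two generally call for different sample sizes.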
De-identification In practice
Title | De-identification In practice |
Authors | Besat Kassaie |
Abstract | We report our effort to identify sensitive information, a subset of the data items listed by HIPAA (the Health Insurance Portability and Accountability Act), in medical text using recent advances in natural language processing and machine learning techniques. We represent the words with high-dimensional continuous vectors learned by a variant of Word2Vec called Continuous Bag of Words (CBOW). We feed the word vectors into a simple neural network with a Long Short-Term Memory (LSTM) architecture. Without any attempt to extract manually crafted features, and considering that our medical dataset is too small to be fed into a neural network, we obtained promising results. The results encourage us to consider a larger-scale project with precise parameter tuning and other possible improvements. |
Tasks | |
Published | 2017-01-11 |
URL | http://arxiv.org/abs/1701.03129v1 |
http://arxiv.org/pdf/1701.03129v1.pdf | |
PWC | https://paperswithcode.com/paper/de-identification-in-practice |
Repo | |
Framework | |
Pseudorehearsal in value function approximation
Title | Pseudorehearsal in value function approximation |
Authors | Vladimir Marochko, Leonard Johard, Manuel Mazzara |
Abstract | Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We have found that pseudorehearsal seems to assist learning even in such very simple problems, given proper initialization of the rehearsal parameters. |
Tasks | Q-Learning |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07075v1 |
http://arxiv.org/pdf/1703.07075v1.pdf | |
PWC | https://paperswithcode.com/paper/pseudorehearsal-in-value-function |
Repo | |
Framework | |
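The pseudorehearsal idea named in the abstract above can be sketched as follows: before learning from new data, random inputs are labelled with the current approximator's own outputs, and these pseudo-items are replayed alongside the new samples so that earlier behaviour is not overwritten. The sketch below is a minimal illustration with a linear approximator and plain supervised targets; it is not the authors' Q-learning setup, and all function names and parameters are assumptions made for the example.

```python
import random


def predict(weights, x):
    """Linear function approximator: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))


def sgd_step(weights, x, target, lr):
    """One gradient step on the squared error for a single sample."""
    error = target - predict(weights, x)
    return [w + lr * error * xi for w, xi in zip(weights, x)]


def make_pseudo_items(weights, n_items, dim, rng):
    """Draw random inputs and label them with the CURRENT model's outputs."""
    items = []
    for _ in range(n_items):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        items.append((x, predict(weights, x)))
    return items


def train_with_pseudorehearsal(weights, new_data, n_pseudo, lr, rng):
    """Interleave each new sample with one replayed pseudo-item."""
    pseudo = make_pseudo_items(weights, n_pseudo, len(weights), rng)
    for x, target in new_data:
        weights = sgd_step(weights, x, target, lr)
        px, ptarget = rng.choice(pseudo)
        weights = sgd_step(weights, px, ptarget, lr)
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    old_weights = [0.5, -0.25]
    new_data = [([1.0, 0.0], 2.0)] * 20
    print(train_with_pseudorehearsal(old_weights, new_data, 10, 0.1, rng))
```

The number of pseudo-items and the range they are sampled from are the kind of rehearsal parameters whose initialization the abstract reports as important.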
Alternating Iteratively Reweighted Minimization Algorithms for Low-Rank Matrix Factorization
Title | Alternating Iteratively Reweighted Minimization Algorithms for Low-Rank Matrix Factorization |
Authors | Paris V. Giampouras, Athanasios A. Rontogiannis, Konstantinos D. Koutroumbas |
Abstract | Nowadays, the availability of large-scale data in disparate application domains urges the deployment of sophisticated tools for extracting valuable knowledge out of this huge bulk of information. In that vein, low-rank representations (LRRs), which seek low-dimensional embeddings of data, have naturally appeared. In an effort to reduce computational complexity and improve estimation performance, LRR has been viewed via a matrix factorization (MF) perspective. Recently, low-rank MF (LRMF) approaches have been proposed for tackling the inherent weakness of MF, i.e., the unawareness of the dimension of the low-dimensional space where data reside. Herein, inspired by the merits of iterative reweighted schemes for rank minimization, we come up with a generic low-rank promoting regularization function. Then, focusing on a specific instance of it, we propose a regularizer that imposes column-sparsity jointly on the two matrix factors that result from MF, thus promoting low-rankness on the optimization problem. The problems of denoising, matrix completion and non-negative matrix factorization (NMF) are redefined according to the new LRMF formulation and solved via efficient Newton-type algorithms with proven theoretical guarantees as to their convergence and rates of convergence to stationary points. The effectiveness of the proposed algorithms is verified in diverse simulated and real data experiments. |
Tasks | Denoising, Matrix Completion |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02004v1 |
http://arxiv.org/pdf/1710.02004v1.pdf | |
PWC | https://paperswithcode.com/paper/alternating-iteratively-reweighted |
Repo | |
Framework | |
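To make the phrase "column-sparsity jointly on the two matrix factors" in the abstract above concrete, the sketch below evaluates a group-sparsity penalty over the stacked columns of two factors U and V. The specific form, the sum over columns of the Euclidean norm of the stacked column, is an assumption chosen for illustration; the abstract does not give the exact regularizer, and the function names are hypothetical. When a column is zero in both factors, the corresponding rank-one term drops out of the product, which is how such a penalty promotes low rank.

```python
import math


def joint_column_norms(U, V):
    """Euclidean norm of each stacked column [u_i; v_i].

    U and V are lists of rows with the same number of columns.
    """
    n_cols = len(U[0])
    if len(V[0]) != n_cols:
        raise ValueError("U and V must have the same number of columns")
    norms = []
    for i in range(n_cols):
        squared = sum(row[i] ** 2 for row in U) + sum(row[i] ** 2 for row in V)
        norms.append(math.sqrt(squared))
    return norms


def joint_column_sparsity(U, V):
    """Illustrative penalty: sum of the stacked-column norms."""
    return sum(joint_column_norms(U, V))


def active_columns(U, V, tol=1e-8):
    """Number of columns not jointly zeroed; an upper bound on rank(U V^T)."""
    return sum(1 for norm in joint_column_norms(U, V) if norm > tol)


if __name__ == "__main__":
    U = [[3.0, 0.0], [0.0, 0.0]]
    V = [[4.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    print(joint_column_sparsity(U, V), active_columns(U, V))
```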
Testing Symmetric Markov Chains from a Single Trajectory
Title | Testing Symmetric Markov Chains from a Single Trajectory |
Authors | Constantinos Daskalakis, Nishanth Dikkala, Nick Gravin |
Abstract | Classical distribution testing assumes access to i.i.d. samples from the distribution that is being tested. We initiate the study of Markov chain testing, assuming access to a single trajectory of a Markov Chain. In particular, we observe a single trajectory X_0, …, X_t, … of an unknown, symmetric, and finite state Markov Chain M. We do not control the starting state X_0, and we cannot restart the chain. Given our single trajectory, the goal is to test whether M is identical to a model Markov Chain M_0, or far from it under an appropriate notion of difference. We propose a measure of difference between two Markov chains, motivated by the early work of Kazakos [Kaz78], which captures the scaling behavior of the total variation distance between trajectories sampled from the Markov chains as the length of these trajectories grows. We provide efficient testers and information-theoretic lower bounds for testing identity of symmetric Markov chains under our proposed measure of difference, which are tight up to logarithmic factors if the hitting times of the model chain M_0 are O(n) in the size of the state space n. |
Tasks | |
Published | 2017-04-22 |
URL | http://arxiv.org/abs/1704.06850v2 |
http://arxiv.org/pdf/1704.06850v2.pdf | |
PWC | https://paperswithcode.com/paper/testing-symmetric-markov-chains-from-a-single |
Repo | |
Framework | |
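The data-access model described in the abstract above, a single trajectory with no restarts, can be illustrated with a short sketch that tabulates empirical transition frequencies from one observed path and measures how far the estimate is from symmetric. This is only a naive plug-in estimate for illustration; it is not the tester or the difference measure proposed in the paper, and the function names are assumptions.

```python
def empirical_transitions(trajectory, n_states):
    """Row-normalised transition counts from ONE trajectory of state indices.

    Rows for states that are never left in the trajectory stay all zero,
    since nothing can be estimated for them.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(trajectory, trajectory[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            matrix.append([0.0] * n_states)
    return matrix


def max_asymmetry(matrix):
    """Largest |P[i][j] - P[j][i]|; zero for a symmetric transition matrix."""
    n = len(matrix)
    return max(abs(matrix[i][j] - matrix[j][i]) for i in range(n) for j in range(n))


if __name__ == "__main__":
    path = [0, 1, 0, 1, 1, 0]
    estimate = empirical_transitions(path, 2)
    print(estimate, max_asymmetry(estimate))
```

Because consecutive states in the path are dependent, the usual i.i.d. sample-complexity arguments do not apply directly to such counts, which is the difficulty the abstract points to.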
Visualizing Residual Networks
Title | Visualizing Residual Networks |
Authors | Brian Chu, Daylen Yang, Ravi Tadinada |
Abstract | Residual networks are the current state of the art on ImageNet. Similar work in the direction of utilizing shortcut connections has been done extremely recently with derivatives of residual networks and with highway networks. This work potentially challenges our understanding that CNNs learn layers of local features that are followed by increasingly global features. Through qualitative visualization and empirical analysis, we explore the purpose that residual skip connections serve. Our assessments show that the residual shortcut connections force layers to refine features, as expected. We also provide alternate visualizations that confirm that residual networks learn what is already intuitively known about CNNs in general. |
Tasks | |
Published | 2017-01-09 |
URL | http://arxiv.org/abs/1701.02362v1 |
http://arxiv.org/pdf/1701.02362v1.pdf | |
PWC | https://paperswithcode.com/paper/visualizing-residual-networks |
Repo | |
Framework | |
Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study
Title | Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study |
Authors | John P. Lalor, Hao Wu, Tsendsuren Munkhdalai, Hong Yu |
Abstract | Interpreting the performance of deep learning models beyond test set accuracy is challenging. Characteristics of individual data points are often not considered during evaluation, and each data point is treated equally. We examine the impact of a test set question’s difficulty to determine if there is a relationship between difficulty and performance. We model difficulty using well-studied psychometric methods on human response patterns. Experiments on Natural Language Inference (NLI) and Sentiment Analysis (SA) show that the likelihood of answering a question correctly is impacted by the question’s difficulty. As DNNs are trained with more data, easy examples are learned more quickly than hard examples. |
Tasks | Natural Language Inference, Sentiment Analysis |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04811v3 |
http://arxiv.org/pdf/1702.04811v3.pdf | |
PWC | https://paperswithcode.com/paper/understanding-deep-learning-performance |
Repo | |
Framework | |
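The relationship between item difficulty and the chance of a correct answer, which the abstract above examines, can be sketched with a one-parameter logistic (Rasch) item response model. The abstract only says "well-studied psychometric methods", so the choice of this particular model, and the names `ability` and `difficulty`, are assumptions made for illustration.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    """Rasch model: probability that a respondent with the given ability
    answers an item of the given difficulty correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_likelihood(ability, difficulties, responses):
    """Log-likelihood of a 0/1 response pattern over several items."""
    total = 0.0
    for difficulty, response in zip(difficulties, responses):
        p = p_correct(ability, difficulty)
        total += math.log(p) if response else math.log(1.0 - p)
    return total


if __name__ == "__main__":
    for difficulty in (-1.0, 0.0, 1.0, 2.0):
        print(difficulty, round(p_correct(0.5, difficulty), 3))
```

Under this model, a respondent (human or model) of fixed ability is less likely to answer correctly as difficulty rises, and the probability is exactly one half when ability equals difficulty.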
Evolution-Preserving Dense Trajectory Descriptors
Title | Evolution-Preserving Dense Trajectory Descriptors |
Authors | Yang Wang, Vinh Tran, Minh Hoai |
Abstract | Recently, Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets. This paper improves their performance by applying rank pooling to each trajectory, encoding the temporal evolution of deep learning features computed along the trajectory. This leads to Evolution-Preserving Trajectory (EPT) descriptors, a novel type of video descriptor that significantly outperforms Trajectory-pooled Deep-learning Descriptors. EPT descriptors are defined based on dense trajectories, and they provide complementary benefits to video descriptors that are not based on trajectories. In particular, we show that the combination of EPT descriptors and VideoDarwin leads to state-of-the-art performance on Hollywood2 and UCF101 datasets. |
Tasks | Temporal Action Localization |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04037v1 |
http://arxiv.org/pdf/1702.04037v1.pdf | |
PWC | https://paperswithcode.com/paper/evolution-preserving-dense-trajectory |
Repo | |
Framework | |
CortexNet: a Generic Network Family for Robust Visual Temporal Representations
Title | CortexNet: a Generic Network Family for Robust Visual Temporal Representations |
Authors | Alfredo Canziani, Eugenio Culurciello |
Abstract | In the past five years we have observed the rise of incredibly well-performing feed-forward neural networks trained in a supervised manner for vision-related tasks. These models have achieved super-human performance on object recognition, localisation, and detection in still images. However, there is a need to identify the best strategy to employ these networks with temporal visual inputs and obtain a robust and stable representation of video data. Inspired by the human visual system, we propose a deep neural network family, CortexNet, which features not only bottom-up feed-forward connections but also models the abundant top-down feedback and lateral connections that are present in our visual cortex. We introduce two training schemes - the unsupervised MatchNet and weakly supervised TempoNet modes - where a network learns how to correctly anticipate a subsequent frame in a video clip or the identity of its predominant subject, by learning egomotion clues and how to automatically track several objects in the current scene. Find the project website at https://engineering.purdue.edu/elab/CortexNet/. |
Tasks | Object Recognition |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02735v2 |
http://arxiv.org/pdf/1706.02735v2.pdf | |
PWC | https://paperswithcode.com/paper/cortexnet-a-generic-network-family-for-robust |
Repo | |
Framework | |
Efficient Deformable Shape Correspondence via Kernel Matching
Title | Efficient Deformable Shape Correspondence via Kernel Matching |
Authors | Zorah Lähner, Matthias Vestner, Amit Boyarski, Or Litany, Ron Slossberg, Tal Remez, Emanuele Rodolà, Alex Bronstein, Michael Bronstein, Ron Kimmel, Daniel Cremers |
Abstract | We present a method to match three-dimensional shapes under non-isometric deformations, topology changes and partiality. We formulate the problem as matching between a set of pair-wise and point-wise descriptors, imposing a continuity prior on the mapping, and propose a projected descent optimization procedure inspired by difference of convex functions (DC) programming. Surprisingly, in spite of the highly non-convex nature of the resulting quadratic assignment problem, our method converges to a semantically meaningful and continuous mapping in most of our experiments, and scales well. We provide a preliminary theoretical analysis and several interpretations of the method. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08991v3 |
http://arxiv.org/pdf/1707.08991v3.pdf | |
PWC | https://paperswithcode.com/paper/efficient-deformable-shape-correspondence-via |
Repo | |
Framework | |
Music Transcription by Deep Learning with Data and “Artificial Semantic” Augmentation
Title | Music Transcription by Deep Learning with Data and “Artificial Semantic” Augmentation |
Authors | Vladyslav Sarnatskyi, Vadym Ovcharenko, Mariia Tkachenko, Sergii Stirenko, Yuri Gordienko, Anis Rojbi |
Abstract | In this progress paper, previous results on single-note recognition by deep learning are presented. Several methods of data augmentation and “artificial semantic” augmentation are proposed to enhance the efficiency of deep learning approaches for monophonic and polyphonic note recognition by increasing the dimensions of the training data and applying lossless and lossy transformations to them. |
Tasks | Data Augmentation |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.03228v1 |
http://arxiv.org/pdf/1712.03228v1.pdf | |
PWC | https://paperswithcode.com/paper/music-transcription-by-deep-learning-with |
Repo | |
Framework | |
TV-GAN: Generative Adversarial Network Based Thermal to Visible Face Recognition
Title | TV-GAN: Generative Adversarial Network Based Thermal to Visible Face Recognition |
Authors | Teng Zhang, Arnold Wiliem, Siqi Yang, Brian C. Lovell |
Abstract | This work tackles the face recognition task on images captured using thermal camera sensors, which can operate in a non-light environment. While it can greatly increase the scope and benefits of current security surveillance systems, performing such a task using thermal images is a challenging problem compared to the face recognition task in the Visible Light Domain (VLD). This is partly due to the much smaller amount of thermal imagery data collected compared to the VLD data. Unfortunately, direct application of the existing very strong face recognition models trained using VLD data to the thermal imagery data will not produce satisfactory performance. This is due to the existence of the domain gap between the thermal and VLD images. To this end, we propose a Thermal-to-Visible Generative Adversarial Network (TV-GAN) that is able to transform thermal face images into their corresponding VLD images whilst maintaining identity information that is sufficient for the existing VLD face recognition models to perform recognition. Some examples are presented in Figure 1. Unlike previous methods, our proposed TV-GAN uses an explicit closed-set face recognition loss to regularize the discriminator network training. This information is then conveyed to the generator network in the form of gradient loss. In the experiment, we show that by using this additional explicit regularization for the discriminator network, the TV-GAN is able to preserve more identity information when translating a thermal image of a person who has not been seen before by the TV-GAN. |
Tasks | Face Recognition |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02514v1 |
http://arxiv.org/pdf/1712.02514v1.pdf | |
PWC | https://paperswithcode.com/paper/tv-gan-generative-adversarial-network-based |
Repo | |
Framework | |
Universally consistent predictive distributions
Title | Universally consistent predictive distributions |
Authors | Vladimir Vovk |
Abstract | This paper describes simple universally consistent procedures of probability forecasting that satisfy a natural property of small-sample validity, under the assumption that the observations are produced independently in the IID fashion. |
Tasks | |
Published | 2017-08-06 |
URL | http://arxiv.org/abs/1708.01902v2 |
http://arxiv.org/pdf/1708.01902v2.pdf | |
PWC | https://paperswithcode.com/paper/universally-consistent-predictive |
Repo | |
Framework | |