July 29, 2019

Paper Group ANR 58

Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior. Solving the Resource Constrained Project Scheduling Problem Using the Parallel Tabu Search Designed for the CUDA Platform. Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods …

Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior


Title	Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior
Authors	Pushparaja Murugan
Abstract	Convolutional Neural Networks (ConvNets) have been used extensively in many complex machine learning tasks. However, hyperparameter optimization is a crucial step in developing ConvNet architectures, since accuracy and performance depend on the hyperparameters. This multilayered architecture is parameterized by a set of hyperparameters such as the number of convolutional layers, the number of fully connected dense layers and neurons, the dropout probability, and the learning rate. Searching the hyperparameter space for such a complex hierarchical architecture is therefore highly difficult. Many methods have been proposed over the past decade to explore the hyperparameter space and find the optimum set of hyperparameter values. Grid search and random search are reportedly inefficient and extremely expensive because of the large number of hyperparameters in the architecture. Sequential model-based Bayesian optimization is a promising alternative technique for finding the extremum of an unknown cost function. A recent study on Bayesian optimization by Snoek, covering nine convolutional network parameters, achieved the lowest reported error on the CIFAR-10 benchmark. This article is intended to provide an overview of the mathematical concepts behind Bayesian optimization with a Gaussian process prior.
Tasks
Published	2017-12-19
URL	http://arxiv.org/abs/1712.07233v1
PDF	http://arxiv.org/pdf/1712.07233v1.pdf
PWC	https://paperswithcode.com/paper/hyperparameters-optimization-in-deep
Repo
Framework
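
One way to make the sequential model-based idea concrete is an acquisition function that scores candidate hyperparameter settings from the Gaussian process posterior. The sketch below uses expected improvement, a common choice that the abstract itself does not name, so treat it as an assumption. The candidate settings, posterior means and standard deviations, and the best error so far are hypothetical numbers for illustration only.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement (minimization) at a point whose surrogate
    posterior has mean `mu` and standard deviation `sigma`."""
    improvement = best_so_far - mu
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf


# Hypothetical posterior (mean validation error, std) for three settings.
candidates = {
    "lr=0.1": (0.30, 0.01),
    "lr=0.01": (0.26, 0.05),
    "lr=0.001": (0.28, 0.10),
}
best_error = 0.25  # hypothetical best validation error observed so far

scores = {
    name: expected_improvement(mu, sigma, best_error)
    for name, (mu, sigma) in candidates.items()
}
next_trial = max(scores, key=scores.get)
print(next_trial, scores[next_trial])
```

In this toy setup the most uncertain candidate is selected even though its posterior mean is not the lowest, which illustrates the exploration and exploitation trade-off that the acquisition function encodes.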

Solving the Resource Constrained Project Scheduling Problem Using the Parallel Tabu Search Designed for the CUDA Platform


Title	Solving the Resource Constrained Project Scheduling Problem Using the Parallel Tabu Search Designed for the CUDA Platform
Authors	Libor Bukata, Premysl Sucha, Zdenek Hanzalek
Abstract	This paper proposes a parallel Tabu Search algorithm for the Resource Constrained Project Scheduling Problem. To deal with this NP-hard combinatorial problem, many optimizations have been performed. For example, a resource evaluation algorithm is selected by a heuristic, and an effective Tabu List has been designed. In addition, a capacity-indexed resource evaluation algorithm is proposed, and the GPU (Graphics Processing Unit) version uses a homogeneous model to reduce the required communication bandwidth. According to the experiments, the GPU version outperforms the optimized parallel CPU version with respect to both computational time and solution quality. In comparison with other existing heuristics, the proposed solution often gives better-quality solutions.
Tasks
Published	2017-11-13
URL	http://arxiv.org/abs/1711.04556v1
PDF	http://arxiv.org/pdf/1711.04556v1.pdf
PWC	https://paperswithcode.com/paper/solving-the-resource-constrained-project
Repo
Framework
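
For readers unfamiliar with Tabu Search, the following is a minimal sequential sketch of the general technique, not the paper's parallel CUDA algorithm and not an RCPSP solver. As a stand-in problem it orders jobs on a single machine to minimize total weighted completion time. The function names, the swap neighborhood, the tenure, and the job data are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def total_weighted_completion(order, durations, weights):
    """Cost of running the jobs back to back in the given order."""
    elapsed = 0
    cost = 0
    for job in order:
        elapsed += durations[job]
        cost += weights[job] * elapsed
    return cost


def tabu_search(durations, weights, iterations=50, tenure=3):
    current = list(range(len(durations)))
    best = current[:]
    best_cost = total_weighted_completion(current, durations, weights)
    tabu = deque(maxlen=tenure)  # recently swapped job pairs
    for _ in range(iterations):
        candidates = []
        for i, j in combinations(range(len(current)), 2):
            move = frozenset((current[i], current[j]))
            neighbor = current[:]
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            cost = total_weighted_completion(neighbor, durations, weights)
            # Aspiration: a tabu move is allowed if it beats the best cost.
            if move not in tabu or cost < best_cost:
                candidates.append((cost, neighbor, move))
        if not candidates:
            break
        # Take the best admissible neighbor, even if it is worse than current.
        cost, current, move = min(candidates, key=lambda c: c[0])
        tabu.append(move)
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost


# Hypothetical job data.
durations = [3, 1, 4, 2, 5]
weights = [1, 4, 2, 3, 1]
order, cost = tabu_search(durations, weights)
print(order, cost)
```

The tabu list stores recently swapped job pairs so the search does not immediately undo a move, which is what lets it leave a local optimum instead of cycling.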

Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories


Title	Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories
Authors	Alexandre Yahi, Rami Vanguri, Noémie Elhadad, Nicholas P. Tatonetti
Abstract	Generative Adversarial Networks (GANs) represent a promising class of generative networks that combine neural networks with game theory. From generating realistic images and videos to assisting musical creation, GANs are transforming many fields of arts and sciences. However, their application to healthcare has not been fully realized, more specifically in generating electronic health records (EHR) data. In this paper, we propose a framework for exploring the value of GANs in the context of continuous laboratory time series data. We devise an unsupervised evaluation method that measures the predictive power of synthetic laboratory test time series. Further, we show that when it comes to predicting the impact of drug exposure on laboratory test data, incorporating representation learning of the training cohorts prior to training GAN models is beneficial.
Tasks	Representation Learning, Time Series
Published	2017-12-01
URL	http://arxiv.org/abs/1712.00164v1
PDF	http://arxiv.org/pdf/1712.00164v1.pdf
PWC	https://paperswithcode.com/paper/generative-adversarial-networks-for-3
Repo
Framework

Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification


Title	Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification
Authors	Yongyu Wang, Zhuo Feng
Abstract	The eigendecomposition of nearest-neighbor (NN) graph Laplacian matrices is the main computational bottleneck in spectral clustering. In this work, we introduce a highly scalable, spectrum-preserving graph sparsification algorithm that enables building ultra-sparse NN (u-NN) graphs with guaranteed preservation of the original graph spectra, such as the first few eigenvectors of the original graph Laplacian. Our approach can immediately lead to scalable spectral clustering of large data networks without sacrificing solution quality. The proposed method starts by constructing low-stretch spanning trees (LSSTs) from the original graphs, and then iteratively recovers small portions of “spectrally critical” off-tree edges to the LSSTs by leveraging a spectral off-tree embedding scheme. To determine a suitable number of off-tree edges to recover to the LSSTs, an eigenvalue stability checking scheme is proposed, which enables robust preservation of the first few Laplacian eigenvectors within the sparsified graph. Additionally, an incremental graph densification scheme is proposed for identifying extra edges that are missing from the original NN graphs but can still play important roles in spectral clustering tasks. Our experimental results on a variety of well-known data sets show that the proposed method can dramatically reduce the complexity of NN graphs, leading to significant speedups in spectral clustering.
Tasks
Published	2017-10-12
URL	http://arxiv.org/abs/1710.04584v4
PDF	http://arxiv.org/pdf/1710.04584v4.pdf
PWC	https://paperswithcode.com/paper/towards-scalable-spectral-clustering-via
Repo
Framework

Intrinsic Grassmann Averages for Online Linear, Robust and Nonlinear Subspace Learning


Title	Intrinsic Grassmann Averages for Online Linear, Robust and Nonlinear Subspace Learning
Authors	Rudrasis Chakraborty, Søren Hauberg, Baba C. Vemuri
Abstract	Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA) are fundamental methods in machine learning for dimensionality reduction. The former finds a low-dimensional approximation in finite dimensions, while the latter often operates in an infinite-dimensional Reproducing Kernel Hilbert Space (RKHS). In this paper, we present a geometric framework for computing the principal linear subspaces in both situations, as well as for the robust PCA case, that amounts to computing the intrinsic average on the space of all subspaces: the Grassmann manifold. Points on this manifold are defined as the subspaces spanned by $K$-tuples of observations. The intrinsic Grassmann average of these subspaces is shown to coincide with the principal components of the observations when they are drawn from a Gaussian distribution. We show similar results in the RKHS case and provide an efficient algorithm for computing the projection onto this average subspace. The result is a method akin to KPCA which is substantially faster. Further, we present a novel online version of KPCA using our geometric framework. Competitive performance of all our algorithms is demonstrated on a variety of real and synthetic data sets.
Tasks	Dimensionality Reduction
Published	2017-02-03
URL	http://arxiv.org/abs/1702.01005v2
PDF	http://arxiv.org/pdf/1702.01005v2.pdf
PWC	https://paperswithcode.com/paper/intrinsic-grassmann-averages-for-online
Repo
Framework

Generalized notions of sparsity and restricted isometry property. Part II: Applications


Title	Generalized notions of sparsity and restricted isometry property. Part II: Applications
Authors	Marius Junge, Kiryung Lee
Abstract	The restricted isometry property (RIP) is a universal tool for data recovery. We explore the implications of the RIP in the framework of generalized sparsity and group measurements introduced in the Part I paper. First, it turns out that for a given measurement instrument, the number of measurements required for the RIP can be improved by optimizing over families of Banach spaces. Second, we investigate the preservation of the difference of two sparse vectors, which is not trivial in generalized models. Third, we extend the RIP of partial Fourier measurements at optimal scaling of the number of measurements with random sign to far more general group structured measurements. Lastly, we also obtain the RIP in infinite dimension in the context of Fourier measurement concepts, with sparsity naturally replaced by smoothness assumptions.
Tasks
Published	2017-06-28
URL	http://arxiv.org/abs/1706.09411v2
PDF	http://arxiv.org/pdf/1706.09411v2.pdf
PWC	https://paperswithcode.com/paper/generalized-notions-of-sparsity-and
Repo
Framework

A giant with feet of clay: on the validity of the data that feed machine learning in medicine


Title	A giant with feet of clay: on the validity of the data that feed machine learning in medicine
Authors	Federico Cabitza, Davide Ciucci, Raffaele Rasoini
Abstract	This paper considers the use of Machine Learning (ML) in medicine by focusing on the main problem that this computational approach has been aimed at solving or at least minimizing: uncertainty. To this end, we point out that uncertainty is so ingrained in medicine that it also biases the representation of clinical phenomena, that is, the very input of ML models, thus undermining the clinical significance of their output. Recognizing this can motivate both medical doctors, in taking more responsibility for the development and use of these decision aids, and researchers, in pursuing different ways to assess the value of these systems. In so doing, both designers and users could take this intrinsic characteristic of medicine more seriously and consider alternative approaches that do not “sweep uncertainty under the rug” within an objectivist fiction that everyone can end up believing to be true.
Tasks
Published	2017-06-21
URL	http://arxiv.org/abs/1706.06838v3
PDF	http://arxiv.org/pdf/1706.06838v3.pdf
PWC	https://paperswithcode.com/paper/a-giant-with-feet-of-clay-on-the-validity-of
Repo
Framework

Hierarchical Deep Recurrent Architecture for Video Understanding


Title	Hierarchical Deep Recurrent Architecture for Video Understanding
Authors	Luming Tang, Boyang Deng, Haiyu Zhao, Shuai Yi
Abstract	This paper introduces the system we developed for the YouTube-8M Video Understanding Challenge, in which a large-scale benchmark dataset was used for multi-label video classification. The proposed framework contains a hierarchical deep architecture, including a frame-level sequence modeling part and a video-level classification part. In the frame-level sequence modeling part, we explore a set of methods including Pooling-LSTM (PLSTM), Hierarchical-LSTM (HLSTM), and Random-LSTM (RLSTM) in order to address the problem of the large number of frames in a video. We also introduce two attention pooling methods, single attention pooling (ATT) and multiple attention pooling (Multi-ATT), so that we can pay more attention to the informative frames in a video and ignore the useless frames. In the video-level classification part, two methods are proposed to increase the classification performance, i.e., Hierarchical-Mixture-of-Experts (HMoE) and Classifier Chains (CC). Our final submission is an ensemble consisting of 18 sub-models. In terms of the official evaluation metric, Global Average Precision (GAP) at 20, our best submission achieves 0.84346 on the public 50% of the test data and 0.84333 on the private 50% of the test data.
Tasks	Video Classification, Video Understanding
Published	2017-07-11
URL	http://arxiv.org/abs/1707.03296v1
PDF	http://arxiv.org/pdf/1707.03296v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-deep-recurrent-architecture-for
Repo
Framework
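
The GAP at 20 metric quoted above can be sketched as follows. This is a hedged reading of the challenge metric, not the official evaluation code: each video contributes its 20 highest-scoring labels, all contributions are pooled and sorted by confidence, and average precision is computed over the pooled list. The function name, the input layout, and the choice to normalize by the total number of ground-truth labels are assumptions.

```python
def gap_at_k(predictions, labels, k=20):
    """Global Average Precision at k.

    predictions: one dict per video mapping label -> confidence score.
    labels: one set per video holding that video's ground-truth labels.
    """
    pooled = []
    total_positives = 0
    for scores, truth in zip(predictions, labels):
        total_positives += len(truth)
        # Keep only the k most confident labels for this video.
        top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
        pooled.extend((score, label in truth) for label, score in top)
    if total_positives == 0:
        return 0.0
    # Rank all pooled predictions by confidence, highest first.
    pooled.sort(key=lambda item: item[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_hit) in enumerate(pooled, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Each hit raises recall by 1 / total_positives.
    return precision_sum / total_positives


# Two hypothetical videos with two candidate labels each.
preds = [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.7}]
truth = [{"a"}, {"b"}]
print(gap_at_k(preds, truth))
```

Because predictions from all videos share one ranking, a confident mistake on one video lowers the precision credited to correct predictions on every other video.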

Land Cover Classification via Multi-temporal Spatial Data by Recurrent Neural Networks


Title	Land Cover Classification via Multi-temporal Spatial Data by Recurrent Neural Networks
Authors	Dino Ienco, Raffaele Gaetano, Claire Dupaquier, Pierre Maurel
Abstract	Modern earth observation programs produce huge volumes of satellite image time series (SITS) that can be useful for monitoring geographical areas through time. How to efficiently analyze this kind of information is still an open question in the remote sensing field. Recently, deep learning methods have proved suitable for remote sensing data, mainly for scene classification (i.e., Convolutional Neural Networks - CNNs - on single images), while only very few studies involve temporal deep learning approaches (i.e., Recurrent Neural Networks - RNNs) for remote sensing time series. In this letter we evaluate the ability of Recurrent Neural Networks, in particular the Long Short-Term Memory (LSTM) model, to perform land cover classification considering multi-temporal spatial data derived from a time series of satellite images. We carried out experiments on two different datasets considering both pixel-based and object-based classification. The results show that Recurrent Neural Networks are competitive with state-of-the-art classifiers, and may outperform classical approaches in the presence of poorly represented and/or highly mixed classes. We also show that using the alternative feature representation generated by the LSTM can improve the performance of standard classifiers.
Tasks	Scene Classification, Time Series
Published	2017-04-13
URL	http://arxiv.org/abs/1704.04055v1
PDF	http://arxiv.org/pdf/1704.04055v1.pdf
PWC	https://paperswithcode.com/paper/land-cover-classification-via-multi-temporal
Repo
Framework

Aggregating Frame-level Features for Large-Scale Video Classification


Title	Aggregating Frame-level Features for Large-Scale Video Classification
Authors	Shaoxiang Chen, Xi Wang, Yongyi Tang, Xinpeng Chen, Zuxuan Wu, Yu-Gang Jiang
Abstract	This paper introduces the system we developed for the Google Cloud & YouTube-8M Video Understanding Challenge, which can be considered as a multi-label classification problem defined on top of the large scale YouTube-8M Dataset. We employ a large set of techniques to aggregate the provided frame-level feature representations and generate video-level predictions, including several variants of recurrent neural networks (RNN) and generalized VLAD. We also adopt several fusion strategies to explore the complementarity among the models. In terms of the official metric GAP@20 (global average precision at 20), our best fusion model attains 0.84198 on the public 50% of test data and 0.84193 on the private 50% of test data, ranking 4th out of 650 teams worldwide in the competition.
Tasks	Multi-Label Classification, Video Classification, Video Understanding
Published	2017-07-04
URL	http://arxiv.org/abs/1707.00803v1
PDF	http://arxiv.org/pdf/1707.00803v1.pdf
PWC	https://paperswithcode.com/paper/aggregating-frame-level-features-for-large
Repo
Framework

Linear Time Clustering for High Dimensional Mixtures of Gaussian Clouds


Title	Linear Time Clustering for High Dimensional Mixtures of Gaussian Clouds
Authors	Dan Kushnir, Shirin Jalali, Iraj Saniee
Abstract	Clustering mixtures of Gaussian distributions is a fundamental and challenging problem that is ubiquitous in various high-dimensional data processing tasks. While state-of-the-art work on learning Gaussian mixture models has focused primarily on improving separation bounds and their generalization to arbitrary classes of mixture models, less emphasis has been placed on the practical computational efficiency of the proposed solutions. In this paper, we propose a novel and highly efficient clustering algorithm for $n$ points drawn from a mixture of two arbitrary Gaussian distributions in $\mathbb{R}^p$. The algorithm involves performing random 1-dimensional projections until a direction is found that yields a user-specified clustering error $e$. For a 1-dimensional separation parameter $\gamma$ satisfying $\gamma=Q^{-1}(e)$, the expected number of such projections is shown to be bounded by $o(\ln p)$, when $\gamma$ satisfies $\gamma\leq c\sqrt{\ln{\ln{p}}}$, with $c$ as the separability parameter of the two Gaussians in $\mathbb{R}^p$. Consequently, the expected overall running time of the algorithm is linear in $n$ and quasi-linear in $p$ at $o(\ln{p})O(np)$, and the sample complexity is independent of $p$. This result stands in contrast to prior works which provide polynomial, with at-best quadratic, running time in $p$ and $n$. We show that our bound on the expected number of 1-dimensional projections extends to the case of three or more Gaussian components, and we present a generalization of our results to mixture distributions beyond the Gaussian model.
Tasks
Published	2017-12-19
URL	http://arxiv.org/abs/1712.07242v3
PDF	http://arxiv.org/pdf/1712.07242v3.pdf
PWC	https://paperswithcode.com/paper/linear-time-clustering-for-high-dimensional
Repo
Framework
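
The random-projection idea in the abstract can be illustrated with a simplified sketch. It is not the paper's algorithm: instead of stopping when a direction meets a user-specified error $e$, it tries a fixed number of random directions and keeps the one whose projected points show the widest gap, then splits at that gap. The sort also makes it slightly superlinear in $n$. All function names, the trial count, and the synthetic data are assumptions.

```python
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def largest_gap_split(values):
    """Return (gap, threshold) for the widest gap between sorted 1-D values."""
    s = sorted(values)
    gap, i = max((s[i + 1] - s[i], i) for i in range(len(s) - 1))
    return gap, 0.5 * (s[i] + s[i + 1])


def cluster_by_random_projections(points, trials=50, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])
    best = None
    for _ in range(trials):
        u = random_unit_vector(dim, rng)
        # Project every point onto the random direction: O(n * p) work.
        proj = [sum(a * b for a, b in zip(p, u)) for p in points]
        gap, threshold = largest_gap_split(proj)
        if best is None or gap > best[0]:
            best = (gap, [int(x > threshold) for x in proj])
    return best[1]


# Two synthetic, well-separated Gaussian clouds in R^5.
data_rng = random.Random(1)
cloud_a = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(40)]
cloud_b = [[data_rng.gauss(20.0, 1.0) for _ in range(5)] for _ in range(40)]
cluster_labels = cluster_by_random_projections(cloud_a + cloud_b)
print(cluster_labels)
```

Each trial costs one pass over the data, so the overall cost grows with the number of directions tried, which is the quantity the paper bounds in expectation.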

Sparse Diffusion-Convolutional Neural Networks


Title	Sparse Diffusion-Convolutional Neural Networks
Authors	James Atwood, Siddharth Pal, Don Towsley, Ananthram Swami
Abstract	The predictive power and overall computational efficiency of diffusion-convolutional neural networks (DCNNs) make them an attractive choice for node classification tasks. However, a naive dense-tensor-based implementation of DCNNs leads to $\mathcal{O}(N^2)$ memory complexity, which is prohibitive for large graphs. In this paper, we introduce a simple method for thresholding input graphs that provably reduces the memory requirements of DCNNs to $\mathcal{O}(N)$ (i.e., linear in the number of nodes in the input) without significantly affecting predictive performance.
Tasks	Node Classification
Published	2017-10-26
URL	http://arxiv.org/abs/1710.09813v1
PDF	http://arxiv.org/pdf/1710.09813v1.pdf
PWC	https://paperswithcode.com/paper/sparse-diffusion-convolutional-neural
Repo
Framework
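
A minimal sketch of why thresholding can give linear memory, assuming the thresholded object is a row-normalized transition matrix (the abstract does not spell out these details, so this is an illustration rather than the paper's method). Each row sums to 1, so at most 1/sigma entries in a row can be at least sigma, and for a fixed threshold the stored entries grow linearly with the number of nodes. The function name, the dict-of-dicts layout, and the example graph are hypothetical.

```python
def thresholded_transition(adjacency, sigma):
    """Row-normalize a weighted graph and drop probabilities below sigma.

    adjacency: dict mapping node -> dict of neighbor -> non-negative weight.
    Returns the same dict-of-dicts layout holding only the kept entries.
    """
    sparse = {}
    for node, neighbors in adjacency.items():
        degree = sum(neighbors.values())
        row = {}
        if degree > 0:
            for neighbor, weight in neighbors.items():
                probability = weight / degree
                if probability >= sigma:
                    row[neighbor] = probability
        sparse[node] = row
    return sparse


# Hypothetical star graph: hub 0 connected to leaves 1..10.
star = {0: {leaf: 1.0 for leaf in range(1, 11)}}
for leaf in range(1, 11):
    star[leaf] = {0: 1.0}

kept = thresholded_transition(star, sigma=0.2)
stored = sum(len(row) for row in kept.values())
print(stored)
```

In the star example every hub transition has probability 0.1 and falls below the threshold, while each leaf keeps its single transition back to the hub.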

Neural Cross-Lingual Entity Linking


Title	Neural Cross-Lingual Entity Linking
Authors	Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza
Abstract	A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem is exacerbated in cross-lingual EL, which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compute similarity between textual fragments across languages. In this paper, we propose a neural EL model that trains fine-grained similarities and dissimilarities between the query and candidate document from multiple perspectives, combined with convolution and tensor networks. Further, we show that this English-trained system can be applied, in zero-shot learning, to other languages by making surprisingly effective use of multi-lingual embeddings. Strong empirical evidence shows that the proposed system yields state-of-the-art results on English as well as cross-lingual (Spanish and Chinese) TAC 2015 datasets.
Tasks	Cross-Lingual Entity Linking, Entity Linking, Tensor Networks, Zero-Shot Learning
Published	2017-12-05
URL	http://arxiv.org/abs/1712.01813v1
PDF	http://arxiv.org/pdf/1712.01813v1.pdf
PWC	https://paperswithcode.com/paper/neural-cross-lingual-entity-linking
Repo
Framework

On the Long-Term Memory of Deep Recurrent Networks


Title	On the Long-Term Memory of Deep Recurrent Networks
Authors	Yoav Levine, Or Sharir, Alon Ziv, Amnon Shashua
Abstract	A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data is their ability to model intricate long-term temporal dependencies. However, a well-established measure of RNNs' long-term memory capacity is lacking, and thus formal understanding of the effect of depth on their ability to correlate data throughout time is limited. Specifically, existing depth efficiency results on convolutional networks do not suffice to account for the success of deep RNNs on data of varying lengths. In order to address this, we introduce a measure of the network’s ability to support information flow across time, referred to as the Start-End separation rank, which reflects the distance of the function realized by the recurrent network from modeling no dependency between the beginning and end of the input sequence. We prove that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts. Thus, we establish that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and provide an exemplar of quantifying this key attribute which may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks. We obtain our results by considering a class of recurrent networks referred to as Recurrent Arithmetic Circuits, which merge the hidden state with the input via the Multiplicative Integration operation, and empirically demonstrate the discussed phenomena on common RNNs. Finally, we employ the tool of quantum Tensor Networks to gain additional graphic insight regarding the complexity brought forth by depth in recurrent networks.
Tasks	Tensor Networks
Published	2017-10-25
URL	http://arxiv.org/abs/1710.09431v2
PDF	http://arxiv.org/pdf/1710.09431v2.pdf
PWC	https://paperswithcode.com/paper/on-the-long-term-memory-of-deep-recurrent
Repo
Framework

An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics


Title	An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics
Authors	Siqi Zhou, Mohamed K. Helwa, Angela P. Schoellig
Abstract	This paper presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used pre-actuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input-output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.
Tasks
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04407v2
PDF	http://arxiv.org/pdf/1709.04407v2.pdf
PWC	https://paperswithcode.com/paper/an-inversion-based-learning-approach-for
Repo
Framework