October 21, 2019

Paper Group AWR 111

Temporal Human Action Segmentation via Dynamic Clustering

Title Temporal Human Action Segmentation via Dynamic Clustering
Authors Yan Zhang, He Sun, Siyu Tang, Heiko Neumann
Abstract We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring. Our proposed algorithm is unsupervised, fast, generic to process various types of features, and applicable in both the online and offline settings. We perform extensive experiments of processing data streams, and show that our algorithm achieves the state-of-the-art results for both online and offline settings.
Tasks action segmentation
Published 2018-03-15
URL http://arxiv.org/abs/1803.05790v2
PDF http://arxiv.org/pdf/1803.05790v2.pdf
PWC https://paperswithcode.com/paper/temporal-human-action-segmentation-via
Repo https://github.com/yz-cnsdqz/dynamic_clustering
Framework none

Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data

Title Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data
Authors Jessa Bekker, Pieter Robberechts, Jesse Davis
Abstract Most positive and unlabeled data is subject to selection biases. The labeled examples can, for example, be selected from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can be enabled in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood and propose a practical method to enable this. Our empirical analysis supports the theoretical results and shows that taking into account the possibility of a selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
Tasks
Published 2018-09-10
URL https://arxiv.org/abs/1809.03207v4
PDF https://arxiv.org/pdf/1809.03207v4.pdf
PWC https://paperswithcode.com/paper/beyond-the-selected-completely-at-random
Repo https://github.com/ML-KULeuven/SAR-PU
Framework none

Crawling in Rogue’s dungeons with (partitioned) A3C

Title Crawling in Rogue’s dungeons with (partitioned) A3C
Authors Andrea Asperti, Daniele Cortesi, Francesco Sovrano
Abstract Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of its genre. Rogue-like games are known for the necessity to explore partially observable and always different randomly-generated labyrinths, preventing any form of level replay. As such, they serve as a very natural and challenging task for reinforcement learning, requiring the acquisition of complex, non-reactive behaviors involving memory and planning. In this article we show how, exploiting a version of A3C partitioned on different situations, the agent is able to reach the stairs and descend to the next level in 98% of cases.
Tasks
Published 2018-04-23
URL http://arxiv.org/abs/1804.08685v3
PDF http://arxiv.org/pdf/1804.08685v3.pdf
PWC https://paperswithcode.com/paper/crawling-in-rogues-dungeons-with-partitioned
Repo https://github.com/Francesco-Sovrano/Partitioned-A3C-for-RogueInABox
Framework tf

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Title An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Authors Shaojie Bai, J. Zico Kolter, Vladlen Koltun
Abstract For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at http://github.com/locuslab/TCN .
Tasks Language Modelling, Machine Translation, Music Modeling, Sequential Image Classification
Published 2018-03-04
URL http://arxiv.org/abs/1803.01271v2
PDF http://arxiv.org/pdf/1803.01271v2.pdf
PWC https://paperswithcode.com/paper/an-empirical-evaluation-of-generic
Repo https://github.com/mhjabreel/CharCnn_Keras
Framework tf

PoPPy: A Point Process Toolbox Based on PyTorch

Title PoPPy: A Point Process Toolbox Based on PyTorch
Authors Hongteng Xu
Abstract PoPPy is a Point Process toolbox based on PyTorch, which achieves flexible designing and efficient learning of point process models. It can be used for interpretable sequential data modeling and analysis, e.g., Granger causality analysis of multi-variate point processes, point process-based simulation and prediction of event sequences. In practice, the key points of point process-based sequential data modeling include: 1) How to design intensity functions to describe the mechanism behind observed data? 2) How to learn the proposed intensity functions from observed data? The goal of PoPPy is providing a user-friendly solution to the key points above and achieving large-scale point process-based sequential data analysis, simulation and prediction.
Tasks Point Processes
Published 2018-10-23
URL https://arxiv.org/abs/1810.10122v3
PDF https://arxiv.org/pdf/1810.10122v3.pdf
PWC https://paperswithcode.com/paper/poppy-a-point-process-toolbox-based-on
Repo https://github.com/HongtengXu/PoPPy
Framework pytorch
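
The abstract above names the design of intensity functions as the first key point of point process modeling. As a rough illustration of what such a function looks like, the sketch below evaluates the conditional intensity of a Hawkes process with an exponential kernel, lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)). This is a generic textbook form written with the standard library only; it does not use or reproduce the PoPPy API, and the function name and the values of mu, alpha, and beta are assumptions chosen for illustration.

```python
import math


def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    mu, alpha and beta are hypothetical illustration values,
    not defaults taken from PoPPy or from the paper.
    """
    # Only events strictly before t contribute to the intensity at t.
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )
    # Base rate plus the decaying influence of every past event.
    return mu + excitation


if __name__ == "__main__":
    events = [1.0, 1.5, 4.0]  # made-up event times
    for t in (0.5, 2.0, 4.1):
        print(f"lambda({t}) = {hawkes_intensity(t, events):.4f}")
```

Before the first event the intensity equals the base rate mu; each event then adds a bump of height alpha that decays at rate beta.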

IGCV$2$: Interleaved Structured Sparse Convolutional Neural Networks

Title IGCV$2$: Interleaved Structured Sparse Convolutional Neural Networks
Authors Guotian Xie, Jingdong Wang, Ting Zhang, Jianhuang Lai, Richang Hong, Guo-Jun Qi
Abstract In this paper, we study the problem of designing efficient convolutional neural network architectures with the interest in eliminating the redundancy in convolution kernels. In addition to structured sparse kernels, low-rank kernels and the product of low-rank kernels, the product of structured sparse kernels, which is a framework for interpreting the recently-developed interleaved group convolutions (IGC) and its variants (e.g., Xception), has been attracting increasing interest. Motivated by the observation that the convolutions contained in a group convolution in IGC can be further decomposed in the same manner, we present a modularized building block, IGCV$2$: interleaved structured sparse convolutions. It generalizes interleaved group convolutions, which is composed of two structured sparse kernels, to the product of more structured sparse kernels, further eliminating the redundancy. We present the complementary condition and the balance condition to guide the design of structured sparse kernels, obtaining a balance among three aspects: model size, computation complexity and classification accuracy. Experimental results demonstrate the advantage on the balance among these three aspects compared to interleaved group convolutions and Xception, and competitive performance compared to other state-of-the-art architecture design methods.
Tasks
Published 2018-04-17
URL http://arxiv.org/abs/1804.06202v1
PDF http://arxiv.org/pdf/1804.06202v1.pdf
PWC https://paperswithcode.com/paper/igcv2-interleaved-structured-sparse
Repo https://github.com/homles11/IGCV3
Framework tf

Towards the first adversarially robust neural network model on MNIST

Title Towards the first adversarially robust neural network model on MNIST
Authors Lukas Schott, Jonas Rauber, Matthias Bethge, Wieland Brendel
Abstract Despite much effort, deep neural networks remain highly susceptible to tiny input perturbations and even for MNIST, one of the most common toy datasets in computer vision, no neural network model exists for which adversarial perturbations are large and make semantic sense to humans. We show that even the widely recognized and by far most successful defense by Madry et al. (1) overfits on the L-infinity metric (it’s highly susceptible to L2 and L0 perturbations), (2) classifies unrecognizable images with high certainty, (3) performs not much better than simple input binarization and (4) features adversarial perturbations that make little sense to humans. These results suggest that MNIST is far from being solved in terms of adversarial robustness. We present a novel robust classification model that performs analysis by synthesis using learned class-conditional data distributions. We derive bounds on the robustness and go to great lengths to empirically evaluate our model using maximally effective adversarial attacks by (a) applying decision-based, score-based, gradient-based and transfer-based attacks for several different Lp norms, (b) by designing a new attack that exploits the structure of our defended model and (c) by devising a novel decision-based attack that seeks to minimize the number of perturbed pixels (L0). The results suggest that our approach yields state-of-the-art robustness on MNIST against L0, L2 and L-infinity perturbations and we demonstrate that most adversarial examples are strongly perturbed towards the perceptual boundary between the original and the adversarial class.
Tasks
Published 2018-05-23
URL http://arxiv.org/abs/1805.09190v3
PDF http://arxiv.org/pdf/1805.09190v3.pdf
PWC https://paperswithcode.com/paper/towards-the-first-adversarially-robust-neural
Repo https://github.com/bethgelab/AnalysisBySynthesis
Framework pytorch

HG-means: A scalable hybrid genetic algorithm for minimum sum-of-squares clustering

Title HG-means: A scalable hybrid genetic algorithm for minimum sum-of-squares clustering
Authors Daniel Gribel, Thibaut Vidal
Abstract Minimum sum-of-squares clustering (MSSC) is a widely used clustering model, of which the popular K-means algorithm constitutes a local minimizer. It is well known that the solutions of K-means can be arbitrarily distant from the true MSSC global optimum, and dozens of alternative heuristics have been proposed for this problem. However, no other algorithm has been predominantly adopted in the literature. This may be related to differences of computational effort, or to the assumption that a near-optimal solution of the MSSC has only a marginal impact on clustering validity. In this article, we dispute this belief. We introduce an efficient population-based metaheuristic that uses K-means as a local search in combination with problem-tailored crossover, mutation, and diversification operators. This algorithm can be interpreted as a multi-start K-means, in which the initial center positions are carefully sampled based on the search history. The approach is scalable and accurate, outperforming all recent state-of-the-art algorithms for MSSC in terms of solution quality, measured by the depth of local minima. This enhanced accuracy leads to clusters which are significantly closer to the ground truth than those of other algorithms, for overlapping Gaussian-mixture datasets with a large number of features. Therefore, improved global optimization methods appear to be essential to better exploit the MSSC model in high dimension.
Tasks
Published 2018-04-25
URL http://arxiv.org/abs/1804.09813v2
PDF http://arxiv.org/pdf/1804.09813v2.pdf
PWC https://paperswithcode.com/paper/hg-means-a-scalable-hybrid-genetic-algorithm
Repo https://github.com/danielgribel/hg-means
Framework none
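
The abstract above notes that the method can be interpreted as a multi-start K-means. The sketch below shows only that plain baseline: K-means restarted from random initial centers, keeping the run with the lowest sum of squared errors (the MSSC objective). It is not HG-means itself, since it has no population, crossover, mutation, or history-based sampling of centers. All function names, the restart count, and the toy data are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=100):
    """One K-means run (Lloyd's algorithm) from randomly chosen initial centers."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so a local minimum was reached
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    cost = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, cost


def multi_start_kmeans(points, k, restarts=20, seed=0):
    """Run K-means `restarts` times and keep the lowest-cost solution."""
    rng = random.Random(seed)
    runs = (kmeans(points, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[2])


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]  # toy data
    centers, labels, cost = multi_start_kmeans(data, k=2)
    print(centers, labels, round(cost, 4))
```

Uniform random restarts sample initial centers independently of earlier runs; according to the abstract, HG-means instead chooses them based on the search history.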

An Empirical Evaluation of Deep Learning for ICD-9 Code Assignment using MIMIC-III Clinical Notes

Title An Empirical Evaluation of Deep Learning for ICD-9 Code Assignment using MIMIC-III Clinical Notes
Authors Jinmiao Huang, Cesar Osorio, Luke Wicent Sy
Abstract Background and Objective: Code assignment is of paramount importance in many levels in modern hospitals, from ensuring accurate billing process to creating a valid record of patient care history. However, the coding process is tedious and subjective, and it requires medical coders with extensive training. This study aims to evaluate the performance of deep-learning-based systems to automatically map clinical notes to ICD-9 medical codes. Methods: The evaluations of this research are focused on end-to-end learning methods without manually defined rules. Traditional machine learning algorithms, as well as state-of-the-art deep learning methods such as Recurrent Neural Networks and Convolution Neural Networks, were applied to the Medical Information Mart for Intensive Care (MIMIC-III) dataset. An extensive number of experiments was applied to different settings of the tested algorithm. Results: Findings showed that the deep learning-based methods outperformed other conventional machine learning methods. From our assessment, the best models could predict the top 10 ICD-9 codes with 0.6957 F1 and 0.8967 accuracy and could estimate the top 10 ICD-9 categories with 0.7233 F1 and 0.8588 accuracy. Our implementation also outperformed existing work under certain evaluation metrics. Conclusion: A set of standard metrics was utilized in assessing the performance of ICD-9 code assignment on MIMIC-III dataset. All the developed evaluation tools and resources are available online, which can be used as a baseline for further research.
Tasks
Published 2018-02-07
URL https://arxiv.org/abs/1802.02311v2
PDF https://arxiv.org/pdf/1802.02311v2.pdf
PWC https://paperswithcode.com/paper/an-empirical-evaluation-of-deep-learning-for
Repo https://github.com/lsy3/clinical-notes-diagnosis-dl-nlp
Framework tf

Ionospheric activity prediction using convolutional recurrent neural networks

Title Ionospheric activity prediction using convolutional recurrent neural networks
Authors Alexandre Boulch, Noëlie Cherrier, Thibaut Castaings
Abstract The ionosphere electromagnetic activity is a major factor of the quality of satellite telecommunications, Global Navigation Satellite Systems (GNSS) and other vital space applications. Being able to forecast globally the Total Electron Content (TEC) would enable a better anticipation of potential performance degradations. A few studies have proposed models able to predict the TEC locally, but not worldwide for most of them. Thanks to a large record of past TEC maps publicly available, we propose a method based on Deep Neural Networks (DNN) to forecast a sequence of global TEC maps consecutive to an input sequence of TEC maps, without introducing any prior knowledge other than Earth rotation periodicity. By combining several state-of-the-art architectures, the proposed approach is competitive with previous works on TEC forecasting while predicting the TEC globally.
Tasks Activity Prediction
Published 2018-10-31
URL http://arxiv.org/abs/1810.13273v2
PDF http://arxiv.org/pdf/1810.13273v2.pdf
PWC https://paperswithcode.com/paper/ionospheric-activity-prediction-using
Repo https://github.com/aboulch/tec_prediction
Framework pytorch

VERSE: Versatile Graph Embeddings from Similarity Measures

Title VERSE: Versatile Graph Embeddings from Similarity Measures
Authors Anton Tsitsulin, Davide Mottin, Panagiotis Karras, Emmanuel Müller
Abstract Embedding a web-scale information network into a low-dimensional vector space facilitates tasks such as link prediction, classification, and visualization. Past research has addressed the problem of extracting such embeddings by adopting methods from words to graphs, without defining a clearly comprehensible graph-related objective. Yet, as we show, the objectives used in past works implicitly utilize similarity measures among graph nodes. In this paper, we carry the similarity orientation of previous works to its logical conclusion; we propose VERtex Similarity Embeddings (VERSE), a simple, versatile, and memory-efficient method that derives graph embeddings explicitly calibrated to preserve the distributions of a selected vertex-to-vertex similarity measure. VERSE learns such embeddings by training a single-layer neural network. While its default, scalable version does so via sampling similarity information, we also develop a variant using the full information per vertex. Our experimental study on standard benchmarks and real-world datasets demonstrates that VERSE, instantiated with diverse similarity measures, outperforms state-of-the-art methods in terms of precision and recall in major data mining tasks and supersedes them in time and space efficiency, while the scalable sampling-based variant achieves equally good results as the non-scalable full variant.
Tasks Link Prediction
Published 2018-03-13
URL http://arxiv.org/abs/1803.04742v1
PDF http://arxiv.org/pdf/1803.04742v1.pdf
PWC https://paperswithcode.com/paper/verse-versatile-graph-embeddings-from
Repo https://github.com/xgfs/verse
Framework none

Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks

Title Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks
Authors Jack Turner, José Cano, Valentin Radu, Elliot J. Crowley, Michael O’Boyle, Amos Storkey
Abstract Convolutional Neural Networks (CNNs) are extremely computationally demanding, presenting a large barrier to their deployment on resource-constrained devices. Since such systems are where some of their most useful applications lie (e.g. obstacle detection for mobile robots, vision-based medical assistive technology), significant bodies of work from both machine learning and systems communities have attempted to provide optimisations that will make CNNs available to edge devices. In this paper we unify the two viewpoints in a Deep Learning Inference Stack and take an across-stack approach by implementing and evaluating the most common neural network compression techniques (weight pruning, channel pruning, and quantisation) and optimising their parallel execution with a range of programming approaches (OpenMP, OpenCL) and hardware architectures (CPU, GPU). We provide comprehensive Pareto curves to instruct trade-offs under constraints of accuracy, execution time, and memory space.
Tasks Neural Network Compression
Published 2018-09-19
URL http://arxiv.org/abs/1809.07196v1
PDF http://arxiv.org/pdf/1809.07196v1.pdf
PWC https://paperswithcode.com/paper/characterising-across-stack-optimisations-for
Repo https://github.com/jack-willturner/characterising-neural-compression
Framework pytorch

Towards Dynamic Computation Graphs via Sparse Latent Structure

Title Towards Dynamic Computation Graphs via Sparse Latent Structure
Authors Vlad Niculae, André F. T. Martins, Claire Cardie
Abstract Deep NLP models benefit from underlying structures in the data—e.g., parse trees—typically extracted using off-the-shelf parsers. Recent attempts to jointly learn the latent structure encounter a tradeoff: either make factorization assumptions that limit expressiveness, or sacrifice end-to-end differentiability. Using the recently proposed SparseMAP inference, which retrieves a sparse distribution over latent structures, we propose a novel approach for end-to-end learning of latent structure predictors jointly with a downstream predictor. To the best of our knowledge, our method is the first to enable unrestricted dynamic computation graph construction from the global latent structure, while maintaining differentiability.
Tasks graph construction
Published 2018-09-03
URL http://arxiv.org/abs/1809.00653v1
PDF http://arxiv.org/pdf/1809.00653v1.pdf
PWC https://paperswithcode.com/paper/towards-dynamic-computation-graphs-via-sparse
Repo https://github.com/vene/sparsemap
Framework pytorch

Generative replay with feedback connections as a general strategy for continual learning

Title Generative replay with feedback connections as a general strategy for continual learning
Authors Gido M. van de Ven, Andreas S. Tolias
Abstract A major obstacle to developing artificial intelligence applications capable of true lifelong learning is that artificial neural networks quickly or catastrophically forget previously learned tasks when trained on a new one. Numerous methods for alleviating catastrophic forgetting are currently being proposed, but differences in evaluation protocols make it difficult to directly compare their performance. To enable more meaningful comparisons, here we identified three distinct scenarios for continual learning based on whether task identity is known and, if it is not, whether it needs to be inferred. Performing the split and permuted MNIST task protocols according to each of these scenarios, we found that regularization-based approaches (e.g., elastic weight consolidation) failed when task identity needed to be inferred. In contrast, generative replay combined with distillation (i.e., using class probabilities as “soft targets”) achieved superior performance in all three scenarios. Addressing the issue of efficiency, we reduced the computational cost of generative replay by integrating the generative model into the main model by equipping it with generative feedback or backward connections. This Replay-through-Feedback approach substantially shortened training time with no or negligible loss in performance. We believe this to be an important first step towards making the powerful technique of generative replay scalable to real-world continual learning applications.
Tasks Continual Learning
Published 2018-09-27
URL http://arxiv.org/abs/1809.10635v2
PDF http://arxiv.org/pdf/1809.10635v2.pdf
PWC https://paperswithcode.com/paper/generative-replay-with-feedback-connections
Repo https://github.com/GMvandeVen/continual-learning
Framework pytorch
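
The abstract above describes distillation as using class probabilities as "soft targets". A minimal sketch of that idea follows: the previous model's logits are turned into a softened probability distribution, and the current model is scored by cross-entropy against it. The code uses only the standard library and is not taken from the linked repository; the function names, the temperature of 2.0, and the example logits are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    The temperature value is a hypothetical choice for this sketch.
    """
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_targets, student_probs))


if __name__ == "__main__":
    teacher = [2.0, 0.0, -1.0]  # made-up logits from the previous model
    matching = distillation_loss(teacher, teacher)
    mismatched = distillation_loss([0.0, 0.0, 0.0], teacher)
    print(f"matching student:   {matching:.4f}")
    print(f"mismatched student: {mismatched:.4f}")
```

The loss is smallest when the student reproduces the teacher's distribution, which is what lets replayed samples carry information about earlier tasks without hard labels.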

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization

Title Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization
Authors Kian Kenyon-Dean, Jackie Chi Kit Cheung, Doina Precup
Abstract We present an approach to event coreference resolution by developing a general framework for clustering that uses supervised representation learning. We propose a neural network architecture with novel Clustering-Oriented Regularization (CORE) terms in the objective function. These terms encourage the model to create embeddings of event mentions that are amenable to clustering. We then use agglomerative clustering on these embeddings to build event coreference chains. For both within- and cross-document coreference on the ECB+ corpus, our model obtains better results than models that require significantly more pre-annotated information. This work provides insight and motivating results for a new general approach to solving coreference and clustering problems with representation learning.
Tasks Coreference Resolution, Representation Learning
Published 2018-05-28
URL http://arxiv.org/abs/1805.10985v1
PDF http://arxiv.org/pdf/1805.10985v1.pdf
PWC https://paperswithcode.com/paper/resolving-event-coreference-with-supervised
Repo https://github.com/kiankd/events
Framework pytorch