January 27, 2020

3191 words 15 mins read

Paper Group ANR 1342

X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight Software Optimizations. Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer. Identifying Patient Groups based on Frequent Patterns of Patient Samples. Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training. Mixed-Precision Quantized …

X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight Software Optimizations

Title X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight Software Optimizations
Authors Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Muhammad Abdullah Hanif, Maurizio Martina, Guido Masera, Muhammad Shafique
Abstract Convolutional Neural Networks (CNNs) are in widespread use due to their excellent results in various machine learning (ML) tasks like image classification and object detection. Recently, Capsule Networks (CapsNets) have shown improved performance compared to traditional CNNs by better encoding and preserving the spatial relationships between detected features. This is achieved through the so-called Capsules (i.e., groups of neurons) that encode both the instantiation probability and the spatial information. However, one of the major hurdles to the wide adoption of CapsNets is their long training time, which is primarily due to the relatively high complexity of their constituent elements. In this paper, we illustrate how new optimizations in the training process can achieve fast training of CapsNets, and whether such optimizations affect the network accuracy. Towards this, we propose a novel framework, “X-TrainCaps”, that employs lightweight software-level optimizations, including a novel learning rate policy called WarmAdaBatch that jointly performs warm restarts and adaptive batch sizing, as well as weight sharing for capsule layers, which reduces the hardware requirements of CapsNets by removing unused/redundant connections and capsules while maintaining high accuracy, validated through tests of different learning rate policies and batch sizes. We demonstrate that one of the solutions generated by the X-TrainCaps framework achieves a 58.6% training time reduction while preserving accuracy (even improving it by 0.9%) compared to the CapsNet in the original paper by Sabour et al. (2017), while other Pareto-optimal solutions can be leveraged to realize trade-offs between training time and accuracy.
Tasks Image Classification, Object Detection
Published 2019-05-24
URL https://arxiv.org/abs/1905.10142v1
PDF https://arxiv.org/pdf/1905.10142v1.pdf
PWC https://paperswithcode.com/paper/x-traincaps-accelerated-training-of-capsule
Repo
Framework
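
A minimal sketch of the WarmAdaBatch idea from the abstract above: cosine-annealed warm restarts for the learning rate combined with a step-wise growing batch size. The restart period, learning-rate bounds, and batch-size schedule below are illustrative assumptions, not values from the paper.

```python
import math

def warm_restart_lr(step, period=500, lr_max=0.1, lr_min=0.001):
    """Cosine-annealed learning rate with warm restarts (SGDR-style)."""
    t = step % period  # position inside the current restart cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / period))

def adaptive_batch_size(epoch, schedule=((0, 64), (5, 128), (10, 256))):
    """Step-wise growing batch size; the epoch thresholds are illustrative."""
    size = schedule[0][1]
    for start_epoch, bs in schedule:
        if epoch >= start_epoch:
            size = bs
    return size

# Inspect the joint schedule for the first few epochs (100 steps per epoch).
for epoch in range(12):
    lr = warm_restart_lr(step=epoch * 100)
    print(epoch, adaptive_batch_size(epoch), round(lr, 4))
```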

Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer

Title Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer
Authors Daniel Tanneberg, Elmar Rueckert, Jan Peters
Abstract A key feature of intelligent behavior is the ability to learn abstract strategies that transfer to unfamiliar problems. Therefore, we present a novel architecture, based on memory-augmented networks, that is inspired by the von Neumann and Harvard architectures of modern computers. This architecture enables the learning of abstract algorithmic solutions via Evolution Strategies in a reinforcement learning setting. Applied to Sokoban, sliding block puzzle and robotic manipulation tasks, we show that the architecture can learn algorithmic solutions with strong generalization and abstraction: scaling to arbitrary task configurations and complexities, and being independent of both the data representation and the task domain.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1911.00926v1
PDF https://arxiv.org/pdf/1911.00926v1.pdf
PWC https://paperswithcode.com/paper/learning-algorithmic-solutions-to-symbolic
Repo
Framework

Identifying Patient Groups based on Frequent Patterns of Patient Samples

Title Identifying Patient Groups based on Frequent Patterns of Patient Samples
Authors Seyed Amin Tabatabaei, Xixi Lu, Mark Hoogendoorn, Hajo A. Reijers
Abstract Grouping patients meaningfully can give insights into the different types of patients, their needs, and their priorities. Finding meaningful groups is, however, very challenging, as background knowledge is often required to determine what a useful grouping is. In this paper we propose an approach that finds groups of patients based on a small sample of positive examples given by a domain expert, and therefore requires very limited effort from domain experts. The approach groups patients based on the activities and diagnostic/billing codes within their health pathways. To define such a grouping from the sample of patients efficiently, frequent patterns of activities are discovered and used to measure the similarity between the care pathways of other patients and those of the patients in the sample group. This results in an insightful definition of the group. The proposed approach is evaluated using several datasets obtained from a large university medical center. The evaluation shows F1-scores of around 0.7 for grouping kidney injury and around 0.6 for diabetes.
Tasks
Published 2019-04-03
URL http://arxiv.org/abs/1904.01863v1
PDF http://arxiv.org/pdf/1904.01863v1.pdf
PWC https://paperswithcode.com/paper/identifying-patient-groups-based-on-frequent
Repo
Framework
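
The pattern-based grouping can be illustrated with a short sketch: mine frequent activity sets from the expert-given sample and score other patients' pathways by how many of those patterns they contain. Real care pathways would call for sequential pattern mining; the itemset mining, support threshold, and toy pathways below are simplifying assumptions.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(sample_pathways, min_support=0.5, max_len=2):
    """Mine frequent activity sets (simple itemsets up to max_len) from the
    sample group's care pathways."""
    counts = Counter()
    for pathway in sample_pathways:
        items = set(pathway)
        for k in range(1, max_len + 1):
            counts.update(frozenset(c) for c in combinations(sorted(items), k))
    threshold = min_support * len(sample_pathways)
    return [p for p, c in counts.items() if c >= threshold]

def similarity(pathway, patterns):
    """Fraction of the group's frequent patterns that a pathway contains."""
    items = set(pathway)
    return sum(p <= items for p in patterns) / max(len(patterns), 1)

sample = [["intake", "lab", "dialysis"], ["intake", "lab", "ct"], ["intake", "lab"]]
patterns = frequent_patterns(sample)
print(similarity(["intake", "lab", "mri"], patterns))  # high overlap -> in group
```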

Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training

Title Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training
Authors Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
Abstract We propose a method for learning embeddings for few-shot learning that is suitable for use with any number of ways and any number of shots (shot-free). Rather than fixing the class prototypes to be the Euclidean average of sample embeddings, we allow them to live in a higher-dimensional space (embedded class models) and learn the prototypes along with the model parameters. The class representation function is defined implicitly, which allows us to deal with a variable number of shots per class with a simple constant-size architecture. The class embedding encompasses metric learning, which facilitates adding new classes without crowding the class representation space. Despite being general and not tuned to the benchmark, our approach achieves state-of-the-art performance on the standard few-shot benchmark datasets.
Tasks Few-Shot Learning, Metric Learning
Published 2019-05-10
URL https://arxiv.org/abs/1905.04398v1
PDF https://arxiv.org/pdf/1905.04398v1.pdf
PWC https://paperswithcode.com/paper/few-shot-learning-with-embedded-class-models
Repo
Framework
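
A hedged PyTorch sketch of the embedded-class-models idea: class prototypes are free parameters learned jointly with the embedding network, and classification uses similarity to the prototypes. Layer sizes, the shared prototype dimension, and the fixed temperature are illustrative; the paper allows prototypes to live in a higher-dimensional space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShotFreeClassifier(nn.Module):
    """Embedding network plus learned class prototypes ("embedded class
    models"): prototypes are free parameters trained jointly with the
    encoder instead of being fixed to the mean of sample embeddings."""

    def __init__(self, in_dim=784, embed_dim=64, n_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, embed_dim))
        self.prototypes = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.scale = 10.0  # fixed similarity temperature (illustrative)

    def forward(self, x):
        z = F.normalize(self.encoder(x), dim=-1)   # unit-norm embeddings
        p = F.normalize(self.prototypes, dim=-1)   # unit-norm prototypes
        return self.scale * z @ p.t()              # cosine-similarity logits

model = ShotFreeClassifier()
x, y = torch.randn(8, 784), torch.randint(0, 5, (8,))
loss = F.cross_entropy(model(x), y)  # trains encoder and prototypes jointly
loss.backward()
```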

Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection

Title Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection
Authors Tianshu Chu, Qin Luo, Jie Yang, Xiaolin Huang
Abstract Efficient model inference is an important and practical issue in the deployment of deep neural networks on resource-constrained platforms. Network quantization addresses this problem effectively by leveraging low-bit representations and arithmetic that can be executed on dedicated embedded systems. In previous works, the parameter bitwidth is set homogeneously, forcing a trade-off between superior performance and aggressive compression. In fact, the stacked network layers, which are generally regarded as hierarchical feature extractors, contribute diversely to the overall performance. For a well-trained neural network, the feature distributions of different categories differentiate gradually as the network propagates forward, so the capability required of the subsequent feature extractors is reduced. This indicates that neurons in posterior layers can be assigned lower bitwidths in quantized neural networks. Based on this observation, a simple but effective mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression. Extensive experiments on typical network architectures and benchmark datasets demonstrate that the proposed method achieves better or comparable results while reducing the memory footprint of quantized parameters by more than 30% compared with homogeneous counterparts. The results also show that higher-precision bottom layers boost 1-bit network performance appreciably, due to better preservation of the original image information, while lower-precision posterior layers contribute to the regularization of $k$-bit networks.
Tasks Image Classification, Object Detection, Quantization
Published 2019-12-29
URL https://arxiv.org/abs/1912.12656v1
PDF https://arxiv.org/pdf/1912.12656v1.pdf
PWC https://paperswithcode.com/paper/mixed-precision-quantized-neural-network-with
Repo
Framework
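
A minimal sketch of quantization with progressively decreasing bitwidth: a symmetric uniform quantizer applied per layer, with earlier layers given more bits. The specific bitwidth assignment and random weights are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    if bits >= 32:
        return w
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / max(levels, 1)
    return np.round(w / scale).clip(-levels, levels) * scale

# Progressively decreasing bitwidths: earlier layers keep more precision.
layer_bits = [8, 6, 4, 2]
layers = [np.random.randn(16, 16) for _ in layer_bits]
quantized = [quantize_uniform(w, b) for w, b in zip(layers, layer_bits)]

for i, (w, q, b) in enumerate(zip(layers, quantized, layer_bits)):
    print(f"layer {i}: {b}-bit, mse={np.mean((w - q) ** 2):.5f}")
```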

Counting to Ten with Two Fingers: Compressed Counting with Spiking Neurons

Title Counting to Ten with Two Fingers: Compressed Counting with Spiking Neurons
Authors Yael Hitron, Merav Parter
Abstract We consider the task of measuring time with probabilistic threshold gates implemented by bio-inspired spiking neurons. In the spiking neural network model, the network evolves in discrete rounds, where in each round neurons fire in pulses in response to a sufficiently high membrane potential. This potential is induced by spikes from neighboring neurons that fired in the previous round, which can have either an excitatory or an inhibitory effect. We first consider a deterministic implementation of a neural timer and show that $\Theta(\log t)$ (deterministic) threshold gates are both sufficient and necessary. This raises the question of whether randomness can be leveraged to reduce the number of neurons. We answer this question in the affirmative by considering neural timers with spiking neurons, where the neuron $y$ is required to fire for $t$ consecutive rounds with probability at least $1-\delta$ and should stop firing after at most $2t$ rounds with probability $1-\delta$, for some input parameter $\delta \in (0,1)$. Our key result is a construction of a neural timer with $O(\log\log 1/\delta)$ spiking neurons. Interestingly, this construction uses only one spiking neuron, while the remaining neurons can be deterministic threshold gates. We complement this construction with a matching lower bound of $\Omega(\min\{\log\log 1/\delta, \log t\})$ neurons. This provides the first separation between deterministic and randomized constructions in the setting of spiking neural networks. Finally, we demonstrate the usefulness of compressed counting networks for synchronizing neural networks.
Tasks
Published 2019-02-27
URL https://arxiv.org/abs/1902.10369v3
PDF https://arxiv.org/pdf/1902.10369v3.pdf
PWC https://paperswithcode.com/paper/counting-to-ten-with-two-fingers-compressed
Repo
Framework
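
The deterministic $\Theta(\log t)$ upper bound can be made concrete with a small simulation: $O(\log t)$ binary "neurons" hold a countdown counter, and the output fires while the count is positive. This mimics the computation such a threshold-gate circuit performs; it is not the paper's construction verbatim.

```python
def neural_timer(t):
    """Deterministic neural timer sketch: O(log t) binary 'neurons' hold the
    bits of a countdown counter; the output 'fires' each round while the
    count is positive (simulating a Theta(log t) threshold-gate circuit)."""
    n_bits = max(t.bit_length(), 1)
    bits = [(t >> i) & 1 for i in range(n_bits)]  # neuron states, LSB first
    rounds = 0
    while any(bits):
        rounds += 1                    # output neuron fires this round
        for i in range(n_bits):        # decrement the binary counter
            bits[i] ^= 1
            if bits[i] == 0:           # borrow resolved, stop flipping
                break
    return rounds

print(neural_timer(10))  # fires for exactly 10 rounds with 4 binary neurons
```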

Adaptive particle-based approximations of the Gibbs posterior for inverse problems

Title Adaptive particle-based approximations of the Gibbs posterior for inverse problems
Authors Zilong Zou, Sayan Mukherjee, Harbir Antil, Wilkins Aquino
Abstract In this work, we adopt a general framework based on the Gibbs posterior to update belief distributions for inverse problems governed by partial differential equations (PDEs). The Gibbs posterior formulation is a generalization of standard Bayesian inference that only relies on a loss function connecting the unknown parameters to the data. It is particularly useful when the true data generating mechanism (or noise distribution) is unknown or difficult to specify. The Gibbs posterior coincides with Bayesian updating when a true likelihood function is known and the loss function corresponds to the negative log-likelihood, yet provides subjective inference in more general settings. We employ a sequential Monte Carlo (SMC) approach to approximate the Gibbs posterior using particles. To manage the computational cost of propagating increasing numbers of particles through the loss function, we employ a recently developed local reduced basis method to build an efficient surrogate loss function that is used in the Gibbs update formula in place of the true loss. We derive error bounds for our approximation and propose an adaptive approach to construct the surrogate model in an efficient manner. We demonstrate the efficiency of our approach through several numerical examples.
Tasks Bayesian Inference
Published 2019-07-02
URL https://arxiv.org/abs/1907.01551v1
PDF https://arxiv.org/pdf/1907.01551v1.pdf
PWC https://paperswithcode.com/paper/adaptive-particle-based-approximations-of-the
Repo
Framework
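
A minimal sketch of the Gibbs-posterior SMC update: particles drawn from the prior are tempered toward the Gibbs posterior by reweighting with the exponentiated incremental loss and resampling. The quadratic loss, linear temperature ladder, and random-walk jitter (in place of a proper MCMC move step) are simplifying assumptions, and no surrogate loss is built here.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(theta, data):
    """Loss connecting parameters to data; a surrogate would replace this."""
    return np.mean((data - theta) ** 2)

# Temper from the prior (beta = 0) to the Gibbs posterior (beta = 1).
data = rng.normal(2.0, 1.0, size=50)
particles = rng.normal(0.0, 5.0, size=1000)        # draws from the prior
betas = np.linspace(0.0, 1.0, 11)

for b_prev, b_next in zip(betas[:-1], betas[1:]):
    losses = np.array([loss(p, data) for p in particles])
    logw = -(b_next - b_prev) * len(data) * losses  # incremental Gibbs weight
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)  # resample
    particles = particles[idx] + rng.normal(0.0, 0.1, size=len(particles))

print(particles.mean())  # concentrates near the data mean (~2.0)
```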

Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds

Title Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds
Authors Masafumi Yamazaki, Akihiko Kasagi, Akihiro Tabuchi, Takumi Honda, Masahiro Miwa, Naoto Fukumoto, Tsuguchika Tabaru, Atsushi Ike, Kohta Nakashima
Abstract There has been strong demand for algorithms that execute machine learning as fast as possible, and the speed of deep learning has increased roughly 30-fold in just the past two years. Distributed deep learning with large mini-batches is a key technology for addressing this demand, but it is a great challenge to achieve high scalability on large clusters without compromising accuracy. In this paper, we introduce the optimization methods we applied to this challenge. Applying these methods, we achieved a training time of 74.7 seconds using 2,048 GPUs on the ABCI cluster. The training throughput is over 1.73 million images/sec and the top-1 validation accuracy is 75.08%.
Tasks
Published 2019-03-29
URL http://arxiv.org/abs/1903.12650v1
PDF http://arxiv.org/pdf/1903.12650v1.pdf
PWC https://paperswithcode.com/paper/yet-another-accelerated-sgd-resnet-50
Repo
Framework

Web Based Brain Volume Calculation for Magnetic Resonance Images

Title Web Based Brain Volume Calculation for Magnetic Resonance Images
Authors Kevin Karsch, Brian Grinstead, Qing He, Ye Duan
Abstract Brain volume calculations are crucial in modern medical research, especially in the study of neurodevelopmental disorders. In this paper, we present an algorithm for calculating two classifications of brain volume, total brain volume (TBV) and intracranial volume (ICV). Our algorithm takes MRI data as input, performs several preprocessing and intermediate steps, and then returns each of the two calculated volumes. To simplify this process and make our algorithm publicly accessible to anyone, we have created a web-based interface that allows users to upload their own MRI data and calculate the TBV and ICV for the given data. This interface provides a simple and efficient method for calculating these two classifications of brain volume, and it also removes the need for the user to download or install any applications.
Tasks
Published 2019-04-21
URL http://arxiv.org/abs/1904.09977v1
PDF http://arxiv.org/pdf/1904.09977v1.pdf
PWC https://paperswithcode.com/paper/web-based-brain-volume-calculation-for
Repo
Framework

Strong Equivalence and Program’s Structure in Arguing Essential Equivalence between Logic Programs

Title Strong Equivalence and Program’s Structure in Arguing Essential Equivalence between Logic Programs
Authors Yuliya Lierler
Abstract Answer set programming is a prominent declarative programming paradigm used in formulating combinatorial search problems and implementing distinct knowledge representation formalisms. It is common that several related and yet substantially different answer set programs exist for a given problem. Sometimes these encodings may display significantly different performance. Uncovering {\em precise formal} links between these programs is often important and yet far from trivial. This paper claims the correctness of a number of interesting program rewritings.
Tasks
Published 2019-01-26
URL https://arxiv.org/abs/1901.09127v2
PDF https://arxiv.org/pdf/1901.09127v2.pdf
PWC https://paperswithcode.com/paper/strong-equivalence-and-programs-structure-in
Repo
Framework

Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression

Title Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression
Authors Tong Niu, Caiming Xiong, Richard Socher
Abstract Text compression has diverse applications such as summarization, reading comprehension, and text editing. However, almost all existing approaches require hand-crafted features, syntactic labels, or parallel data. Even the one approach that achieves this task in an unsupervised setting requires a task-specific autoencoder architecture. Moreover, these models generate only one compressed sentence for each source input, so adapting to different style requirements (e.g., length) for the final output usually implies retraining the model from scratch. In this work, we propose a fully unsupervised model, Deleter, that is able to discover an “optimal deletion path” for an arbitrary sentence, where each intermediate sequence along the path is a coherent subsequence of the previous one. This approach relies exclusively on a pretrained bidirectional language model (BERT) to score each candidate deletion based on the average perplexity of the resulting sentence, and it performs progressive greedy lookahead search to select the best deletion at each step. We apply Deleter to the task of extractive sentence compression and find that our model is competitive with state-of-the-art supervised models trained on 1.02 million in-domain examples at a similar compression ratio. Qualitative analysis, as well as automatic and human evaluations, verifies that our model produces high-quality compressions.
Tasks Language Modelling, Reading Comprehension, Sentence Compression
Published 2019-09-07
URL https://arxiv.org/abs/1909.03223v1
PDF https://arxiv.org/pdf/1909.03223v1.pdf
PWC https://paperswithcode.com/paper/deleter-leveraging-bert-to-perform
Repo
Framework
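
A toy sketch of an "optimal deletion path": at each step, remove the single token whose deletion gives the best-scoring remaining sentence. The `sentence_score` stub below stands in for BERT's average perplexity, and the search is one-step greedy rather than the paper's lookahead search.

```python
def sentence_score(tokens):
    """Stand-in for an LM's average perplexity (lower is better); a real
    implementation would score the sentence with BERT."""
    common = {"the", "a", "is", "cat", "sat"}
    return sum(0.1 if t in common else 1.0 for t in tokens) / max(len(tokens), 1)

def deletion_path(tokens):
    """Greedy deletion path: repeatedly delete the token whose removal
    yields the best-scoring remaining sentence."""
    path = [tokens]
    while len(tokens) > 1:
        candidates = [tokens[:i] + tokens[i + 1:] for i in range(len(tokens))]
        tokens = min(candidates, key=sentence_score)
        path.append(tokens)
    return path

for step in deletion_path("the fluffy cat quietly sat".split()):
    print(" ".join(step))
```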

A new hybrid genetic algorithm for protein structure prediction on the 2D triangular lattice

Title A new hybrid genetic algorithm for protein structure prediction on the 2D triangular lattice
Authors Nabil Boumedine, Sadek Bouroubi
Abstract The flawless functioning of a protein is essentially linked to its three-dimensional structure, so predicting a protein's structure from its amino acid sequence is a fundamental problem in many fields and draws considerable research attention. The problem can be formulated as a combinatorial optimization problem based on simplified lattice models such as the hydrophobic-polar (HP) model. In this paper, we propose a new hybrid algorithm for the protein structure prediction (PSP) problem that combines three well-known heuristics: a genetic algorithm, a tabu search strategy, and a local search algorithm. To assess the suggested algorithm, we include an experimental study in which the quality of the produced solutions is the main criterion, and we compare the algorithm with state-of-the-art algorithms on a selection of well-studied benchmark instances.
Tasks Combinatorial Optimization
Published 2019-07-08
URL https://arxiv.org/abs/1907.04190v1
PDF https://arxiv.org/pdf/1907.04190v1.pdf
PWC https://paperswithcode.com/paper/a-new-hybrid-genetic-algorithm-for-protein
Repo
Framework
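
A hedged skeleton of the hybrid scheme: a genetic algorithm with a local-search (memetic) refinement step and a tabu list that discourages revisiting recent solutions. The toy separable fitness function stands in for the HP-lattice energy; the encoding, operators, and parameters are all illustrative.

```python
import random

random.seed(0)

def fitness(x):
    """Toy objective standing in for the HP-lattice energy (lower is better)."""
    return sum((xi - 3) ** 2 for xi in x)

def local_search(x):
    """One pass of greedy single-coordinate improvement."""
    best = list(x)
    for i in range(len(best)):
        for delta in (-1, 1):
            cand = best[:i] + [best[i] + delta] + best[i + 1:]
            if fitness(cand) < fitness(best):
                best = cand
    return best

def hybrid_ga(pop_size=20, length=6, gens=50, tabu_len=30):
    pop = [[random.randint(0, 6) for _ in range(length)] for _ in range(pop_size)]
    tabu = []
    for _ in range(gens):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)
            child = a[:cut] + b[cut:]                        # one-point crossover
            child[random.randrange(length)] = random.randint(0, 6)  # mutation
            improved = local_search(child)                   # memetic refinement
            # Tabu strategy: avoid re-intensifying around recent solutions.
            child = improved if tuple(improved) not in tabu else child
            tabu.append(tuple(improved))
            tabu = tabu[-tabu_len:]
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

print(hybrid_ga())  # converges toward [3, 3, 3, 3, 3, 3]
```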

Unsupervised Clustering of Quantitative Imaging Phenotypes using Autoencoder and Gaussian Mixture Model

Title Unsupervised Clustering of Quantitative Imaging Phenotypes using Autoencoder and Gaussian Mixture Model
Authors Jianan Chen, Laurent Milot, Helen M. C. Cheung, Anne L. Martel
Abstract Quantitative medical image computing (radiomics) has been widely applied to build prediction models from medical images. However, overfitting is a significant issue in conventional radiomics, where a large number of radiomic features are used directly to train and test models that predict genotypes or clinical outcomes. To tackle this problem, we propose an unsupervised learning pipeline composed of an autoencoder for representation learning of radiomic features and a Gaussian mixture model based on the minimum message length criterion for clustering. By incorporating probabilistic modeling, disease heterogeneity is taken into account. The performance of the proposed pipeline was evaluated on an institutional MRI cohort of 108 patients with colorectal cancer liver metastases. Our approach automatically selects the optimal number of clusters and assigns patients to clusters (imaging subtypes) with significantly different survival rates. It outperforms other unsupervised clustering methods that have been used for radiomics analysis and performs comparably to a state-of-the-art imaging biomarker.
Tasks Representation Learning
Published 2019-09-06
URL https://arxiv.org/abs/1909.02953v1
PDF https://arxiv.org/pdf/1909.02953v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-clustering-of-quantitative
Repo
Framework
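
A minimal sketch of the pipeline: an autoencoder compresses the feature matrix, and a Gaussian mixture model is fitted on the latent codes with the number of components chosen by an information criterion. BIC is used below as a stand-in for the paper's minimum-message-length criterion, and synthetic two-group data replaces the MRI cohort.

```python
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the radiomic feature matrix (two latent groups).
X = torch.cat([torch.randn(54, 40), torch.randn(54, 40) + 2.0])

# Small autoencoder for representation learning of the features.
enc = nn.Sequential(nn.Linear(40, 16), nn.ReLU(), nn.Linear(16, 4))
dec = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 40))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(X)), X)  # reconstruction loss
    loss.backward()
    opt.step()

Z = enc(X).detach().numpy()

# Cluster the latent codes; BIC stands in for the paper's minimum message
# length criterion when selecting the number of mixture components.
best = min((GaussianMixture(k, random_state=0).fit(Z) for k in range(1, 7)),
           key=lambda g: g.bic(Z))
print(best.n_components, best.predict(Z)[:10])
```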

Anomaly Detection for an E-commerce Pricing System

Title Anomaly Detection for an E-commerce Pricing System
Authors Jagdish Ramakrishnan, Elham Shaabani, Chao Li, Mátyás A. Sustik
Abstract Online retailers execute a very large number of price updates when compared to brick-and-mortar stores. Even a few mis-priced items can have a significant business impact and result in a loss of customer trust. Early detection of anomalies in an automated real-time fashion is an important part of such a pricing system. In this paper, we describe unsupervised and supervised anomaly detection approaches we developed and deployed for a large-scale online pricing system at Walmart. Our system detects anomalies both in batch and real-time streaming settings, and the items flagged are reviewed and actioned based on priority and business impact. We found that having the right architecture design was critical to facilitate model performance at scale, and business impact and speed were important factors influencing model selection, parameter choice, and prioritization in a production environment for a large-scale system. We conducted analyses on the performance of various approaches on a test set using real-world retail data and fully deployed our approach into production. We found that our approach was able to detect the most important anomalies with high precision.
Tasks Anomaly Detection, Model Selection
Published 2019-02-25
URL https://arxiv.org/abs/1902.09566v5
PDF https://arxiv.org/pdf/1902.09566v5.pdf
PWC https://paperswithcode.com/paper/anomaly-detection-for-an-e-commerce-pricing
Repo
Framework
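
One simple unsupervised detector in the spirit of the entry above: flag items whose log price change is an outlier under a robust median/MAD z-score. The threshold and toy prices are illustrative; the deployed system combines several batch and streaming detectors.

```python
import numpy as np

def flag_price_anomalies(old_prices, new_prices, z_thresh=4.0):
    """Flag indices whose log price change is a robust-z-score outlier."""
    change = np.log(new_prices) - np.log(old_prices)
    med = np.median(change)
    mad = np.median(np.abs(change - med)) + 1e-9  # avoid division by zero
    z = 0.6745 * (change - med) / mad             # MAD-based robust z-score
    return np.where(np.abs(z) > z_thresh)[0]

old = np.array([10.0, 25.0, 8.0, 100.0, 5.0])
new = np.array([10.5, 24.0, 8.2, 1.00, 5.1])  # the $100 -> $1 item is suspect
print(flag_price_anomalies(old, new))          # -> [3]
```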

Brain Tissues Segmentation on MR Perfusion Images Using CUSUM Filter for Boundary Pixels

Title Brain Tissues Segmentation on MR Perfusion Images Using CUSUM Filter for Boundary Pixels
Authors S. M. Alkhimova, A. P. Krenevych
Abstract A fully automated and relatively accurate method for brain tissue segmentation on T2-weighted magnetic resonance perfusion images is proposed. Segmentation with this method makes it possible to obtain a perfusion region of interest on images with abnormal brain anatomy, which is very important for perfusion analysis. In the proposed method the result is presented as a binary mask that marks two regions: brain tissue pixels with unity values, and skull, extracranial soft tissue, and background pixels with zero values. The binary mask is produced based on the location of the boundary between the two regions. Each boundary point is detected with a CUSUM filter as a change point in the values accumulated iteratively while moving along a sinusoidal-like path across the boundary from one region to the other. Evaluation on 20 clinical cases showed that the proposed segmentation method can significantly reduce the time and effort required to obtain a perfusion region of interest on T2-weighted magnetic resonance perfusion images with abnormal brain anatomy.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03865v1
PDF https://arxiv.org/pdf/1907.03865v1.pdf
PWC https://paperswithcode.com/paper/brain-tissues-segmentation-on-mr-perfusion
Repo
Framework
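
A one-sided CUSUM change-point detector of the kind the method relies on can be sketched in a few lines: accumulate deviations above the running mean of the first region and report where the cumulative sum crosses a threshold. The drift and threshold values are illustrative assumptions.

```python
def cusum_change_point(values, drift=0.5, threshold=4.0):
    """One-sided CUSUM: accumulate deviations above the running mean and
    return the index where the cumulative sum exceeds the threshold,
    i.e. the boundary between two intensity regions."""
    mean, s = values[0], 0.0
    for i, v in enumerate(values[1:], start=1):
        s = max(0.0, s + (v - mean) - drift)
        if s > threshold:
            return i                   # change point: boundary pixel index
        mean += (v - mean) / (i + 1)   # running mean of the first region
    return None

# Pixel intensities along a path crossing from background into brain tissue.
profile = [10, 11, 9, 10, 10, 30, 32, 31, 33]
print(cusum_change_point(profile))  # -> detects the jump at index 5
```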