Paper Group ANR 1342
X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight Software Optimizations
Title | X-TrainCaps: Accelerated Training of Capsule Nets through Lightweight Software Optimizations |
Authors | Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Muhammad Abdullah Hanif, Maurizio Martina, Guido Masera, Muhammad Shafique |
Abstract | Convolutional Neural Networks (CNNs) are extensively in use due to their excellent results in various machine learning (ML) tasks like image classification and object detection. Recently, Capsule Networks (CapsNets) have shown improved performance compared to traditional CNNs, by encoding and preserving spatial relationships between the detected features in a better way. This is achieved through the so-called Capsules (i.e., groups of neurons) that encode both the instantiation probability and the spatial information. However, one of the major hurdles in the wide adoption of CapsNets is their gigantic training time, which is primarily due to the relatively higher complexity of their constituent elements. In this paper, we illustrate how we can devise new optimizations in the training process to achieve fast training of CapsNets, and whether such optimizations affect the network accuracy. Towards this, we propose a novel framework “X-TrainCaps” that employs lightweight software-level optimizations, including a novel learning rate policy called WarmAdaBatch that jointly performs warm restarts and adaptive batch size, as well as weight sharing for capsule layers to reduce the hardware requirements of CapsNets by removing unused/redundant connections and capsules, while keeping high accuracy through tests of different learning rate policies and batch sizes. We demonstrate that one of the solutions generated by the X-TrainCaps framework can achieve 58.6% training time reduction while preserving the accuracy (even 0.9% accuracy improvement), compared to the CapsNet in the original paper by Sabour et al. (2017), while other Pareto-optimal solutions can be leveraged to realize trade-offs between training time and achieved accuracy. |
Tasks | Image Classification, Object Detection |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10142v1 |
https://arxiv.org/pdf/1905.10142v1.pdf | |
PWC | https://paperswithcode.com/paper/x-traincaps-accelerated-training-of-capsule |
Repo | |
Framework | |
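The abstract names WarmAdaBatch as a policy that combines warm restarts with an adaptive batch size but does not give the schedule. The following is a minimal sketch of the general idea only, assuming cosine annealing with periodic restarts and a batch size that doubles at each restart. The function names, the doubling rule, and all constants are assumptions made for illustration, not the authors' method.

```python
import math

def warm_restart_lr(step, cycle_len, lr_max=0.01, lr_min=0.0001):
    """Cosine-annealed learning rate that resets to lr_max every cycle_len steps."""
    pos = (step % cycle_len) / cycle_len  # position within the current cycle, in [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * pos))

def adaptive_batch_size(step, cycle_len, base=16, cap=128):
    """Hypothetical rule: double the batch size at each restart, up to cap."""
    cycle = step // cycle_len
    return min(base * 2 ** cycle, cap)

# (learning rate, batch size) for each of 400 training steps, with restarts every 100 steps.
schedule = [(warm_restart_lr(s, 100), adaptive_batch_size(s, 100)) for s in range(400)]
```

Within each cycle the learning rate decays from `lr_max` toward `lr_min` and then jumps back up at the restart, while the batch size stays fixed inside a cycle and grows only at cycle boundaries.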
Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer
Title | Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer |
Authors | Daniel Tanneberg, Elmar Rueckert, Jan Peters |
Abstract | A key feature of intelligent behavior is the ability to learn abstract strategies that transfer to unfamiliar problems. Therefore, we present a novel architecture, based on memory-augmented networks, that is inspired by the von Neumann and Harvard architectures of modern computers. This architecture enables the learning of abstract algorithmic solutions via Evolution Strategies in a reinforcement learning setting. Applied to Sokoban, sliding block puzzle and robotic manipulation tasks, we show that the architecture can learn algorithmic solutions with strong generalization and abstraction: scaling to arbitrary task configurations and complexities, and being independent of both the data representation and the task domain. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1911.00926v1 |
https://arxiv.org/pdf/1911.00926v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-algorithmic-solutions-to-symbolic |
Repo | |
Framework | |
Identifying Patient Groups based on Frequent Patterns of Patient Samples
Title | Identifying Patient Groups based on Frequent Patterns of Patient Samples |
Authors | Seyed Amin Tabatabaei, Xixi Lu, Mark Hoogendoorn, Hajo A. Reijers |
Abstract | Grouping patients meaningfully can give insights about the different types of patients, their needs, and the priorities. Finding groups that are meaningful is however very challenging as background knowledge is often required to determine what a useful grouping is. In this paper we propose an approach that is able to find groups of patients based on a small sample of positive examples given by a domain expert. Because of that, the approach relies on very limited efforts by the domain experts. The approach groups based on the activities and diagnostic/billing codes within health pathways of patients. To define such a grouping based on the sample of patients efficiently, frequent patterns of activities are discovered and used to measure the similarity between the care pathways of other patients to the patients in the sample group. This approach results in an insightful definition of the group. The proposed approach is evaluated using several datasets obtained from a large university medical center. The evaluation shows F1-scores of around 0.7 for grouping kidney injury and around 0.6 for diabetes. |
Tasks | |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01863v1 |
http://arxiv.org/pdf/1904.01863v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-patient-groups-based-on-frequent |
Repo | |
Framework | |
Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training
Title | Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training |
Authors | Avinash Ravichandran, Rahul Bhotika, Stefano Soatto |
Abstract | We propose a method for learning embeddings for few-shot learning that is suitable for use with any number of ways and any number of shots (shot-free). Rather than fixing the class prototypes to be the Euclidean average of sample embeddings, we allow them to live in a higher-dimensional space (embedded class models) and learn the prototypes along with the model parameters. The class representation function is defined implicitly, which allows us to deal with a variable number of shots per class with a simple constant-size architecture. The class embedding encompasses metric learning, which facilitates adding new classes without crowding the class representation space. Despite being general and not tuned to the benchmark, our approach achieves state-of-the-art performance on the standard few-shot benchmark datasets. |
Tasks | Few-Shot Learning, Metric Learning |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04398v1 |
https://arxiv.org/pdf/1905.04398v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-with-embedded-class-models |
Repo | |
Framework | |
Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection
Title | Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection |
Authors | Tianshu Chu, Qin Luo, Jie Yang, Xiaolin Huang |
Abstract | Efficient model inference is an important and practical issue in the deployment of deep neural networks on resource-constrained platforms. Network quantization addresses this problem effectively by leveraging low-bit representation and arithmetic that could be conducted on dedicated embedded systems. In previous works, the parameter bitwidth is set homogeneously and there is a trade-off between superior performance and aggressive compression. In fact, the stacked network layers, which are generally regarded as hierarchical feature extractors, contribute diversely to the overall performance. For a well-trained neural network, the feature distributions of different categories differentiate gradually as the network propagates forward. Hence the capability requirement on the subsequent feature extractors is reduced. This indicates that the neurons in posterior layers could be assigned lower bitwidths in quantized neural networks. Based on this observation, a simple but effective mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression. Extensive experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results while reducing the memory space for quantized parameters by more than 30% in comparison with the homogeneous counterparts. In addition, the results also demonstrate that the higher-precision bottom layers could boost the 1-bit network performance appreciably due to a better preservation of the original image information while the lower-precision posterior layers contribute to the regularization of $k$-bit networks. |
Tasks | Image Classification, Object Detection, Quantization |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/1912.12656v1 |
https://arxiv.org/pdf/1912.12656v1.pdf | |
PWC | https://paperswithcode.com/paper/mixed-precision-quantized-neural-network-with |
Repo | |
Framework | |
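To make the memory argument concrete, the sketch below compares parameter storage under a homogeneous bitwidth with storage under a bitwidth that decreases toward later layers. The layer sizes and bit assignments are invented for illustration and are not taken from the paper, so the resulting saving should not be read as a reproduction of the reported figure.

```python
def param_bits(layer_params, bitwidths):
    """Total storage in bits when layer i stores each parameter with bitwidths[i] bits."""
    if len(layer_params) != len(bitwidths):
        raise ValueError("one bitwidth per layer is required")
    return sum(n * b for n, b in zip(layer_params, bitwidths))

# Hypothetical 4-layer network; parameter counts are illustrative only.
layer_params = [10_000, 50_000, 200_000, 400_000]

homogeneous = param_bits(layer_params, [4, 4, 4, 4])
decreasing = param_bits(layer_params, [4, 4, 3, 2])  # fewer bits in later layers

# Fraction of parameter memory saved relative to the homogeneous assignment.
saving = 1 - decreasing / homogeneous
```

Because later layers often hold most of the parameters, lowering their bitwidth dominates the total saving in this toy setup.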
Counting to Ten with Two Fingers: Compressed Counting with Spiking Neurons
Title | Counting to Ten with Two Fingers: Compressed Counting with Spiking Neurons |
Authors | Yael Hitron, Merav Parter |
Abstract | We consider the task of measuring time with probabilistic threshold gates implemented by bio-inspired spiking neurons. In the model of spiking neural networks, the network evolves in discrete rounds, where in each round, neurons fire in pulses in response to a sufficiently high membrane potential. This potential is induced by spikes from neighboring neurons that fired in the previous round, which can have either an excitatory or inhibitory effect. We first consider a deterministic implementation of a neural timer and show that $\Theta(\log t)$ (deterministic) threshold gates are both sufficient and necessary. This raised the question of whether randomness can be leveraged to reduce the number of neurons. We answer this question in the affirmative by considering neural timers with spiking neurons where the neuron $y$ is required to fire for $t$ consecutive rounds with probability at least $1-\delta$, and should stop firing after at most $2t$ rounds with probability $1-\delta$ for some input parameter $\delta \in (0,1)$. Our key result is a construction of a neural timer with $O(\log\log 1/\delta)$ spiking neurons. Interestingly, this construction uses only one spiking neuron, while the remaining neurons can be deterministic threshold gates. We complement this construction with a matching lower bound of $\Omega(\min\{\log\log 1/\delta, \log t\})$ neurons. This provides the first separation between deterministic and randomized constructions in the setting of spiking neural networks. Finally, we demonstrate the usefulness of compressed counting networks for synchronizing neural networks. |
Tasks | |
Published | 2019-02-27 |
URL | https://arxiv.org/abs/1902.10369v3 |
https://arxiv.org/pdf/1902.10369v3.pdf | |
PWC | https://paperswithcode.com/paper/counting-to-ten-with-two-fingers-compressed |
Repo | |
Framework | |
Adaptive particle-based approximations of the Gibbs posterior for inverse problems
Title | Adaptive particle-based approximations of the Gibbs posterior for inverse problems |
Authors | Zilong Zou, Sayan Mukherjee, Harbir Antil, Wilkins Aquino |
Abstract | In this work, we adopt a general framework based on the Gibbs posterior to update belief distributions for inverse problems governed by partial differential equations (PDEs). The Gibbs posterior formulation is a generalization of standard Bayesian inference that only relies on a loss function connecting the unknown parameters to the data. It is particularly useful when the true data generating mechanism (or noise distribution) is unknown or difficult to specify. The Gibbs posterior coincides with Bayesian updating when a true likelihood function is known and the loss function corresponds to the negative log-likelihood, yet provides subjective inference in more general settings. We employ a sequential Monte Carlo (SMC) approach to approximate the Gibbs posterior using particles. To manage the computational cost of propagating increasing numbers of particles through the loss function, we employ a recently developed local reduced basis method to build an efficient surrogate loss function that is used in the Gibbs update formula in place of the true loss. We derive error bounds for our approximation and propose an adaptive approach to construct the surrogate model in an efficient manner. We demonstrate the efficiency of our approach through several numerical examples. |
Tasks | Bayesian Inference |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01551v1 |
https://arxiv.org/pdf/1907.01551v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-particle-based-approximations-of-the |
Repo | |
Framework | |
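For reference, the Gibbs posterior update is commonly written in the form below. The notation is an assumption introduced here rather than taken from the paper: $\pi_0$ is the prior over parameters $\theta$, $\ell(\theta; y)$ is the loss connecting $\theta$ to the data $y$, and $w > 0$ is a scaling weight.

```latex
\pi(\theta \mid y)
  = \frac{\exp\bigl(-w\,\ell(\theta; y)\bigr)\,\pi_0(\theta)}
         {\int \exp\bigl(-w\,\ell(\theta'; y)\bigr)\,\pi_0(\theta')\,\mathrm{d}\theta'}
```

Setting $w = 1$ and $\ell(\theta; y) = -\log p(y \mid \theta)$ turns the numerator into $p(y \mid \theta)\,\pi_0(\theta)$, which is the standard Bayesian update described in the abstract.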
Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds
Title | Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds |
Authors | Masafumi Yamazaki, Akihiko Kasagi, Akihiro Tabuchi, Takumi Honda, Masahiro Miwa, Naoto Fukumoto, Tsuguchika Tabaru, Atsushi Ike, Kohta Nakashima |
Abstract | There has been a strong demand for algorithms that can execute machine learning as fast as possible, and the speed of deep learning has accelerated 30-fold in the past two years alone. Distributed deep learning using large mini-batches is a key technology to address the demand and is a great challenge as it is difficult to achieve high scalability on large clusters without compromising accuracy. In this paper, we introduce optimization methods which we applied to this challenge. We achieved a training time of 74.7 seconds using 2,048 GPUs on the ABCI cluster by applying these methods. The training throughput is over 1.73 million images/sec and the top-1 validation accuracy is 75.08%. |
Tasks | |
Published | 2019-03-29 |
URL | http://arxiv.org/abs/1903.12650v1 |
http://arxiv.org/pdf/1903.12650v1.pdf | |
PWC | https://paperswithcode.com/paper/yet-another-accelerated-sgd-resnet-50 |
Repo | |
Framework | |
Web Based Brain Volume Calculation for Magnetic Resonance Images
Title | Web Based Brain Volume Calculation for Magnetic Resonance Images |
Authors | Kevin Karsch, Brian Grinstead, Qing He, Ye Duan |
Abstract | Brain volume calculations are crucial in modern medical research, especially in the study of neurodevelopmental disorders. In this paper, we present an algorithm for calculating two classifications of brain volume, total brain volume (TBV) and intracranial volume (ICV). Our algorithm takes MRI data as input, performs several preprocessing and intermediate steps, and then returns each of the two calculated volumes. To simplify this process and make our algorithm publicly accessible to anyone, we have created a web-based interface that allows users to upload their own MRI data and calculate the TBV and ICV for the given data. This interface provides a simple and efficient method for calculating these two classifications of brain volume, and it also removes the need for the user to download or install any applications. |
Tasks | |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09977v1 |
http://arxiv.org/pdf/1904.09977v1.pdf | |
PWC | https://paperswithcode.com/paper/web-based-brain-volume-calculation-for |
Repo | |
Framework | |
Strong Equivalence and Program’s Structure in Arguing Essential Equivalence between Logic Programs
Title | Strong Equivalence and Program’s Structure in Arguing Essential Equivalence between Logic Programs |
Authors | Yuliya Lierler |
Abstract | Answer set programming is a prominent declarative programming paradigm used in formulating combinatorial search problems and implementing distinct knowledge representation formalisms. It is common that several related and yet substantially different answer set programs exist for a given problem. Sometimes these encodings may display significantly different performance. Uncovering precise formal links between these programs is often important and yet far from trivial. This paper claims the correctness of a number of interesting program rewritings. |
Tasks | |
Published | 2019-01-26 |
URL | https://arxiv.org/abs/1901.09127v2 |
https://arxiv.org/pdf/1901.09127v2.pdf | |
PWC | https://paperswithcode.com/paper/strong-equivalence-and-programs-structure-in |
Repo | |
Framework | |
Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression
Title | Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression |
Authors | Tong Niu, Caiming Xiong, Richard Socher |
Abstract | Text compression has diverse applications such as Summarization, Reading Comprehension and Text Editing. However, almost all existing approaches require either hand-crafted features, syntactic labels or parallel data. Even for one that achieves this task in an unsupervised setting, its architecture necessitates a task-specific autoencoder. Moreover, these models only generate one compressed sentence for each source input, so that adapting to different style requirements (e.g. length) for the final output usually implies retraining the model from scratch. In this work, we propose a fully unsupervised model, Deleter, that is able to discover an “optimal deletion path” for an arbitrary sentence, where each intermediate sequence along the path is a coherent subsequence of the previous one. This approach relies exclusively on a pretrained bidirectional language model (BERT) to score each candidate deletion based on the average Perplexity of the resulting sentence and performs progressive greedy lookahead search to select the best deletion for each step. We apply Deleter to the task of extractive Sentence Compression, and find that our model is competitive with state-of-the-art supervised models trained on 1.02 million in-domain examples with similar compression ratio. Qualitative analysis, as well as automatic and human evaluations, verify that our model produces high-quality compression. |
Tasks | Language Modelling, Reading Comprehension, Sentence Compression |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03223v1 |
https://arxiv.org/pdf/1909.03223v1.pdf | |
PWC | https://paperswithcode.com/paper/deleter-leveraging-bert-to-perform |
Repo | |
Framework | |
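The idea of a deletion path can be illustrated with a plain greedy loop: at each step, try removing every single token and keep the candidate with the best score. This sketch deliberately swaps in a caller-supplied scoring function for the BERT-based perplexity score and omits the lookahead search, so it shows only the shape of the procedure. `greedy_deletion_path` and `toy_score` are hypothetical names, and the toy scorer is not a language model.

```python
from typing import Callable, List

def greedy_deletion_path(tokens: List[str],
                         score: Callable[[List[str]], float],
                         min_len: int = 1) -> List[List[str]]:
    """Return successive sentences, each obtained by deleting one token from the previous.

    Lower score is better (as with perplexity). No lookahead is performed.
    """
    path = [list(tokens)]
    current = list(tokens)
    while len(current) > min_len:
        # Every candidate removes exactly one token from the current sentence.
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        current = min(candidates, key=score)
        path.append(current)
    return path

# Hypothetical stand-in scorer: counts filler words. It is NOT a language model.
FILLER = {"very", "really", "quite"}

def toy_score(sentence: List[str]) -> float:
    return sum(1.0 for w in sentence if w in FILLER)

path = greedy_deletion_path("the very old cat really slept".split(), toy_score, min_len=4)
```

Each entry in `path` is a subsequence of the one before it, so a caller can pick whichever intermediate sentence meets a target length without rerunning the search.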
A new hybrid genetic algorithm for protein structure prediction on the 2D triangular lattice
Title | A new hybrid genetic algorithm for protein structure prediction on the 2D triangular lattice |
Authors | Nabil Boumedine, Sadek Bouroubi |
Abstract | The flawless functioning of a protein is essentially linked to its own three-dimensional structure. Therefore, the prediction of a protein structure from its amino acid sequence is a fundamental problem in many fields that draws researchers' attention. This problem can be formulated as a combinatorial optimization problem based on simplified lattice models such as the hydrophobic-polar model. In this paper, we propose a new hybrid algorithm combining three different well-known heuristic algorithms: genetic algorithm, tabu search strategy and local search algorithm in order to solve the PSP problem. To assess the suggested algorithm, an experimental study is included, where we considered the quality of the produced solution as the main quality criterion. Furthermore, we compared the suggested algorithm with state-of-the-art algorithms using a selection of well-studied benchmark instances. |
Tasks | Combinatorial Optimization |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.04190v1 |
https://arxiv.org/pdf/1907.04190v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-hybrid-genetic-algorithm-for-protein |
Repo | |
Framework | |
Unsupervised Clustering of Quantitative Imaging Phenotypes using Autoencoder and Gaussian Mixture Model
Title | Unsupervised Clustering of Quantitative Imaging Phenotypes using Autoencoder and Gaussian Mixture Model |
Authors | Jianan Chen, Laurent Milot, Helen M. C. Cheung, Anne L. Martel |
Abstract | Quantitative medical image computing (radiomics) has been widely applied to build prediction models from medical images. However, overfitting is a significant issue in conventional radiomics, where a large number of radiomic features are directly used to train and test models that predict genotypes or clinical outcomes. In order to tackle this problem, we propose an unsupervised learning pipeline composed of an autoencoder for representation learning of radiomic features and a Gaussian mixture model based on the minimum message length criterion for clustering. By incorporating probabilistic modeling, disease heterogeneity has been taken into account. The performance of the proposed pipeline was evaluated on an institutional MRI cohort of 108 patients with colorectal cancer liver metastases. Our approach is capable of automatically selecting the optimal number of clusters and assigning patients to clusters (imaging subtypes) with significantly different survival rates. Our method outperforms other unsupervised clustering methods that have been used for radiomics analysis and has comparable performance to a state-of-the-art imaging biomarker. |
Tasks | Representation Learning |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02953v1 |
https://arxiv.org/pdf/1909.02953v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-clustering-of-quantitative |
Repo | |
Framework | |
Anomaly Detection for an E-commerce Pricing System
Title | Anomaly Detection for an E-commerce Pricing System |
Authors | Jagdish Ramakrishnan, Elham Shaabani, Chao Li, Mátyás A. Sustik |
Abstract | Online retailers execute a very large number of price updates when compared to brick-and-mortar stores. Even a few mis-priced items can have a significant business impact and result in a loss of customer trust. Early detection of anomalies in an automated real-time fashion is an important part of such a pricing system. In this paper, we describe unsupervised and supervised anomaly detection approaches we developed and deployed for a large-scale online pricing system at Walmart. Our system detects anomalies both in batch and real-time streaming settings, and the items flagged are reviewed and actioned based on priority and business impact. We found that having the right architecture design was critical to facilitate model performance at scale, and business impact and speed were important factors influencing model selection, parameter choice, and prioritization in a production environment for a large-scale system. We conducted analyses on the performance of various approaches on a test set using real-world retail data and fully deployed our approach into production. We found that our approach was able to detect the most important anomalies with high precision. |
Tasks | Anomaly Detection, Model Selection |
Published | 2019-02-25 |
URL | https://arxiv.org/abs/1902.09566v5 |
https://arxiv.org/pdf/1902.09566v5.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-for-an-e-commerce-pricing |
Repo | |
Framework | |
Brain Tissues Segmentation on MR Perfusion Images Using CUSUM Filter for Boundary Pixels
Title | Brain Tissues Segmentation on MR Perfusion Images Using CUSUM Filter for Boundary Pixels |
Authors | S. M. Alkhimova, A. P. Krenevych |
Abstract | A fully automated and relatively accurate method for brain tissue segmentation on T2-weighted magnetic resonance perfusion images is proposed. Segmentation with this method makes it possible to obtain the perfusion region of interest on images with abnormal brain anatomy, which is very important for perfusion analysis. In the proposed method the result is presented as a binary mask, which marks two regions: brain tissue pixels with unity values and skull, extracranial soft tissue and background pixels with zero values. The binary mask is produced based on the location of the boundary between the two studied regions. Each boundary point is detected with a CUSUM filter as a change point for iteratively accumulated points while moving on a sinusoidal-like path along the boundary from one region to another. The evaluation results for 20 clinical cases showed that the proposed segmentation method could significantly reduce the time and effort required to obtain desirable results for perfusion region of interest detection on T2-weighted magnetic resonance perfusion images with abnormal brain anatomy. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03865v1 |
https://arxiv.org/pdf/1907.03865v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-tissues-segmentation-on-mr-perfusion |
Repo | |
Framework | |
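As a rough illustration of change-point detection with CUSUM, the sketch below runs a one-sided (upward) cumulative-sum detector over a 1D sequence, such as intensities sampled along a path that crosses a boundary. It does not reproduce the paper's filter or its sinusoidal-like path. The function name, the reference mean, the drift, and the threshold are all assumptions chosen for the example.

```python
def cusum_change_point(values, target_mean, drift=0.5, threshold=5.0):
    """Return the first index where the upward CUSUM statistic exceeds threshold, or None."""
    s = 0.0
    for i, x in enumerate(values):
        # Accumulate deviations above target_mean + drift; the statistic is clamped at zero.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None

# Illustrative sequence: near-zero background followed by a sustained jump.
signal = [0.1, -0.2, 0.0, 0.3, 4.0, 4.2, 3.9, 4.1]
idx = cusum_change_point(signal, target_mean=0.0)
```

The drift term keeps small fluctuations around the reference mean from accumulating, so the detector fires only after a sustained shift; a larger threshold trades detection delay for fewer false alarms.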