Paper Group ANR 113
CardiacNET: Segmentation of Left Atrium and Proximal Pulmonary Veins from MRI Using Multi-View CNN. Why Adaptively Collected Data Have Negative Bias and How to Correct for It. A Summary Of The Kernel Matrix, And How To Learn It Effectively Using Semidefinite Programming. FastVentricle: Cardiac Segmentation with ENet. STDP Based Pruning of Connectio …
CardiacNET: Segmentation of Left Atrium and Proximal Pulmonary Veins from MRI Using Multi-View CNN
Title | CardiacNET: Segmentation of Left Atrium and Proximal Pulmonary Veins from MRI Using Multi-View CNN |
Authors | Aliasghar Mortazi, Rashed Karim, Kawal Rhode, Jeremy Burt, Ulas Bagci |
Abstract | Anatomical and biophysical modeling of left atrium (LA) and proximal pulmonary veins (PPVs) is important for clinical management of several cardiac diseases. Magnetic resonance imaging (MRI) allows qualitative assessment of LA and PPVs through visualization. However, there is a strong need for an advanced image segmentation method to be applied to cardiac MRI for quantitative analysis of LA and PPVs. In this study, we address this unmet clinical need by exploring a new deep learning-based segmentation strategy for quantification of LA and PPVs with high accuracy and heightened efficiency. Our approach is based on a multi-view convolutional neural network (CNN) with an adaptive fusion strategy and a new loss function that allows fast and more accurate convergence of the backpropagation-based optimization. After training our network from scratch using more than 60K 2D MRI images (slices), we evaluated our segmentation strategy on the STACOM 2013 cardiac segmentation challenge benchmark. Qualitative and quantitative evaluations, obtained from the segmentation challenge, indicate that the proposed method achieved state-of-the-art sensitivity (90%), specificity (99%), precision (94%), and efficiency levels (10 seconds on GPU and 7.5 minutes on CPU). |
Tasks | Cardiac Segmentation, Semantic Segmentation |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06333v2 |
http://arxiv.org/pdf/1705.06333v2.pdf | |
PWC | https://paperswithcode.com/paper/cardiacnet-segmentation-of-left-atrium-and |
Repo | |
Framework | |
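To make the multi-view idea concrete, below is a minimal PyTorch sketch of fusing per-view segmentation maps with learnable, softmax-normalized weights. The tiny branch network, the fusion parameterization, and all sizes are illustrative assumptions; they do not reproduce CardiacNET's actual architecture, adaptive fusion strategy, or loss function.

```python
# A hedged multi-view fusion sketch: three 2D CNN branches (e.g. axial/sagittal/coronal
# reslices) each predict a foreground probability map, and the maps are combined with
# learned, softmax-normalized weights.
import torch
import torch.nn as nn

class TinyViewBranch(nn.Module):
    """Deliberately small 2D CNN emitting a per-pixel foreground logit."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
        )
    def forward(self, x):
        return self.net(x)

class MultiViewFusion(nn.Module):
    """Three view branches fused by softmax-normalized scalar weights (assumption)."""
    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList([TinyViewBranch() for _ in range(3)])
        self.fusion_logits = nn.Parameter(torch.zeros(3))
    def forward(self, views):  # views: list of three (B, 1, H, W) tensors
        w = torch.softmax(self.fusion_logits, dim=0)
        probs = [torch.sigmoid(b(v)) for b, v in zip(self.branches, views)]
        return sum(wi * p for wi, p in zip(w, probs))

model = MultiViewFusion()
views = [torch.randn(2, 1, 64, 64) for _ in range(3)]
print(model(views).shape)  # torch.Size([2, 1, 64, 64])
```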
Why Adaptively Collected Data Have Negative Bias and How to Correct for It
Title | Why Adaptively Collected Data Have Negative Bias and How to Correct for It |
Authors | Xinkun Nie, Xiaoying Tian, Jonathan Taylor, James Zou |
Abstract | From scientific experiments to online A/B testing, the previously observed data often affects how future experiments are performed, which in turn affects which data will be collected. Such adaptivity introduces complex correlations between the data and the collection procedure. In this paper, we prove that when the data collection procedure satisfies natural conditions, then sample means of the data have systematic *negative* biases. As an example, consider an adaptive clinical trial where additional data points are more likely to be tested for treatments that show initial promise. Our surprising result implies that the average observed treatment effects would underestimate the true effects of each treatment. We quantitatively analyze the magnitude and behavior of this negative bias in a variety of settings. We also propose a novel debiasing algorithm based on selective inference techniques. In experiments, our method can effectively reduce bias and estimation error. |
Tasks | |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.01977v2 |
http://arxiv.org/pdf/1708.01977v2.pdf | |
PWC | https://paperswithcode.com/paper/why-adaptively-collected-data-have-negative |
Repo | |
Framework | |
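The core claim is easy to reproduce in a toy simulation. The sketch below (not from the paper) pulls from two arms with identical true means using a greedy adaptive rule and shows that both arms' sample means come out biased below the truth; the greedy rule and all constants are assumptions for illustration only.

```python
# Simulation of adaptive data collection: the arm that currently looks best gets sampled
# more, so unluckily-low early estimates are never corrected while lucky-high estimates
# regress down -- both sample means end up negatively biased.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.0, 0.0])
runs, horizon = 20000, 50
bias_accum = np.zeros(2)

for _ in range(runs):
    counts = np.array([1, 1])
    sums = rng.normal(true_means, 1.0)           # one initial pull per arm
    for _ in range(horizon - 2):
        arm = int(np.argmax(sums / counts))       # greedy adaptive collection rule
        sums[arm] += rng.normal(true_means[arm], 1.0)
        counts[arm] += 1
    bias_accum += sums / counts - true_means

print("average bias per arm:", bias_accum / runs)  # both entries come out negative
```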
A Summary Of The Kernel Matrix, And How To Learn It Effectively Using Semidefinite Programming
Title | A Summary Of The Kernel Matrix, And How To Learn It Effectively Using Semidefinite Programming |
Authors | Amir-Hossein Karimi |
Abstract | Kernel-based learning algorithms are widely used in machine learning for problems that make use of the similarity between object pairs. Such algorithms first embed all data points into an alternative space, where the inner product between object pairs specifies their distance in the embedding space. Applying kernel methods to partially labeled datasets is a classical challenge in this regard, requiring that the distances between unlabeled pairs must somehow be learnt using the labeled data. In this independent study, I summarize G. Lanckriet et al.'s work on “Learning the Kernel Matrix with Semidefinite Programming”, as used in support vector machine (SVM) algorithms for the transduction problem. Throughout the report, I provide alternative explanations, derivations, and analyses related to this work, which are designed to ease the understanding of the original article. |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06557v1 |
http://arxiv.org/pdf/1709.06557v1.pdf | |
PWC | https://paperswithcode.com/paper/a-summary-of-the-kernel-matrix-and-how-to |
Repo | |
Framework | |
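As a rough illustration of learning a kernel combination by convex optimization, the sketch below (assuming cvxpy is installed) maximizes alignment between a nonnegative combination of two base kernels and the labels on the labeled block, under a trace constraint. This is a simplified surrogate for one of the formulations in Lanckriet et al.; the base kernels, the trace budget, and the toy data are made up.

```python
# Learn nonnegative combination weights for base kernels by maximizing label alignment
# on the labeled block.  A toy convex surrogate, not the paper's full SDP.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=30))      # labels (only the first 20 are "seen")

# Two base kernels over all (labeled + unlabeled) points: linear and RBF.
K_lin = X @ X.T
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_rbf = np.exp(-sq / 5.0)
kernels = [K_lin, K_rbf]

n_lab = 20
yyT = np.outer(y[:n_lab], y[:n_lab])

mu = cp.Variable(len(kernels), nonneg=True)
K_lab = sum(mu[i] * kernels[i][:n_lab, :n_lab] for i in range(len(kernels)))
alignment = cp.sum(cp.multiply(K_lab, yyT))            # <K_labeled, y y^T>
trace = sum(mu[i] * float(np.trace(kernels[i])) for i in range(len(kernels)))
cp.Problem(cp.Maximize(alignment), [trace == 30.0]).solve()
print("learned kernel weights:", mu.value)
```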
FastVentricle: Cardiac Segmentation with ENet
Title | FastVentricle: Cardiac Segmentation with ENet |
Authors | Jesse Lieman-Sifry, Matthieu Le, Felix Lau, Sean Sall, Daniel Golden |
Abstract | Cardiac Magnetic Resonance (CMR) imaging is commonly used to assess cardiac structure and function. One disadvantage of CMR is that post-processing of exams is tedious. Without automation, precise assessment of cardiac function via CMR typically requires an annotator to spend tens of minutes per case manually contouring ventricular structures. Automatic contouring can lower the required time per patient by generating contour suggestions that can be lightly modified by the annotator. Fully convolutional networks (FCNs), a variant of convolutional neural networks, have been used to rapidly advance the state-of-the-art in automated segmentation, which makes FCNs a natural choice for ventricular segmentation. However, FCNs are limited by their computational cost, which increases the monetary cost and degrades the user experience of production systems. To combat this shortcoming, we have developed the FastVentricle architecture, an FCN architecture for ventricular segmentation based on the recently developed ENet architecture. FastVentricle is 4x faster and runs with 6x less memory than the previous state-of-the-art ventricular segmentation architecture while still maintaining excellent clinical accuracy. |
Tasks | Cardiac Segmentation |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.04296v1 |
http://arxiv.org/pdf/1704.04296v1.pdf | |
PWC | https://paperswithcode.com/paper/fastventricle-cardiac-segmentation-with-enet |
Repo | |
Framework | |
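FastVentricle inherits ENet's cheap early downsampling. Below is a hedged PyTorch sketch of an ENet-style initial block (a strided convolution concatenated with a max-pooled copy of the input); the channel counts and single-channel CMR input are illustrative, and this is not FastVentricle's full architecture.

```python
# ENet-style initial block: a learned strided path and a pooled path are concatenated,
# so the first downsampling stage stays very cheap.
import torch
import torch.nn as nn

class InitialBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=16):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch - in_ch, kernel_size=3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.PReLU()
    def forward(self, x):
        out = torch.cat([self.conv(x), self.pool(x)], dim=1)  # learned + pooled paths
        return self.act(self.bn(out))

x = torch.randn(1, 1, 256, 256)        # a single short-axis CMR slice (toy input)
print(InitialBlock()(x).shape)         # torch.Size([1, 16, 128, 128])
```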
STDP Based Pruning of Connections and Weight Quantization in Spiking Neural Networks for Energy Efficient Recognition
Title | STDP Based Pruning of Connections and Weight Quantization in Spiking Neural Networks for Energy Efficient Recognition |
Authors | Nitin Rathi, Priyadarshini Panda, Kaushik Roy |
Abstract | Spiking Neural Networks (SNNs) with a large number of weights and varied weight distribution can be difficult to implement in emerging in-memory computing hardware due to the limitations on crossbar size (implementing dot product), the constrained number of conductance levels in non-CMOS devices and the power budget. We present a sparse SNN topology where non-critical connections are pruned to reduce the network size and the remaining critical synapses are weight quantized to accommodate the limited conductance levels. Pruning is based on the power law weight-dependent Spike Timing Dependent Plasticity (STDP) model; synapses between pre- and post-neuron with high spike correlation are retained, whereas synapses with low correlation or uncorrelated spiking activity are pruned. The weights of the retained connections are quantized to the available number of conductance levels. The process of pruning non-critical connections and quantizing the weights of critical synapses is performed at regular intervals during training. We evaluated our sparse and quantized network on the MNIST dataset and on a subset of images from the Caltech-101 dataset. The compressed topology achieved a classification accuracy of 90.1% (91.6%) on the MNIST (Caltech-101) dataset with 3.1x (2.2x) and 4x (2.6x) improvement in energy and area, respectively. The compressed topology is energy and area efficient while maintaining the same classification accuracy of a 2-layer fully connected SNN topology. |
Tasks | Quantization |
Published | 2017-10-12 |
URL | http://arxiv.org/abs/1710.04734v1 |
http://arxiv.org/pdf/1710.04734v1.pdf | |
PWC | https://paperswithcode.com/paper/stdp-based-pruning-of-connections-and-weight |
Repo | |
Framework | |
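The two operations interleaved with training, pruning by spike correlation and snapping weights to a few conductance levels, can be sketched in NumPy as below. The random correlation scores, the threshold, and the uniform quantizer are stand-ins for the paper's power-law STDP model and its level assignment.

```python
# Prune synapses whose pre/post spike correlation is low, then quantize the surviving
# weights onto a small number of conductance levels.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 1.0, size=(784, 100))          # input -> excitatory synapses (toy)
correlation = rng.uniform(0.0, 1.0, size=weights.shape)   # proxy for STDP spike correlation

def prune_and_quantize(w, corr, corr_threshold=0.3, n_levels=8):
    mask = corr >= corr_threshold                          # keep only well-correlated synapses
    w = np.where(mask, w, 0.0)
    levels = np.linspace(w.min(), w.max(), n_levels)       # available conductance levels
    idx = np.abs(w[..., None] - levels).argmin(axis=-1)    # nearest-level assignment
    return np.where(mask, levels[idx], 0.0), mask

w_q, mask = prune_and_quantize(weights, correlation)
print("sparsity:", 1 - mask.mean(), "distinct levels:", np.unique(w_q).size)
```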
Near Optimal Hamiltonian-Control and Learning via Chattering
Title | Near Optimal Hamiltonian-Control and Learning via Chattering |
Authors | Peeyush Kumar, Wolf Kohn, Zelda B. Zabinsky |
Abstract | Many applications require solving non-linear control problems that are classically not well behaved. This paper develops a simple and efficient chattering algorithm that learns near optimal decision policies through an open-loop feedback strategy. The optimal control problem reduces to a series of linear optimization programs that can be easily solved to recover a relaxed optimal trajectory. This algorithm is implemented on a real-time enterprise scheduling and control process. |
Tasks | |
Published | 2017-03-19 |
URL | http://arxiv.org/abs/1703.06485v1 |
http://arxiv.org/pdf/1703.06485v1.pdf | |
PWC | https://paperswithcode.com/paper/near-optimal-hamiltonian-control-and-learning |
Repo | |
Framework | |
Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder
Title | Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder |
Authors | Luming Tang, Yexiang Xue, Di Chen, Carla P. Gomes |
Abstract | Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities. The availability of rich contextual information requires a nimble learning scheme that tightly integrates with deep neural networks and has the ability to capture correlation structures among exponentially many outcomes. We propose MEDL_CVAE, which encodes a conditional multivariate distribution as a generating process. As a result, the variational lower bound of the joint likelihood can be optimized via a conditional variational auto-encoder and trained end-to-end on GPUs. Our MEDL_CVAE was motivated by two real-world applications in computational sustainability: one studies the spatial correlation among multiple bird species using the eBird data and the other models multi-dimensional landscape composition and human footprint in the Amazon rainforest with satellite images. We show that MEDL_CVAE captures rich dependency structures, scales better than previous methods, and further improves the joint likelihood by taking advantage of very large datasets that are beyond the capacity of previous methods. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05612v1 |
http://arxiv.org/pdf/1709.05612v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-entity-dependence-learning-with-rich |
Repo | |
Framework | |
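A minimal PyTorch sketch of the conditional-VAE objective described above: encode the multi-entity outcome together with its context, decode it from the latent code and context, and minimize the reconstruction-plus-KL bound. Layer sizes, the Bernoulli likelihood, and the toy data are assumptions, not the paper's exact model.

```python
# Conditional VAE over a multivariate binary outcome y given context x, trained on the
# negative evidence lower bound (reconstruction + KL).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    def __init__(self, y_dim=50, x_dim=128, z_dim=16, h=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(y_dim + x_dim, h), nn.ReLU(), nn.Linear(h, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim + x_dim, h), nn.ReLU(), nn.Linear(h, y_dim))
    def forward(self, y, x):
        mu, logvar = self.enc(torch.cat([y, x], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization
        logits = self.dec(torch.cat([z, x], dim=-1))
        recon = F.binary_cross_entropy_with_logits(logits, y, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl                                           # negative ELBO

model = CVAE()
y = torch.randint(0, 2, (32, 50)).float()   # e.g. presence/absence of 50 species
x = torch.randn(32, 128)                    # rich context features per site (toy)
print(model(y, x).item())
```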
Spatio-temporal Learning with Arrays of Analog Nanosynapses
Title | Spatio-temporal Learning with Arrays of Analog Nanosynapses |
Authors | Christopher H. Bennett, Damien Querlioz, Jacques-Olivier Klein |
Abstract | Emerging nanodevices such as resistive memories are being considered for hardware realizations of a variety of artificial neural networks (ANNs), including highly promising online variants of the learning approaches known as reservoir computing (RC) and the extreme learning machine (ELM). We propose an RC/ELM inspired learning system built with nanosynapses that performs both on-chip projection and regression operations. To address time-dynamic tasks, the hidden neurons of our system perform spatio-temporal integration and can be further enhanced with variable sampling or multiple activation windows. We detail the system and show its use in conjunction with a highly analog nanosynapse device on a standard task with intrinsic timing dynamics: the TI-46 battery of spoken digits. The system achieves nearly perfect (99%) accuracy at sufficient hidden layer size, which compares favorably with software results. In addition, the model is extended to a larger dataset, the MNIST database of handwritten digits. By translating the database into the time domain and using variable integration windows, up to 95% classification accuracy is achieved. In addition to an intrinsically low-power programming style, the proposed architecture learns very quickly and can easily be converted into a spiking system with negligible loss in performance, all features that confer significant energy efficiency. |
Tasks | |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03849v1 |
http://arxiv.org/pdf/1709.03849v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-learning-with-arrays-of |
Repo | |
Framework | |
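The ELM-flavored learning scheme (fixed random projection, trained linear readout) can be sketched in a few lines of NumPy, as below. The tanh hidden neurons, ridge readout, and toy task stand in for the on-chip spatio-temporal integration and nanosynapse regression layer.

```python
# Extreme-learning-machine sketch: a fixed random hidden projection followed by a
# ridge-regression readout (the only part that is actually trained).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))                  # e.g. time-integrated input features (toy)
y = (X[:, :2].sum(axis=1) > 0).astype(float)     # toy binary target

n_hidden = 300
W_in = rng.normal(size=(64, n_hidden))           # fixed random "projection" synapses
H = np.tanh(X @ W_in)                            # hidden-neuron activations

lam = 1e-2                                       # ridge regularization
W_out = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
acc = (((H @ W_out) > 0.5) == y).mean()
print("training accuracy:", acc)
```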
CERN: Confidence-Energy Recurrent Network for Group Activity Recognition
Title | CERN: Confidence-Energy Recurrent Network for Group Activity Recognition |
Authors | Tianmin Shu, Sinisa Todorovic, Song-Chun Zhu |
Abstract | This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities. The recognition is realized using a two-level hierarchy of Long Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture, which can be trained end-to-end. In comparison with existing LSTM architectures, we make two key contributions, which give our approach its name: Confidence-Energy Recurrent Network (CERN). First, instead of using the common softmax layer for prediction, we specify a novel energy layer (EL) for estimating the energy of our predictions. Second, rather than finding the common minimum-energy class assignment, which may be numerically unstable under uncertainty, we specify that the EL additionally computes the p-values of the solutions, and in this way estimates the most confident energy minimum. The evaluation on the Collective Activity and Volleyball datasets demonstrates: (i) advantages of our two contributions relative to the common softmax and energy-minimization formulations and (ii) a superior performance relative to the state-of-the-art approaches. |
Tasks | Activity Recognition, Group Activity Recognition |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03058v1 |
http://arxiv.org/pdf/1704.03058v1.pdf | |
PWC | https://paperswithcode.com/paper/cern-confidence-energy-recurrent-network-for |
Repo | |
Framework | |
Semi-Supervised Learning with Competitive Infection Models
Title | Semi-Supervised Learning with Competitive Infection Models |
Authors | Nir Rosenfeld, Amir Globerson |
Abstract | The goal in semi-supervised learning is to effectively combine labeled and unlabeled data. One way to do this is by encouraging smoothness across edges in a graph whose nodes correspond to input examples. In many graph-based methods, labels can be thought of as propagating over the graph, where the underlying propagation mechanism is based on random walks or on averaging dynamics. While theoretically elegant, these dynamics suffer from several drawbacks which can hurt predictive performance. Our goal in this work is to explore alternative mechanisms for propagating labels. In particular, we propose a method based on dynamic infection processes, where unlabeled nodes can be “infected” with the label of their already infected neighbors. Our algorithm is efficient and scalable, and an analysis of the underlying optimization objective reveals a surprising relation to other Laplacian approaches. We conclude with a thorough set of experiments across multiple benchmarks and various learning settings. |
Tasks | |
Published | 2017-03-19 |
URL | http://arxiv.org/abs/1703.06426v4 |
http://arxiv.org/pdf/1703.06426v4.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-with-competitive |
Repo | |
Framework | |
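The competitive-infection mechanism can be illustrated with a single stochastic cascade: labeled seeds spread their labels along edges with random delays, and each unlabeled node adopts whichever label arrives first. The exponential delays and the toy graph below are assumptions; the paper's algorithm and analysis are considerably richer.

```python
# One competitive-infection cascade on a small graph: a Dijkstra-style sweep where
# arrival time is the infection time and the earliest-arriving seed wins each node.
import heapq
import numpy as np

rng = np.random.default_rng(0)

def infect(n_nodes, edges, seeds):
    """edges: list of (u, v) pairs; seeds: dict node -> label."""
    adj = {u: [] for u in range(n_nodes)}
    for u, v in edges:
        delay = rng.exponential(1.0)              # random transmission time per edge
        adj[u].append((v, delay))
        adj[v].append((u, delay))
    labels, heap = {}, [(0.0, node, lab) for node, lab in seeds.items()]
    heapq.heapify(heap)
    while heap:
        t, node, lab = heapq.heappop(heap)
        if node in labels:
            continue                              # already infected by an earlier arrival
        labels[node] = lab
        for nxt, delay in adj[node]:
            if nxt not in labels:
                heapq.heappush(heap, (t + delay, nxt, lab))
    return labels

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
print(infect(6, edges, {0: "A", 3: "B"}))         # every node ends up with label "A" or "B"
```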
PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference
Title | PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference |
Authors | Jonathan H. Huggins, Ryan P. Adams, Tamara Broderick |
Abstract | Generalized linear models (GLMs) – such as logistic regression, Poisson regression, and robust regression – provide interpretable models for diverse data types. Probabilistic approaches, particularly Bayesian ones, allow coherent estimates of uncertainty, incorporation of prior information, and sharing of power across experiments via hierarchical models. In practice, however, the approximate Bayesian methods necessary for inference have either failed to scale to large data sets or failed to provide theoretical guarantees on the quality of inference. We propose a new approach based on constructing polynomial approximate sufficient statistics for GLMs (PASS-GLM). We demonstrate that our method admits a simple algorithm as well as trivial streaming and distributed extensions that do not compound error across computations. We provide theoretical guarantees on the quality of point (MAP) estimates, the approximate posterior, and posterior mean and uncertainty estimates. We validate our approach empirically in the case of logistic regression using a quadratic approximation and show competitive performance with stochastic gradient descent, MCMC, and the Laplace approximation in terms of speed and multiple measures of accuracy – including on an advertising data set with 40 million data points and 20,000 covariates. |
Tasks | |
Published | 2017-09-26 |
URL | http://arxiv.org/abs/1709.09216v3 |
http://arxiv.org/pdf/1709.09216v3.pdf | |
PWC | https://paperswithcode.com/paper/pass-glm-polynomial-approximate-sufficient |
Repo | |
Framework | |
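For logistic regression with a quadratic approximation, the PASS-GLM recipe reduces to accumulating two fixed-size statistics in one pass and solving a linear system. The sketch below uses a plain least-squares polynomial fit where the paper uses a Chebyshev approximation, and the prior variance and data are made up.

```python
# PASS-GLM-style logistic regression sketch: approximate log sigma(s) by a degree-2
# polynomial so the whole data set collapses into sum(y_n x_n) and sum(x_n x_n^T),
# then maximize the resulting quadratic MAP objective in closed form.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 10
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = np.where(rng.uniform(size=n) < 1 / (1 + np.exp(-X @ theta_true)), 1.0, -1.0)

# Degree-2 approximation of log sigma(s) on a bounded interval (least-squares fit).
grid = np.linspace(-4, 4, 200)
b2, b1, b0 = np.polyfit(grid, -np.log1p(np.exp(-grid)), 2)

# One streaming pass yields the only statistics the approximation needs.
a = (y[:, None] * X).sum(axis=0)      # sum_n y_n x_n
M = X.T @ X                           # sum_n x_n x_n^T   (since y_n^2 = 1)

# Approximate MAP estimate under a N(0, sigma^2 I) prior: maximize a quadratic.
sigma2 = 10.0
theta_hat = np.linalg.solve(np.eye(d) / sigma2 - 2 * b2 * M, b1 * a)
print("correlation with true theta:", np.corrcoef(theta_hat, theta_true)[0, 1])
```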
Pricing Football Players using Neural Networks
Title | Pricing Football Players using Neural Networks |
Authors | Sourya Dey |
Abstract | We designed a multilayer perceptron neural network to predict the price of a football (soccer) player using data on more than 15,000 players from the football simulation video game FIFA 2017. The network was optimized by experimenting with different activation functions, number of neurons and layers, learning rate and its decay, Nesterov momentum based stochastic gradient descent, L2 regularization, and early stopping. Simultaneous exploration of various aspects of neural network training is performed and their trade-offs are investigated. Our final model achieves a top-5 accuracy of 87.2% among 119 pricing categories, and places any footballer within 6.32% of his actual price on average. |
Tasks | Game of Football, L2 Regularization |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.05865v2 |
http://arxiv.org/pdf/1711.05865v2.pdf | |
PWC | https://paperswithcode.com/paper/pricing-football-players-using-neural |
Repo | |
Framework | |
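A hedged PyTorch sketch of the kind of training setup the abstract describes: a small MLP over price categories trained with Nesterov-momentum SGD, L2 weight decay, and learning-rate decay. The feature count, layer sizes, and hyperparameters are illustrative, and early stopping on a validation set is omitted.

```python
# MLP classifier over price buckets with Nesterov SGD, L2 regularization (weight decay),
# and exponential learning-rate decay -- stand-in data, not FIFA 2017 attributes.
import torch
import torch.nn as nn

n_features, n_price_bins = 40, 119
model = nn.Sequential(
    nn.Linear(n_features, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, n_price_bins),
)
opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9,
                      nesterov=True, weight_decay=1e-4)           # L2 regularization
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.98)   # learning-rate decay
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(512, n_features)               # stand-in player attributes
y = torch.randint(0, n_price_bins, (512,))     # stand-in price categories
for epoch in range(5):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    sched.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```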
Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks
Title | Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks |
Authors | Julian Faraone, Nicholas Fraser, Giulio Gambardella, Michaela Blott, Philip H. W. Leong |
Abstract | A low precision deep neural network training technique for producing sparse, ternary neural networks is presented. The technique incorporates hardware implementation costs during training to achieve significant model compression for inference. Training involves three stages: network training using L2 regularization and a quantization threshold regularizer, quantization pruning, and finally retraining. Resulting networks achieve improved accuracy, reduced memory footprint and reduced computational complexity compared with conventional methods, on the MNIST and CIFAR10 datasets. Our networks are up to 98% sparse and 5 and 11 times smaller than equivalent binary and ternary models, translating to significant resource and speed benefits for hardware implementations. |
Tasks | L2 Regularization, Model Compression, Quantization |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1709.06262v2 |
http://arxiv.org/pdf/1709.06262v2.pdf | |
PWC | https://paperswithcode.com/paper/compressing-low-precision-deep-neural |
Repo | |
Framework | |
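The final ternarization step can be sketched as below in NumPy: small-magnitude weights are zeroed and the rest snap to a single signed level. The mean-magnitude threshold and scale used here are generic choices, not the paper's trained sparsity-inducing regularizer.

```python
# Generic ternarization of a trained weight matrix: zero out small weights, map the rest
# to a single positive or negative level.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(512, 256))          # a trained dense layer (toy)

def ternarize(w, delta_frac=0.7):
    delta = delta_frac * np.mean(np.abs(w))           # magnitude threshold
    mask = np.abs(w) > delta
    scale = np.abs(w[mask]).mean() if mask.any() else 0.0
    return np.where(mask, np.sign(w) * scale, 0.0)

w_t = ternarize(w)
print("values:", np.unique(w_t), "sparsity:", (w_t == 0).mean())
```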
The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization
Title | The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization |
Authors | Maciej Wielgosz, Matej Mertik, Andrzej Skoczeń, Ernesto De Matteis |
Abstract | This paper focuses on an examination of an applicability of Recurrent Neural Network models for detecting anomalous behavior of the CERN superconducting magnets. In order to conduct the experiments, the authors designed and implemented an adaptive signal quantization algorithm and a custom GRU-based detector and developed a method for the detector parameters selection. Three different datasets were used for testing the detector. Two artificially generated datasets were used to assess the raw performance of the system whereas the 231 MB dataset composed of the signals acquired from HiLumi magnets was intended for real-life experiments and model training. Several different setups of the developed anomaly detection system were evaluated and compared with a state-of-the-art OC-SVM reference model operating on the same data. The OC-SVM model was equipped with a rich set of feature extractors accounting for a range of the input signal properties. It was determined in the course of the experiments that the detector, along with its supporting design methodology, reaches an F1 score equal to or very close to 1 for almost all test sets. Due to the profile of the data, the best_length setup of the detector turned out to perform the best among all five tested configuration schemes of the detection system. The quantization parameters have the biggest impact on the overall performance of the detector, with the best values of input/output grid equal to 16 and 8, respectively. The proposed detection solution significantly outperformed the OC-SVM-based detector in most cases, with much more stable performance across all the datasets. |
Tasks | Anomaly Detection, Quantization |
Published | 2017-09-28 |
URL | http://arxiv.org/abs/1709.09883v2 |
http://arxiv.org/pdf/1709.09883v2.pdf | |
PWC | https://paperswithcode.com/paper/the-model-of-an-anomaly-detector-for-hilumi |
Repo | |
Framework | |
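The detector couples a signal quantizer with a recurrent predictor. The PyTorch sketch below quantizes a toy signal onto a data-driven grid of levels and trains a GRU to predict the next quantized symbol; in a deployment, large prediction error would flag an anomaly. The quantile-based quantizer, grid size, and network shape are assumptions standing in for the paper's adaptive quantization algorithm and parameter-selection method.

```python
# Quantize an analog signal onto a data-driven grid, then train a GRU to predict the
# next quantized symbol; high next-symbol error would indicate anomalous behavior.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 60, 2000)) + 0.05 * rng.normal(size=2000)

n_levels = 16
edges = np.quantile(signal, np.linspace(0, 1, n_levels + 1)[1:-1])
symbols = np.digitize(signal, edges)                  # adaptive (data-driven) grid

class NextSymbolGRU(nn.Module):
    def __init__(self, n_levels, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(n_levels, 16)
        self.gru = nn.GRU(16, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_levels)
    def forward(self, x):
        h, _ = self.gru(self.emb(x))
        return self.out(h)

model = NextSymbolGRU(n_levels)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.tensor(symbols[:-1], dtype=torch.long).unsqueeze(0)
target = torch.tensor(symbols[1:], dtype=torch.long).unsqueeze(0)
for _ in range(20):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x).transpose(1, 2), target)
    loss.backward()
    opt.step()
print("final next-symbol loss:", loss.item())
```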
Regularization techniques for fine-tuning in neural machine translation
Title | Regularization techniques for fine-tuning in neural machine translation |
Authors | Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, Rico Sennrich |
Abstract | We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major challenge. We investigate a number of techniques to reduce overfitting and improve transfer learning, including regularization techniques such as dropout and L2-regularization towards an out-of-domain prior. In addition, we introduce tuneout, a novel regularization technique inspired by dropout. We apply these techniques, alone and in combination, to neural machine translation, obtaining improvements on IWSLT datasets for English->German and English->Russian. We also investigate the amounts of in-domain training data needed for domain adaptation in NMT, and find a logarithmic relationship between the amount of training data and gain in BLEU score. |
Tasks | Domain Adaptation, L2 Regularization, Machine Translation, Transfer Learning |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1707.09920v1 |
http://arxiv.org/pdf/1707.09920v1.pdf | |
PWC | https://paperswithcode.com/paper/regularization-techniques-for-fine-tuning-in |
Repo | |
Framework | |
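One of the techniques above, L2 regularization towards an out-of-domain prior, is easy to sketch: penalize the distance between the fine-tuned parameters and a frozen copy of the out-of-domain model. In the hedged PyTorch sketch below, a single linear layer stands in for the NMT model and the penalty weight is arbitrary; tuneout is not shown.

```python
# During in-domain fine-tuning, pull the adapted parameters back toward the
# out-of-domain ("prior") model rather than toward zero.
import copy
import torch
import torch.nn as nn

model = nn.Linear(64, 64)                    # stand-in for a pretrained NMT model
prior = copy.deepcopy(model)                 # frozen out-of-domain parameters
for p in prior.parameters():
    p.requires_grad_(False)

def l2_towards_prior(model, prior, lam=1e-3):
    penalty = sum(((p - p0) ** 2).sum()
                  for p, p0 in zip(model.parameters(), prior.parameters()))
    return lam * penalty

x = torch.randn(8, 64)
task_loss = (model(x) ** 2).mean()           # stand-in for the in-domain NMT loss
loss = task_loss + l2_towards_prior(model, prior)
loss.backward()
print("total fine-tuning loss:", loss.item())
```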