October 19, 2019

3063 words 15 mins read

Paper Group ANR 198

Model-Agnostic Private Learning via Stability. Profile-guided memory optimization for deep neural networks. Radon Inversion via Deep Learning. A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation. Classifying medical notes into standard disease codes using Machine Learning. A dataset of continuous affect annotations an …

Model-Agnostic Private Learning via Stability

Title Model-Agnostic Private Learning via Stability
Authors Raef Bassily, Om Thakkar, Abhradeep Thakurta
Abstract We design differentially private learning algorithms that are agnostic to the learning model. Our algorithms are interactive in nature, i.e., instead of outputting a model based on the training data, they provide predictions for a set of $m$ feature vectors that arrive online. We show that, for the feature vectors on which an ensemble of models (trained on random disjoint subsets of a dataset) makes consistent predictions, there is almost no cost of privacy in generating accurate predictions for those feature vectors. To that end, we provide a novel coupling of the distance-to-instability framework with the sparse vector technique. We provide algorithms with formal privacy and utility guarantees for both binary/multi-class classification and soft-label classification. For binary classification in the standard (agnostic) PAC model, we show how to bootstrap from our privately generated predictions to construct a computationally efficient private learner that outputs a final accurate hypothesis. Our construction is, to the best of our knowledge, the first computationally efficient construction of a label-private learner. We prove sample complexity upper bounds for this setting. As in non-private sample complexity bounds, the only relevant property of the given concept class is its VC dimension. For soft-label classification, our techniques are based on exploiting the stability properties of traditional learning algorithms, like stochastic gradient descent (SGD). We provide a new technique to boost the average-case stability properties of learning algorithms to strong (worst-case) stability properties, and then exploit them to obtain private classification algorithms. In the process, we also show that a large class of SGD methods satisfies average-case stability properties, in contrast to the smaller class of SGD methods that are uniformly stable, as shown in prior work.
Tasks
Published 2018-03-14
URL http://arxiv.org/abs/1803.05101v1
PDF http://arxiv.org/pdf/1803.05101v1.pdf
PWC https://paperswithcode.com/paper/model-agnostic-private-learning-via-stability
Repo
Framework
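The core mechanism here, training an ensemble on disjoint data splits and answering only the queries on which the ensemble agrees, can be sketched briefly. The snippet below is a simplified illustration, not the paper's algorithm: a basic noisy agreement test stands in for the authors' coupling of distance-to-instability with the sparse vector technique, and the linear stand-in models and the threshold are placeholder assumptions.

```python
import numpy as np

def private_predict(models, x, threshold=0.9, eps=1.0, rng=np.random.default_rng(0)):
    """Answer a query only if the ensemble agrees strongly (noisy test)."""
    votes = np.bincount([m(x) for m in models], minlength=2)
    # Margin between the top vote count and the agreement threshold, plus Laplace noise.
    margin = votes.max() - threshold * len(models)
    if margin + rng.laplace(scale=2.0 / eps) > 0:
        return int(votes.argmax())   # stable query: answering is nearly free privacy-wise
    return None                      # unstable query: abstain rather than leak

# Toy ensemble: k linear models trained on disjoint subsets of a dataset.
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 5))
labels = (data[:, 0] > 0).astype(int)
splits = np.array_split(np.arange(1000), 10)
models = [
    (lambda w: (lambda x: int(x @ w > 0)))(data[idx].T @ (2 * labels[idx] - 1))
    for idx in splits
]
print(private_predict(models, np.array([1.0, 0, 0, 0, 0])))
```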

Profile-guided memory optimization for deep neural networks

Title Profile-guided memory optimization for deep neural networks
Authors Taro Sekiyama, Takashi Imamichi, Haruki Imai, Rudy Raymond
Abstract Recent years have seen deep neural networks (DNNs) becoming wider and deeper to achieve better performance in many applications of AI. Such DNNs however require huge amounts of memory to store weights and intermediate results (e.g., activations, feature maps, etc.) in propagation. This requirement makes it difficult to run the DNNs on devices with limited, hard-to-extend memory, degrades the running time performance, and restricts the design of network models. We address this challenge by developing a novel profile-guided memory optimization to efficiently and quickly allocate memory blocks during the propagation in DNNs. The optimization utilizes a simple and fast heuristic algorithm based on the two-dimensional rectangle packing problem. Experimenting with well-known neural network models, we confirm that our method not only reduces the memory consumption by up to 49.5% but also accelerates training and inference by up to a factor of four, thanks to the rapidity of the memory allocation and the ability to use larger mini-batch sizes.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.10001v1
PDF http://arxiv.org/pdf/1804.10001v1.pdf
PWC https://paperswithcode.com/paper/profile-guided-memory-optimization-for-deep
Repo
Framework
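The packing view of memory allocation in the abstract is easy to illustrate: each buffer is a rectangle whose width is its lifetime (first to last use) and whose height is its size, and the allocator looks for a vertical offset that avoids overlap with buffers whose lifetimes intersect. The greedy best-fit heuristic below is a generic sketch of that formulation, not the authors' algorithm.

```python
def assign_offsets(buffers):
    """buffers: list of (size, first_use, last_use). Returns offsets and peak memory.

    Greedy heuristic for the 2D rectangle-packing view of memory allocation:
    place large buffers first, each at the lowest offset that avoids overlap
    with already-placed buffers whose lifetimes intersect.
    """
    placed = []  # (offset, size, first, last)
    offsets, peak = {}, 0
    for i in sorted(range(len(buffers)), key=lambda i: -buffers[i][0]):
        size, first, last = buffers[i]
        conflicts = sorted((o, s) for (o, s, f, l) in placed
                           if not (last < f or l < first))
        offset = 0
        for o, s in conflicts:          # scan gaps between conflicting buffers
            if offset + size <= o:
                break
            offset = max(offset, o + s)
        placed.append((offset, size, first, last))
        offsets[i] = offset
        peak = max(peak, offset + size)
    return offsets, peak

# Three activations with overlapping lifetimes: the third can reuse the first's space.
print(assign_offsets([(64, 0, 2), (32, 1, 3), (64, 4, 5)]))
```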

Radon Inversion via Deep Learning

Title Radon Inversion via Deep Learning
Authors Ji He, Jianhua Ma
Abstract Radon transform is widely used in the physical and life sciences, and one of its major applications is X-ray computed tomography (X-ray CT), which is significant in modern health examination. The Radon inversion, or image reconstruction, is challenging due to potentially defective Radon projections. Conventionally, the reconstruction process contains several ad hoc stages that approximate the corresponding Radon inversion, and each stage is highly dependent on the results of the previous one. In this paper, we propose a novel unified framework for Radon inversion via deep learning (DL), with which the Radon inversion can be approximated in an end-to-end fashion instead of processing step-by-step with multiple stages. For brevity, we refer to the proposed framework as iRadonMap (inverse Radon transform approximation). Specifically, we implement iRadonMap as a dedicated neural network whose architecture can be divided into two segments. In the first segment, a learnable fully-connected filtering layer is used to filter the Radon projections along the view-angle direction, followed by a learnable sinusoidal back-projection layer that transfers the filtered Radon projections into an image. The second segment is a common neural network architecture that further improves the reconstruction performance in the image domain. The iRadonMap is optimized as a whole by training on a large number of generic images from the ImageNet database. To evaluate its performance, clinical patient data is used. Qualitative results show promising reconstruction performance of the iRadonMap.
Tasks Image Reconstruction
Published 2018-08-09
URL http://arxiv.org/abs/1808.03015v1
PDF http://arxiv.org/pdf/1808.03015v1.pdf
PWC https://paperswithcode.com/paper/radon-inversion-via-deep-learning
Repo
Framework
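The two-segment design maps naturally onto a small network sketch: a learnable fully-connected filtering layer applied to each projection, a learnable dense back-projection from sinogram to image, and a plain CNN for image-domain refinement. This is a rough PyTorch sketch under assumed layer sizes; the paper's sinusoidal back-projection layer is structured rather than a generic dense map, and the axis along which the filter acts is simplified here.

```python
import torch
import torch.nn as nn

class IRadonMapSketch(nn.Module):
    """Rough sketch of a learned Radon inversion: filter -> back-project -> refine."""
    def __init__(self, n_views=60, n_dets=64, img=32):
        super().__init__()
        self.filt = nn.Linear(n_dets, n_dets)   # learnable filtering of each view
        self.backproj = nn.Linear(n_views * n_dets, img * img)  # learnable back-projection
        self.refine = nn.Sequential(            # image-domain CNN refinement
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        self.img = img

    def forward(self, sino):                 # sino: (batch, n_views, n_dets)
        x = self.filt(sino)                  # filter the projections
        x = self.backproj(x.flatten(1))      # sinogram -> image vector
        x = x.view(-1, 1, self.img, self.img)
        return x + self.refine(x)            # residual refinement

print(IRadonMapSketch()(torch.randn(2, 60, 64)).shape)  # torch.Size([2, 1, 32, 32])
```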

A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation

Title A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation
Authors Yang Hu, Andrea Soltoggio, Russell Lock, Steve Carter
Abstract In this paper, we propose a novel fully convolutional two-stream fusion network (FCTSFN) for interactive image segmentation. The proposed network includes two sub-networks: a two-stream late fusion network (TSLFN) that predicts the foreground at a reduced resolution, and a multi-scale refining network (MSRN) that refines the foreground at full resolution. The TSLFN includes two distinct deep streams followed by a fusion network. The intuition is that, since user interactions are more direct information on foreground/background than the image itself, the two-stream structure of the TSLFN reduces the number of layers between the pure user interaction features and the network output, allowing the user interactions to have a more direct impact on the segmentation result. The MSRN fuses features from different layers of the TSLFN at different scales, in order to capture local-to-global information on the foreground and refine the segmentation result at full resolution. We conduct comprehensive experiments on four benchmark datasets. The results show that the proposed network achieves competitive performance compared to current state-of-the-art interactive image segmentation methods.
Tasks Semantic Segmentation
Published 2018-07-06
URL http://arxiv.org/abs/1807.02480v2
PDF http://arxiv.org/pdf/1807.02480v2.pdf
PWC https://paperswithcode.com/paper/a-fully-convolutional-two-stream-fusion
Repo
Framework
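A minimal version of the two-stream idea: one shallow stream encodes the image, another encodes the user-interaction maps (e.g., distance transforms of foreground/background clicks), and their features are fused late so the interaction features stay close to the output. All layer sizes and the concatenation-plus-1x1-conv fusion below are placeholder assumptions; the real TSLFN/MSRN are much deeper and add multi-scale refinement.

```python
import torch
import torch.nn as nn

class TwoStreamFusionSketch(nn.Module):
    def __init__(self):
        super().__init__()
        enc = lambda c_in: nn.Sequential(nn.Conv2d(c_in, 16, 3, padding=1), nn.ReLU(),
                                         nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.image_stream = enc(3)        # RGB image
        self.click_stream = enc(2)        # foreground/background interaction maps
        self.fuse = nn.Conv2d(32, 1, 1)   # late fusion -> per-pixel foreground logit

    def forward(self, image, clicks):
        feats = torch.cat([self.image_stream(image), self.click_stream(clicks)], dim=1)
        return self.fuse(feats)

net = TwoStreamFusionSketch()
print(net(torch.randn(1, 3, 64, 64), torch.randn(1, 2, 64, 64)).shape)
```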

Classifying medical notes into standard disease codes using Machine Learning

Title Classifying medical notes into standard disease codes using Machine Learning
Authors Amitabha Karmakar
Abstract We investigate the automatic classification of patient discharge notes into standard disease labels. We find that Convolutional Neural Networks with Attention outperform previous algorithms used in this task, and suggest further areas for improvement.
Tasks
Published 2018-02-01
URL http://arxiv.org/abs/1802.00382v1
PDF http://arxiv.org/pdf/1802.00382v1.pdf
PWC https://paperswithcode.com/paper/classifying-medical-notes-into-standard
Repo
Framework
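A compact sketch of the kind of model the note describes: a 1-D convolution over token embeddings followed by an attention layer that pools the convolved features into a single vector for the disease-code classifier. The vocabulary size, dimensions, and attention form are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CNNAttentionClassifier(nn.Module):
    def __init__(self, vocab=5000, dim=64, n_codes=50):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=5, padding=2)
        self.attn = nn.Linear(dim, 1)       # scores each token position
        self.out = nn.Linear(dim, n_codes)

    def forward(self, tokens):              # tokens: (batch, seq_len) int ids
        h = torch.relu(self.conv(self.emb(tokens).transpose(1, 2))).transpose(1, 2)
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)   # attention over positions
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)            # attention-weighted average
        return self.out(pooled)                              # logits over disease codes

model = CNNAttentionClassifier()
print(model(torch.randint(0, 5000, (4, 200))).shape)  # torch.Size([4, 50])
```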

A dataset of continuous affect annotations and physiological signals for emotion analysis

Title A dataset of continuous affect annotations and physiological signals for emotion analysis
Authors Karan Sharma, Claudio Castellini, Egon L. van den Broek, Alin Albu-Schaeffer, Friedhelm Schwenker
Abstract From a computational viewpoint, emotions continue to be intriguingly hard to understand. In research, direct, real-time inspection in realistic settings is not possible. Discrete, indirect, post-hoc recordings are therefore the norm. As a result, proper emotion assessment remains a problematic issue. The Continuously Annotated Signals of Emotion (CASE) dataset provides a solution, as it focuses on real-time continuous annotation of emotions, as experienced by the participants, while watching various videos. For this purpose, a novel, intuitive joystick-based annotation interface was developed that allows simultaneous reporting of valence and arousal, which are otherwise often annotated independently. In parallel, eight high-quality, synchronized physiological recordings (1000 Hz, 16-bit ADC) were made of ECG, BVP, EMG (3x), GSR (or EDA), respiration and skin temperature. The dataset consists of the physiological and annotation data from 30 participants, 15 male and 15 female, who watched several validated video-stimuli. The validity of the emotion induction, as exemplified by the annotation and physiological data, is also presented.
Tasks Emotion Recognition
Published 2018-12-06
URL http://arxiv.org/abs/1812.02782v1
PDF http://arxiv.org/pdf/1812.02782v1.pdf
PWC https://paperswithcode.com/paper/a-dataset-of-continuous-affect-annotations
Repo
Framework
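For a dataset like this, a typical first processing step is aligning the continuous valence-arousal annotations with the 1000 Hz physiological channels on a common timeline. The snippet below shows that alignment by linear interpolation; the annotation sampling pattern and column layout are assumptions for illustration, not the published file format.

```python
import numpy as np

fs_physio = 1000                        # per the paper: 1000 Hz, 16-bit ADC
t_physio = np.arange(0, 10, 1 / fs_physio)   # 10 s of signal timestamps
ecg = np.random.randn(t_physio.size)         # placeholder ECG channel

# Hypothetical joystick annotation stream at a lower, uneven rate.
t_annot = np.sort(np.random.uniform(0, 10, 200))
valence = np.random.uniform(-1, 1, 200)
arousal = np.random.uniform(-1, 1, 200)

# Upsample the annotations onto the physiological timeline via interpolation.
valence_1khz = np.interp(t_physio, t_annot, valence)
arousal_1khz = np.interp(t_physio, t_annot, arousal)
aligned = np.column_stack([t_physio, ecg, valence_1khz, arousal_1khz])
print(aligned.shape)                    # (10000, 4): one row per physio sample
```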

Solving Sudoku with Ant Colony Optimisation

Title Solving Sudoku with Ant Colony Optimisation
Authors Huw Lloyd, Martyn Amos
Abstract In this paper we present a new Ant Colony Optimisation-based algorithm for Sudoku, which outperforms existing methods on large instances. Our method includes a novel anti-stagnation operator, which we call Best Value Evaporation.
Tasks
Published 2018-05-09
URL http://arxiv.org/abs/1805.03545v1
PDF http://arxiv.org/pdf/1805.03545v1.pdf
PWC https://paperswithcode.com/paper/solving-sudoku-with-ant-colony-optimisation
Repo
Framework
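An Ant Colony Optimisation skeleton for Sudoku can be stated compactly: pheromone is kept per (cell, value) pair, each ant fills the blanks with values that are legal under the row/column/box constraints, sampled in proportion to pheromone, and the best assignment found is reinforced after evaporation. The sketch below is a generic ACO baseline under those assumptions, not the paper's algorithm; in particular it does not implement their Best Value Evaporation operator.

```python
import numpy as np

def legal(grid, r, c):
    box = grid[3*(r//3):3*(r//3)+3, 3*(c//3):3*(c//3)+3]
    used = set(grid[r]) | set(grid[:, c]) | set(box.ravel())
    return [v for v in range(1, 10) if v not in used]

def solve(puzzle, ants=20, iters=100, rho=0.1, rng=np.random.default_rng(0)):
    tau = np.ones((9, 9, 9))                    # pheromone per (cell, value)
    blanks = list(zip(*np.where(puzzle == 0)))
    best, best_fill = None, -1
    for _ in range(iters):
        for _ in range(ants):
            g = puzzle.copy()
            for r, c in blanks:
                vals = legal(g, r, c)
                if not vals:
                    continue                    # dead cell; leave blank
                p = tau[r, c, [v - 1 for v in vals]]
                g[r, c] = vals[rng.choice(len(vals), p=p / p.sum())]
            fill = np.count_nonzero(g)
            if fill > best_fill:
                best, best_fill = g, fill
        tau *= 1 - rho                          # evaporation
        for r, c in blanks:                     # reinforce the best assignment
            if best[r, c]:
                tau[r, c, best[r, c] - 1] += rho * best_fill / 81
        if best_fill == 81:
            break
    return best

puzzle = np.zeros((9, 9), dtype=int)            # empty puzzle as a smoke test
print(solve(puzzle))
```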

Online learning using multiple times weight updating

Title Online learning using multiple times weight updating
Authors Charanjeet, Anuj Sharma
Abstract Online learning makes a sequence of decisions under partial data arrival, where the next movement of the data is unknown. In this paper, we present a new technique, multiple times weight updating, that updates the weight iteratively for the same instance. The proposed technique is analyzed against popular state-of-the-art algorithms from the literature and evaluated using an established tool. The results indicate that the mistake rate reduces to zero, or close to zero, for various datasets and algorithms. The overhead in running cost is not too expensive, and achieving a mistake rate close to zero further strengthens the proposed technique. The present work also covers the bounded nature of the weight updates for a single instance and the attainment of an optimal weight value. The proposed work could be extended to large-dataset problems to reduce the mistake rate in online learning environments, and could be helpful in meeting real-life challenges.
Tasks
Published 2018-10-26
URL http://arxiv.org/abs/1811.00178v2
PDF http://arxiv.org/pdf/1811.00178v2.pdf
PWC https://paperswithcode.com/paper/online-learning-using-multiple-times-weight
Repo
Framework
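The idea in the abstract, revisiting the same instance several times before moving on, can be shown with a perceptron-style learner. The update rule and the cap on repeats below are illustrative choices; the paper analyzes the technique across several established online algorithms.

```python
import numpy as np

def online_multi_update(stream, max_repeats=5, lr=0.5):
    """Perceptron with multiple weight updates per instance (illustrative)."""
    w, mistakes = None, 0
    for x, y in stream:                       # y in {-1, +1}
        if w is None:
            w = np.zeros_like(x, dtype=float)
        if y * (w @ x) <= 0:
            mistakes += 1                     # counted once per instance
            # Re-apply the update until the instance is classified correctly
            # (or the repeat budget runs out), instead of taking a single step.
            for _ in range(max_repeats):
                w += lr * y * x
                if y * (w @ x) > 0:
                    break
    return w, mistakes

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=500))
print(online_multi_update(zip(X, y))[1], "mistakes on 500 rounds")
```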

On the Topic of Jets: Disentangling Quarks and Gluons at Colliders

Title On the Topic of Jets: Disentangling Quarks and Gluons at Colliders
Authors Eric M. Metodiev, Jesse Thaler
Abstract We introduce jet topics: a framework to identify underlying classes of jets from collider data. Because of a close mathematical relationship between distributions of observables in jets and emergent themes in sets of documents, we can apply recent techniques in “topic modeling” to extract jet topics from data with minimal or no input from simulation or theory. As a proof of concept with parton shower samples, we apply jet topics to determine separate quark and gluon jet distributions for constituent multiplicity. We also determine separate quark and gluon rapidity spectra from a mixed Z-plus-jet sample. While jet topics are defined directly from hadron-level multi-differential cross sections, one can also predict jet topics from first-principles theoretical calculations, with potential implications for how to define quark and gluon jets beyond leading-logarithmic accuracy. These investigations suggest that jet topics will be useful for extracting underlying jet distributions and fractions in a wide range of contexts at the Large Hadron Collider.
Tasks
Published 2018-01-31
URL http://arxiv.org/abs/1802.00008v2
PDF http://arxiv.org/pdf/1802.00008v2.pdf
PWC https://paperswithcode.com/paper/on-the-topic-of-jets-disentangling-quarks-and
Repo
Framework
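The demixing at the heart of jet topics is simple to state: given two mixed histograms p1 and p2 of some jet observable (say, constituent multiplicity), the reducibility factor kappa(1|2) = min_x p1(x)/p2(x) sets how much of p2 can be subtracted from p1 while keeping a valid distribution, and the normalized residuals are the topics. The snippet below implements that construction on toy mixtures; the binning, the Poisson stand-ins for the quark and gluon multiplicity distributions, and the mixture fractions are assumptions for illustration.

```python
import numpy as np
from scipy.stats import poisson

bins = np.arange(60)
quark = poisson.pmf(bins, 15)          # toy stand-ins for the pure distributions
gluon = poisson.pmf(bins, 30)
p1 = 0.7 * quark + 0.3 * gluon         # two samples with different quark fractions
p2 = 0.3 * quark + 0.7 * gluon

def jet_topics(p1, p2, eps=1e-12):
    k12 = np.min(p1 / (p2 + eps))      # reducibility factor kappa(1|2)
    k21 = np.min(p2 / (p1 + eps))      # reducibility factor kappa(2|1)
    t1 = (p1 - k12 * p2) / (1 - k12)   # "topic" distributions
    t2 = (p2 - k21 * p1) / (1 - k21)
    return t1, t2

t1, t2 = jet_topics(p1, p2)
# For mutually irreducible pure distributions, the topics recover them (up to tails).
print(np.abs(t1 - quark).sum(), np.abs(t2 - gluon).sum())
```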

Computing the Value of Computation for Planning

Title Computing the Value of Computation for Planning
Authors Can Eren Sezener
Abstract An intelligent agent performs actions in order to achieve its goals. Such actions can either be externally directed, such as opening a door, or internally directed, such as writing data to a memory location or strengthening a synaptic connection. Some internal actions, to which we refer as computations, potentially help the agent choose better actions. Considering that (external) actions and computations might draw upon the same resources, such as time and energy, deciding when to act or compute, as well as what to compute, is critical to the performance of an agent. In an environment that provides rewards depending on an agent’s behavior, an action’s value is typically defined as the sum of expected long-term rewards succeeding the action (itself a complex quantity that depends on what the agent goes on to do after the action in question). However, defining the value of a computation is not as straightforward, as computations are only valuable in a higher-order way, through the alteration of actions. This thesis offers a principled way of computing the value of a computation in a planning setting formalized as a Markov decision process. We present two different definitions of computation values: static and dynamic. They address two extreme cases of the computation budget: affording calculation of zero or infinitely many steps in the future. We show that these values have desirable properties, such as temporal consistency and asymptotic convergence. Furthermore, we propose methods for efficiently computing and approximating the static and dynamic computation values. We describe a sense in which the policies that greedily maximize these values can be optimal. We utilize these principles to construct Monte Carlo tree search algorithms that outperform most of the state-of-the-art in terms of finding higher quality actions given the same simulation resources.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.03035v1
PDF http://arxiv.org/pdf/1811.03035v1.pdf
PWC https://paperswithcode.com/paper/computing-the-value-of-computation-for
Repo
Framework
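A one-step version of the static value of computation is straightforward to simulate: before committing to an action, the agent asks how much a single extra simulation of one action's return is expected to improve its final choice. The Bayesian bandit setup below, with Gaussian priors and a Monte Carlo estimate, is a minimal illustration of this expected-improvement notion, not the thesis' MDP formalization; all the numbers are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.2, 0.1])     # posterior means of three actions' values
sigma = np.array([0.5, 0.5, 0.5])  # posterior standard deviations
noise = 0.3                        # observation noise of one simulation

def value_of_computation(a, n=100_000):
    """Expected improvement in the chosen action's mean after simulating action a."""
    true = rng.normal(mu[a], sigma[a], n)          # draw from the prior
    obs = true + rng.normal(0, noise, n)           # one noisy simulation of action a
    post_var = 1 / (1 / sigma[a] ** 2 + 1 / noise ** 2)
    post_mean = post_var * (mu[a] / sigma[a] ** 2 + obs / noise ** 2)
    # Value after computing: pick the best of (updated a, other actions as-is).
    others = np.delete(mu, a).max()
    v_after = np.maximum(post_mean, others).mean()
    return v_after - mu.max()      # expected gain over acting immediately

print([round(value_of_computation(a), 4) for a in range(3)])
```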

Deep semi-supervised segmentation with weight-averaged consistency targets

Title Deep semi-supervised segmentation with weight-averaged consistency targets
Authors Christian S. Perone, Julien Cohen-Adad
Abstract Recently proposed techniques for semi-supervised learning such as Temporal Ensembling and Mean Teacher have achieved state-of-the-art results in many important classification benchmarks. In this work, we expand the Mean Teacher approach to segmentation tasks and show that it can bring important improvements in a realistic small data regime using a publicly available multi-center dataset from the Magnetic Resonance Imaging (MRI) domain. We also devise a method to solve the problems that arise when using traditional data augmentation strategies for segmentation tasks on our new training scheme.
Tasks Data Augmentation
Published 2018-07-12
URL http://arxiv.org/abs/1807.04657v2
PDF http://arxiv.org/pdf/1807.04657v2.pdf
PWC https://paperswithcode.com/paper/deep-semi-supervised-segmentation-with-weight
Repo
Framework
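The training scheme extends naturally from classification to segmentation because both the student and the exponential-moving-average (EMA) teacher output per-pixel predictions. Below is a minimal sketch of one training step, with a single conv layer standing in for the segmentation network; the consistency loss is a plain MSE between softmax maps, as in the Mean Teacher formulation.

```python
import copy
import torch
import torch.nn as nn

student = nn.Conv2d(1, 2, 3, padding=1)          # stand-in segmentation network
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def train_step(x_labeled, y, x_unlabeled, alpha=0.99):
    sup = nn.functional.cross_entropy(student(x_labeled), y)
    # Consistency: student and EMA teacher should agree on unlabeled pixels.
    cons = nn.functional.mse_loss(
        torch.softmax(student(x_unlabeled), 1),
        torch.softmax(teacher(x_unlabeled), 1),
    )
    opt.zero_grad()
    (sup + cons).backward()
    opt.step()
    # Teacher weights are an exponential moving average of the student's.
    with torch.no_grad():
        for tp, sp in zip(teacher.parameters(), student.parameters()):
            tp.mul_(alpha).add_(sp, alpha=1 - alpha)

x = torch.randn(2, 1, 16, 16)
train_step(x, torch.randint(0, 2, (2, 16, 16)), torch.randn(2, 1, 16, 16))
```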

Recent Advances in Open Set Recognition: A Survey

Title Recent Advances in Open Set Recognition: A Survey
Authors Chuanxing Geng, Sheng-jun Huang, Songcan Chen
Abstract In real-world recognition/classification tasks, limited by various objective factors, it is usually difficult to collect training samples that exhaust all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time and unknown classes can be submitted to the algorithm during testing, requiring the classifier not only to accurately classify the seen classes, but also to effectively deal with unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques, covering aspects ranging from related definitions, representations of models, datasets, and evaluation criteria to algorithm comparisons. Furthermore, we briefly analyze the relationships between OSR and its related tasks, including zero-shot and one-shot (few-shot) recognition/learning techniques, classification with reject option, and so forth. Additionally, we give an overview of open world recognition, which can be seen as a natural extension of OSR. Importantly, we highlight the limitations of existing approaches and point out some promising subsequent research directions in this field.
Tasks Open Set Learning
Published 2018-11-21
URL https://arxiv.org/abs/1811.08581v4
PDF https://arxiv.org/pdf/1811.08581v4.pdf
PWC https://paperswithcode.com/paper/recent-advances-in-open-set-recognition-a
Repo
Framework
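The baseline behavior the survey contrasts against is easy to see in code: a closed-set classifier is forced to pick a known class, while the simplest open-set variant rejects inputs whose top softmax score falls below a threshold. The threshold and logits below are illustrative; the surveyed methods go well beyond this calibration-style rule.

```python
import numpy as np

def open_set_predict(logits, threshold=0.8):
    """Thresholded softmax: classify if confident, otherwise flag as unknown."""
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    pred = probs.argmax(-1)
    return np.where(probs.max(-1) >= threshold, pred, -1)   # -1 = unknown class

logits = np.array([[4.0, 0.1, 0.2],    # confidently class 0
                   [1.0, 0.9, 1.1]])   # ambiguous -> likely an unseen class
print(open_set_predict(logits))        # [ 0 -1]
```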

Improved survival of cancer patients admitted to the ICU between 2002 and 2011 at a U.S. teaching hospital

Title Improved survival of cancer patients admitted to the ICU between 2002 and 2011 at a U.S. teaching hospital
Authors Chris Sauer, Jinghui Dong, Leo Celi, Daniele Ramazzotti
Abstract Over the past decades, both critical care and cancer care have improved substantially. Due to increased cancer-specific survival, we hypothesized that both the number of cancer patients admitted to the ICU and overall survival have increased since the millennium change. MIMIC-III, a freely accessible critical care database of Beth Israel Deaconess Medical Center, Boston, USA was used to retrospectively study trends and outcomes of cancer patients admitted to the ICU between 2002 and 2011. Multiple logistic regression analysis was performed to adjust for confounders of 28-day and 1-year mortality. Out of 41,468 unique ICU admissions, 1,100 hemato-oncologic, 3,953 oncologic and 49 patients with both a hematological and solid malignancy were analyzed. Hematological patients had higher critical illness scores than non-cancer patients, while oncologic patients had similar APACHE-III and SOFA-scores compared to non-cancer patients. In the univariate analysis, cancer was strongly associated with mortality (OR= 2.74, 95%CI: 2.56, 2.94). Over the 10-year study period, 28-day mortality of cancer patients decreased by 30%. This trend persisted after adjustment for covariates, with cancer patients having significantly higher mortality (OR=2.63, 95%CI: 2.38, 2.88). Between 2002 and 2011, both the adjusted odds of 28-day mortality and the adjusted odds of 1-year mortality for cancer patients decreased by 6% (95%CI: 4%, 9%). Having cancer was the strongest single predictor of 1-year mortality in the multivariate model (OR=4.47, 95%CI: 4.11, 4.84).
Tasks
Published 2018-08-06
URL http://arxiv.org/abs/1808.02766v1
PDF http://arxiv.org/pdf/1808.02766v1.pdf
PWC https://paperswithcode.com/paper/improved-survival-of-cancer-patients-admitted
Repo
Framework
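The adjusted odds ratios in the abstract come from multiple logistic regression: regress mortality on the cancer indicator plus confounders, then exponentiate the coefficient and its confidence bounds. A sketch with synthetic data and statsmodels follows; the variable names and confounder set are placeholders, not the MIMIC-III extraction.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
age = rng.normal(65, 12, n)                   # synthetic confounders
sofa = rng.poisson(5, n).astype(float)
cancer = rng.binomial(1, 0.12, n)
logit = -6 + 0.05 * age + 0.15 * sofa + 1.0 * cancer   # true log-OR for cancer = 1.0
death = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([cancer, age, sofa]))
fit = sm.Logit(death, X).fit(disp=0)
or_, (lo, hi) = np.exp(fit.params[1]), np.exp(fit.conf_int()[1])
print(f"adjusted OR for cancer: {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```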

A Two-Step Learning Method For Detecting Landmarks on Faces From Different Domains

Title A Two-Step Learning Method For Detecting Landmarks on Faces From Different Domains
Authors Bruna Vieira Frade, Erickson R. Nascimento
Abstract The detection of fiducial points on faces has benefited significantly from the rapid progress in machine learning, in particular in convolutional networks. However, the accuracy of most detectors strongly depends on an enormous amount of annotated data. In this work, we present a domain adaptation approach based on two-step learning to detect fiducial points on human and animal faces. We evaluate our method on three datasets composed of different animal faces (cats, dogs, and horses). The experiments show that our method performs better than the state of the art and can leverage a small amount of annotated data for landmark detection, reducing the demand for large volumes of annotated data.
Tasks Domain Adaptation
Published 2018-09-12
URL http://arxiv.org/abs/1809.04621v1
PDF http://arxiv.org/pdf/1809.04621v1.pdf
PWC https://paperswithcode.com/paper/a-two-step-learning-method-for-detecting
Repo
Framework
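A generic rendition of two-step learning for landmark detection: first train a detector on a data-rich source domain (human faces), then adapt it to a target domain (animal faces) with few annotations, here by freezing the early feature layers and fine-tuning the rest. The architecture and the freezing choice are assumptions; the paper's exact two-step procedure may differ.

```python
import torch
import torch.nn as nn

def make_detector(n_landmarks=5):
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, 2 * n_landmarks),           # (x, y) per landmark
    )

def fit(model, params, data, epochs=1, lr=1e-3):
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for imgs, pts in data:
            opt.zero_grad()
            nn.functional.mse_loss(model(imgs), pts).backward()
            opt.step()

model = make_detector()
source = [(torch.randn(8, 3, 64, 64), torch.randn(8, 10)) for _ in range(4)]
target = [(torch.randn(8, 3, 64, 64), torch.randn(8, 10))]   # few annotated images

fit(model, model.parameters(), source)            # step 1: source domain
for p in model[0].parameters():                   # step 2: freeze early features,
    p.requires_grad_(False)                       # adapt the rest on scarce data
fit(model, [p for p in model.parameters() if p.requires_grad], target)
```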

The Power of Genetic Algorithms: what remains of the pMSSM?

Title The Power of Genetic Algorithms: what remains of the pMSSM?
Authors Steven Abel, David G. Cerdeno, Sandra Robles
Abstract Genetic Algorithms (GAs) are explored as a tool for probing new physics with high dimensionality. We study the 19-dimensional pMSSM, including experimental constraints from all sources and assessing the consistency of potential signals of new physics. We show that GAs excel at making a fast and accurate diagnosis of the cross-compatibility of a set of experimental constraints in such high dimensional models. In the case of the pMSSM, it is found that only ${\cal O}(10^4)$ model evaluations are required to obtain a best fit point in agreement with much more costly MCMC scans. This efficiency allows higher dimensional models to be falsified, and patterns in the spectrum identified, orders of magnitude more quickly. As examples of falsification, we consider the muon anomalous magnetic moment, and the Galactic Centre gamma-ray excess observed by Fermi-LAT, which could in principle be explained in terms of neutralino dark matter. We show that both observables cannot be explained within the pMSSM, and that they provide the leading contribution to the total goodness of the fit, with $\chi^2_{\delta a_\mu^{\mathrm{SUSY}}}\approx12$ and $\chi^2_{\rm GCE}\approx 155$, respectively.
Tasks
Published 2018-05-09
URL http://arxiv.org/abs/1805.03615v1
PDF http://arxiv.org/pdf/1805.03615v1.pdf
PWC https://paperswithcode.com/paper/the-power-of-genetic-algorithms-what-remains
Repo
Framework
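What makes a GA suited to a 19-dimensional scan is visible in a small skeleton: a population of parameter points, a fitness given by the total chi-square against the constraints, and tournament selection with crossover and mutation. The toy chi-square below is a placeholder; the real study evaluates the full set of pMSSM observables at each point.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, POP, GENS = 19, 60, 200
lo, hi = -1.0, 1.0                       # placeholder parameter ranges

def chi2(theta):                         # toy stand-in for the pMSSM goodness of fit
    return np.sum((theta - 0.3) ** 2 / 0.01)

pop = rng.uniform(lo, hi, (POP, DIM))
for _ in range(GENS):
    fit = np.array([chi2(p) for p in pop])
    new = [pop[fit.argmin()].copy()]     # elitism: keep the best point
    while len(new) < POP:
        # Tournament selection of two parents (best of three random picks each).
        a, b = (min(rng.integers(0, POP, 3), key=lambda i: fit[i]) for _ in range(2))
        mask = rng.random(DIM) < 0.5     # uniform crossover
        child = np.where(mask, pop[a], pop[b])
        mutate = rng.random(DIM) < 0.1   # per-gene mutation
        child = np.clip(child + mutate * rng.normal(0, 0.1, DIM), lo, hi)
        new.append(child)
    pop = np.array(new)

best = pop[np.argmin([chi2(p) for p in pop])]
print(chi2(best))   # best fit found after POP * GENS ~ O(10^4) evaluations
```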