January 28, 2020

Paper Group ANR 879

HistoNet: Predicting size histograms of object instances

Title HistoNet: Predicting size histograms of object instances
Authors Kishan Sharma, Moritz Gold, Christian Zurbruegg, Laura Leal-Taixé, Jan Dirk Wegner
Abstract We propose to predict histograms of object sizes in crowded scenes directly, without any explicit object instance segmentation. What makes this task challenging is the high density of objects (of the same category), which makes instance identification hard. Instead of explicitly segmenting object instances, we show that directly learning histograms of object sizes improves accuracy while using drastically fewer parameters. This is very useful for application scenarios where explicit, pixel-accurate instance segmentation is not needed, but the overall distribution of instance sizes is of interest. Our core applications are in biology, where we estimate the size distribution of soldier fly larvae, and medicine, where we estimate the size distribution of cancer cells as an intermediate step in calculating the tumor cellularity score. Given an image with hundreds of small object instances, we output the total count and the size histogram. We also provide a new data set for this task, the FlyLarvae data set, which consists of 11,000 larvae instances labeled pixel-wise. Our method results in an overall improvement in count and size distribution prediction compared to the state-of-the-art instance segmentation method Mask R-CNN.
Tasks Instance Segmentation, Semantic Segmentation
Published 2019-12-11
URL https://arxiv.org/abs/1912.05227v2
PDF https://arxiv.org/pdf/1912.05227v2.pdf
PWC https://paperswithcode.com/paper/histonet-predicting-size-histograms-of-object
Repo
Framework
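
A minimal sketch of the core idea above: regressing a size histogram (and, by summation, a count) directly from image features instead of segmenting instances. The bin count, feature dimension, and loss weights below are assumptions, not values from the paper.

```python
# Minimal sketch (not the authors' code): a regression head that maps CNN
# features to a K-bin size histogram, trained with an L1 histogram loss plus
# a count term. Bin count, feature sizes, and loss weight are assumptions.
import torch
import torch.nn as nn

class HistogramHead(nn.Module):
    def __init__(self, in_features: int = 512, num_bins: int = 16):
        super().__init__()
        # Softplus keeps per-bin counts non-negative.
        self.head = nn.Sequential(nn.Linear(in_features, 256), nn.ReLU(),
                                  nn.Linear(256, num_bins), nn.Softplus())

    def forward(self, features):             # features: (B, in_features)
        hist = self.head(features)            # (B, num_bins) predicted per-bin counts
        count = hist.sum(dim=1)               # total object count = sum over bins
        return hist, count

def histogram_loss(pred_hist, pred_count, gt_hist, gt_count, alpha=1.0):
    # L1 distance between histograms plus an absolute count error term.
    return (pred_hist - gt_hist).abs().mean() + alpha * (pred_count - gt_count).abs().mean()
```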

Multi-Preference Actor Critic

Title Multi-Preference Actor Critic
Authors Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine
Abstract Policy gradient algorithms typically combine discounted future rewards with an estimated value function, to compute the direction and magnitude of parameter updates. However, for most Reinforcement Learning tasks, humans can provide additional insight to constrain the policy learning. We introduce a general method to incorporate multiple different feedback channels into a single policy gradient loss. In our formulation, the Multi-Preference Actor Critic (M-PAC), these different types of feedback are implemented as constraints on the policy. We use a Lagrangian relaxation to satisfy these constraints using gradient descent while learning a policy that maximizes rewards. Experiments in Atari and Pendulum verify that constraints are being respected and can accelerate the learning process.
Tasks
Published 2019-04-05
URL http://arxiv.org/abs/1904.03295v1
PDF http://arxiv.org/pdf/1904.03295v1.pdf
PWC https://paperswithcode.com/paper/multi-preference-actor-critic
Repo
Framework
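
A minimal sketch of the Lagrangian relaxation the abstract describes, folding several preference constraints into one policy-gradient loss. The feedback channels, constraint costs, and margins below are placeholders.

```python
# Minimal sketch (not the authors' code) of a Lagrangian relaxation that folds
# several preference constraints into a single policy-gradient loss. The number
# of feedback channels, the constraint costs and the margins are all stand-ins.
import torch

log_probs = torch.randn(32, requires_grad=True)        # stand-in for log pi(a|s) over a batch
advantages = torch.randn(32)                            # critic-estimated advantages
constraint_costs = [torch.rand(32) for _ in range(3)]   # per-channel violation costs (stand-ins)
margins = [0.1, 0.2, 0.05]                              # allowed violation per channel (assumed)
log_lambdas = [torch.tensor(0.0, requires_grad=True) for _ in range(3)]  # multipliers >= 0 via exp

pg_loss = -(log_probs * advantages).mean()
penalty = sum(ll.exp() * (c.mean() - m) for ll, c, m in zip(log_lambdas, constraint_costs, margins))
loss = pg_loss + penalty
loss.backward()
# In training, the policy parameters take a descent step on `loss`, while the
# multipliers take an ascent step, tightening whichever constraints are violated.
```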

Self-supervised Object Motion and Depth Estimation from Video

Title Self-supervised Object Motion and Depth Estimation from Video
Authors Qi Dai, Vaishakh Patil, Simon Hecker, Dengxin Dai, Luc Van Gool, Konrad Schindler
Abstract We present a self-supervised learning framework to estimate individual object motion and monocular depth from video. We model the object motion as a 6 degree-of-freedom rigid-body transformation. The instance segmentation mask is leveraged to introduce object-level information. Compared with methods that predict a pixel-wise optical flow map to model motion, our approach significantly reduces the number of values to be estimated. Furthermore, our system eliminates the scale ambiguity of predictions by employing the pre-computed camera ego-motion and the left-right photometric consistency. Experiments on the KITTI driving dataset demonstrate that our system captures object motion without external annotation and contributes to depth prediction in dynamic areas. Our system outperforms earlier self-supervised approaches in terms of 3D scene flow prediction and produces comparable results on optical flow estimation.
Tasks Depth Estimation, Instance Segmentation, Optical Flow Estimation, Semantic Segmentation
Published 2019-12-09
URL https://arxiv.org/abs/1912.04250v1
PDF https://arxiv.org/pdf/1912.04250v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-object-motion-and-depth
Repo
Framework
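
A minimal sketch of the 6-DoF rigid-body motion model mentioned above: three rotation angles plus a translation define a transform that is applied only to the points selected by an object's instance mask. The Euler-angle parameterization is my assumption.

```python
# Minimal sketch (my assumption of the parameterization): each object's motion
# as a 6-DoF rigid-body transform (3 Euler angles + translation), applied to the
# object's 3D points selected by its instance mask.
import numpy as np

def rigid_transform(rx, ry, rz, tx, ty, tz):
    cx, sx, cy, sy, cz, sz = np.cos(rx), np.sin(rx), np.cos(ry), np.sin(ry), np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx
    T[:3, 3] = [tx, ty, tz]
    return T

points = np.random.rand(100, 3)                 # 3D points back-projected from predicted depth
mask = np.random.rand(100) > 0.5                 # instance mask for one object (stand-in)
moved = (rigid_transform(0.0, 0.02, 0.0, 0.1, 0.0, 0.0) @
         np.c_[points[mask], np.ones(mask.sum())].T).T[:, :3]
```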

An Evaluation of Transfer Learning for Classifying Sales Engagement Emails at Large Scale

Title An Evaluation of Transfer Learning for Classifying Sales Engagement Emails at Large Scale
Authors Yong Liu, Pavel Dmitriev, Yifei Huang, Andrew Brooks, Li Dong
Abstract This paper conducts an empirical investigation to evaluate transfer learning for classifying sales engagement emails arising from digital sales engagement platforms. Given the complexity of the content and context of sales engagement, the lack of standardized large corpora and benchmarks, limited labeled examples, and the heterogeneous context of intent, this real-world use case poses both a challenge and an opportunity for adopting a transfer learning approach. We propose an evaluation framework to assess a high-performance transfer learning (HPTL) approach in three key areas in addition to commonly used accuracy metrics: 1) effective embeddings and pretrained language model usage, 2) minimum labeled sample requirements, and 3) transfer learning implementation strategies. We use in-house sales engagement email samples as the experiment dataset, which includes over 3,000 emails labeled as positive, objection, unsubscribe, or not-sure. We discuss our findings on evaluating BERT, ELMo, Flair and GloVe embeddings with both feature-based and fine-tuning approaches, and their scalability on a GPU cluster with increasingly larger labeled samples. Our results show that fine-tuning the BERT model outperforms all the feature-based approaches using different embeddings with as few as 300 labeled samples, but underperforms them when fewer than 300 labeled samples are available.
Tasks Accuracy Metrics, Language Modelling, Transfer Learning
Published 2019-04-19
URL http://arxiv.org/abs/1905.01971v1
PDF http://arxiv.org/pdf/1905.01971v1.pdf
PWC https://paperswithcode.com/paper/190501971
Repo
Framework
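
A minimal sketch of the feature-based side of the comparison above: frozen embeddings feed a simple classifier over the four email labels, trained on as few as 300 labeled samples. The embeddings and labels below are random stand-ins, not the in-house data.

```python
# Minimal sketch of a feature-based baseline: precomputed (frozen) sentence
# embeddings plus a simple classifier over the four labels. Features and labels
# here are random stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

labels = ["positive", "objection", "unsubscribe", "not-sure"]
X = np.random.randn(3000, 768)                    # stand-in for frozen BERT/ELMo/Flair/GloVe features
y = np.random.randint(0, len(labels), size=3000)  # stand-in labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=300, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # trained on 300 labeled samples
print("accuracy with 300 labeled samples:", clf.score(X_te, y_te))
```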

Adaptive Symmetric Reward Noising for Reinforcement Learning

Title Adaptive Symmetric Reward Noising for Reinforcement Learning
Authors Refael Vivanti, Talya D. Sohlberg-Baris, Shlomo Cohen, Orna Cohen
Abstract Recent reinforcement learning algorithms, though achieving impressive results in various fields, suffer from brittle training effects such as regression in results and high sensitivity to initialization and parameters. We claim that some of the brittleness stems from variance differences, i.e. when different environment areas - states and/or actions - have different reward variances. This causes two problems: first, the “Boring Areas Trap” in algorithms such as Q-learning, where moving between areas depends on the current area’s variance, and getting out of a boring area is hard due to its low variance; second, the “Manipulative Consultant” problem, where value-estimation functions used in DQN and Actor-Critic algorithms lead the agent to prefer boring areas, regardless of the mean reward return, because they maximize estimation precision rather than rewards. This sheds new light on how exploration contributes to training, as it helps with both challenges. Cognitive experiments in humans have shown that noised reward signals may paradoxically improve performance. We explain this using the two problems above, claiming that both humans and algorithms may share similar challenges. Inspired by this result, we propose Adaptive Symmetric Reward Noising (ASRN), which adds Gaussian noise to rewards according to their states’ estimated variance, thus avoiding the two problems while not affecting the environment’s mean reward behavior. We conduct our experiments in a multi-armed bandit problem with variance differences. We demonstrate that a Q-learning algorithm shows the brittleness effect in this problem and that the ASRN scheme can dramatically improve the results. We show that ASRN helps a DQN training process reach better results in an end-to-end autonomous driving task using the AirSim driving simulator.
Tasks Autonomous Driving, Q-Learning
Published 2019-05-24
URL https://arxiv.org/abs/1905.10144v1
PDF https://arxiv.org/pdf/1905.10144v1.pdf
PWC https://paperswithcode.com/paper/adaptive-symmetric-reward-noising-for
Repo
Framework
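
A minimal sketch of the reward-noising scheme as I read the abstract: keep a running per-state estimate of reward variance and add zero-mean Gaussian noise that tops each state up to a common target variance, leaving mean rewards unchanged in expectation. The target variance and tabular state indexing are assumptions.

```python
# Minimal sketch (my reading of the abstract): add zero-mean Gaussian noise to
# each reward, scaled by a running per-state variance estimate, so that all
# states end up with comparable reward variance. Target variance is assumed.
import numpy as np

class RewardNoiser:
    def __init__(self, n_states, target_var=1.0):
        self.mean = np.zeros(n_states)
        self.var = np.zeros(n_states)
        self.count = np.zeros(n_states)
        self.target_var = target_var

    def __call__(self, state, reward, rng=np.random):
        # Welford-style running estimate of the state's reward mean and variance.
        self.count[state] += 1
        delta = reward - self.mean[state]
        self.mean[state] += delta / self.count[state]
        self.var[state] += (delta * (reward - self.mean[state]) - self.var[state]) / self.count[state]
        # Top up the variance to the target; the mean reward is unchanged in expectation.
        extra = max(self.target_var - self.var[state], 0.0)
        return reward + rng.normal(0.0, np.sqrt(extra))
```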

Escaping Saddle Points with Adaptive Gradient Methods

Title Escaping Saddle Points with Adaptive Gradient Methods
Authors Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, Suvrit Sra
Abstract Adaptive methods such as Adam and RMSProp are widely used in deep learning but are not well understood. In this paper, we seek a crisp, clean and precise characterization of their behavior in nonconvex settings. To this end, we first provide a novel view of adaptive methods as preconditioned SGD, where the preconditioner is estimated in an online manner. By studying the preconditioner on its own, we elucidate its purpose: it rescales the stochastic gradient noise to be isotropic near stationary points, which helps escape saddle points. Furthermore, we show that adaptive methods can efficiently estimate the aforementioned preconditioner. By gluing together these two components, we provide the first (to our knowledge) second-order convergence result for any adaptive method. The key insight from our analysis is that, compared to SGD, adaptive methods escape saddle points faster, and can converge faster overall to second-order stationary points.
Tasks
Published 2019-01-26
URL https://arxiv.org/abs/1901.09149v2
PDF https://arxiv.org/pdf/1901.09149v2.pdf
PWC https://paperswithcode.com/paper/escaping-saddle-points-with-adaptive-gradient
Repo
Framework
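
A minimal sketch of the preconditioned-SGD view described above: an RMSProp-style update in which the inverse square root of a running second-moment estimate acts as a diagonal preconditioner on the stochastic gradient. Hyperparameters are typical defaults, not values from the paper.

```python
# Minimal sketch of adaptive methods as preconditioned SGD: the running second
# moment is estimated online and its inverse square root rescales the gradient,
# making the stochastic gradient noise roughly isotropic near stationary points.
import numpy as np

def rmsprop_step(w, grad, second_moment, lr=1e-3, beta=0.999, eps=1e-8):
    second_moment = beta * second_moment + (1 - beta) * grad**2    # online preconditioner estimate
    preconditioner = 1.0 / (np.sqrt(second_moment) + eps)           # diagonal preconditioner
    return w - lr * preconditioner * grad, second_moment
```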

Supervised Discriminative Sparse PCA for Com-Characteristic Gene Selection and Tumor Classification on Multiview Biological Data

Title Supervised Discriminative Sparse PCA for Com-Characteristic Gene Selection and Tumor Classification on Multiview Biological Data
Authors Chun-Mei Feng, Yong Xu, Jin-Xing Liu, Ying-Lian Gao, Chun-Hou Zheng
Abstract Principal Component Analysis (PCA) has been used to study the pathogenesis of diseases. To enhance the interpretability of classical PCA, various improved PCA methods have been proposed to date. Among these, a typical method is the so-called sparse PCA, which focuses on seeking sparse loadings. However, the performance of these methods is still far from satisfactory due to their reliance on unsupervised learning and the high class ambiguity within the samples. To overcome this problem, this study develops a new PCA method, named Supervised Discriminative Sparse PCA (SDSPCA). The main innovation of this method is the incorporation of discriminative information and sparsity into the PCA model. Specifically, in contrast to traditional sparse PCA, which imposes sparsity on the loadings, here sparse components are obtained to represent the data. Furthermore, via a linear transformation, the sparse components approximate the given label information. On the one hand, sparse components improve interpretability over traditional PCA, while on the other hand, they have discriminative abilities suitable for classification purposes. A simple algorithm is developed and its convergence proof is provided. SDSPCA has been applied to common characteristic gene (com-characteristic gene) selection and tumor classification on multi-view biological data. The sparsity and classification performance of SDSPCA are empirically verified via abundant, reasonable, and effective experiments, and the obtained results demonstrate that SDSPCA outperforms other state-of-the-art methods.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11837v1
PDF https://arxiv.org/pdf/1905.11837v1.pdf
PWC https://paperswithcode.com/paper/supervised-discriminative-sparse-pca-for-com
Repo
Framework
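
A plausible sketch of the kind of objective the abstract describes (not the authors' exact formulation): sparse components Q reconstruct the data X and, through a linear map, approximate the label matrix Y, with an L1 term placing sparsity on the components rather than on the loadings.

```python
# Minimal sketch (a plausible reading, not the paper's exact objective): sparse
# components Q reconstruct the data X via loadings A and approximate the label
# matrix Y via a linear map W, with an L1 penalty on the components.
import numpy as np

def sdspca_objective(X, Y, Q, A, W, alpha=1.0, beta=0.1):
    recon = np.linalg.norm(X - Q @ A) ** 2              # data reconstruction (Frobenius norm)
    discrim = alpha * np.linalg.norm(Y - Q @ W) ** 2     # supervised (label) approximation
    sparsity = beta * np.abs(Q).sum()                     # sparsity on the components
    return recon + discrim + sparsity
```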

Siamese Encoding and Alignment by Multiscale Learning with Self-Supervision

Title Siamese Encoding and Alignment by Multiscale Learning with Self-Supervision
Authors Eric Mitchell, Stefan Keselj, Sergiy Popovych, Davit Buniatyan, H. Sebastian Seung
Abstract We propose a method of aligning a source image to a target image, where the transform is specified by a dense vector field. The two images are encoded as feature hierarchies by siamese convolutional nets. Then a hierarchy of aligner modules computes the transform in a coarse-to-fine recursion. Each module receives as input the transform that was computed by the module at the level above, aligns the source and target encodings at the same level of the hierarchy, and then computes an improved approximation to the transform using a convolutional net. The entire architecture of encoder and aligner nets is trained in a self-supervised manner to minimize the squared error between source and target remaining after alignment. We show that siamese encoding enables more accurate alignment than the image pyramids of SPyNet, a previous deep learning approach to coarse-to-fine alignment. Furthermore, self-supervision applies even without target values for the transform, unlike the strongly supervised SPyNet. We also show that our approach outperforms one-shot approaches to alignment, because the fine pathways in the latter approach may fail to contribute to alignment accuracy when displacements are large. As shown by previous one-shot approaches, good results from self-supervised learning require that the loss function additionally penalize non-smooth transforms. We demonstrate that “masking out” the penalty function near discontinuities leads to correct recovery of non-smooth transforms. Our claims are supported by empirical comparisons using images from serial section electron microscopy of brain tissue.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02643v1
PDF http://arxiv.org/pdf/1904.02643v1.pdf
PWC https://paperswithcode.com/paper/siamese-encoding-and-alignment-by-multiscale
Repo
Framework
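
A minimal sketch of the coarse-to-fine recursion described above: starting from the coarsest level, each aligner refines the displacement field computed at the level above, using the source and target encodings at its own level. The aligner internals, the field upsampling, and the x2 rescaling are placeholders, not the paper's nets.

```python
# Minimal sketch of a coarse-to-fine alignment recursion over siamese feature
# pyramids. Pyramids are lists of feature maps, coarsest first; `aligners` is one
# placeholder module per level that outputs a 2-channel displacement refinement.
import torch
import torch.nn.functional as F

def align(source_pyramid, target_pyramid, aligners):
    b, _, h, w = source_pyramid[0].shape
    field = torch.zeros(b, 2, h, w)                       # start from the identity transform
    for src, tgt, aligner in zip(source_pyramid, target_pyramid, aligners):
        # Upsample the coarser field to this level and rescale its displacements.
        field = F.interpolate(field, size=src.shape[2:], mode='bilinear', align_corners=False) * 2
        # Each module refines the transform from the level above.
        field = field + aligner(torch.cat([src, tgt, field], dim=1))
    return field
```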

Graph-Based Method for Anomaly Prediction in Brain Network

Title Graph-Based Method for Anomaly Prediction in Brain Network
Authors Jalal Mirakhorli, Hamidreza Amindavar, Mojgan Mirakhorli
Abstract Resting-state functional MRI (rs-fMRI) and related functional neuroimaging techniques have advanced the study of brain disorders and dysfunction by mapping the topology of brain connections, i.e. connectopic mapping. Since there are only slight differences between healthy and unhealthy brain regions and functions, investigating the complex topology of functional and structural brain networks in humans is a complicated task as the evaluation criteria grow. Deep learning on irregular graphs has been widely applied to understanding human cognitive functions linked to gene expression and related distributed spatial patterns; because the neuronal networks of the brain can dynamically hold a variety of solutions with different activity patterns and functional connectivity, these applications can involve both node-centric and graph-centric tasks. In this paper, we take a novel approach that combines an individual generative model with high-order graph analysis to recognize regions of interest in the brain that lack normal connections while certain tasks are performed. We propose a high-order Graph Auto-Encoder (GAE) framework with a hypersphere distributer for functional data analysis in brain imaging studies, which exploits the underlying non-Euclidean structure when learning strong non-rigid graphs from large-scale data. In addition, we distinguish the possible modes of correlation in abnormal brain connections. Our findings show the degree of correlation between the affected regions and their simultaneous occurrence over time, which can be used to diagnose brain diseases or to reveal the ability of the nervous system to modify brain topology in response to input stimuli, i.e. brain plasticity.
Tasks Anomaly Detection
Published 2019-04-15
URL https://arxiv.org/abs/1904.07163v7
PDF https://arxiv.org/pdf/1904.07163v7.pdf
PWC https://paperswithcode.com/paper/graph-based-method-for-anomaly-detection-in
Repo
Framework
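
A minimal sketch of a plain graph auto-encoder of the kind the framework above builds on: nodes are encoded with one propagation step, edges are decoded by an inner product, and connections with high reconstruction error can be flagged as candidates for abnormal connectivity. The paper's hypersphere distributer and high-order terms are not modeled here.

```python
# Minimal sketch of a basic graph auto-encoder: one GCN-style encoding layer,
# an inner-product decoder, and per-edge reconstruction error as an anomaly cue.
import numpy as np

def normalize_adj(A):
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def gae_reconstruction_error(A, X, W):
    Z = np.maximum(normalize_adj(A) @ X @ W, 0.0)    # node embeddings (one propagation step + ReLU)
    A_rec = 1.0 / (1.0 + np.exp(-Z @ Z.T))           # inner-product edge decoder
    return (A - A_rec) ** 2                           # high values flag unusual connections
```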

Structure Matters: Towards Generating Transferable Adversarial Images

Title Structure Matters: Towards Generating Transferable Adversarial Images
Authors Dan Peng, Zizhan Zheng, Linhao Luo, Xiaofeng Zhang
Abstract Recent works on adversarial examples for image classification focus on directly modifying pixels with minor perturbations. The small-perturbation requirement is imposed to ensure that the generated adversarial examples remain natural and realistic to humans; however, it restricts the attack space, limiting attack ability and transferability, especially for systems protected by a defense mechanism. In this paper, we propose the novel concepts of structure patterns and structure-aware perturbations, which relax the small-perturbation constraint while still keeping images natural. The key idea of our approach is to allow perceptible deviation in adversarial examples while keeping the structure patterns that are central to a human classifier. Built upon these concepts, we propose a structure-preserving attack (SPA) for generating natural adversarial examples with extremely high transferability. Empirical results on the MNIST and CIFAR10 datasets show that SPA exhibits strong attack ability in both the white-box and black-box settings, even when defenses are applied. Moreover, with the integration of the PGD or CW attack, its attack ability escalates sharply under the white-box setting, without losing the outstanding transferability inherited from SPA.
Tasks Image Classification
Published 2019-10-22
URL https://arxiv.org/abs/1910.09821v2
PDF https://arxiv.org/pdf/1910.09821v2.pdf
PWC https://paperswithcode.com/paper/structure-matters-towards-generating
Repo
Framework

Automatic Conditional Generation of Personalized Social Media Short Texts

Title Automatic Conditional Generation of Personalized Social Media Short Texts
Authors Ziwen Wang, Jie Wang, Haiqian Gu, Fei Su, Bojin Zhuang
Abstract Automatic text generation has received much attention owing to the rapid development of deep neural networks. In general, text generation systems based on a statistical language model do not consider anthropomorphic characteristics, which results in machine-like generated texts. To fill this gap, we propose a conditional language generation model that takes Big Five Personality (BFP) feature vectors as input context and writes human-like short texts. The short-text generator consists of a long short-term memory (LSTM) network layer, where a BFP feature vector is concatenated as part of the input to each cell. To enable supervised training of the generation model, a text classification model based on a convolutional neural network (CNN) is used to prepare BFP-tagged Chinese micro-blog corpora. Validated by a BFP linguistic computational model, our generated Chinese short texts exhibit discriminative personality styles, and they are also syntactically correct and semantically smooth, with appropriate emoticons. By combining natural language generation with psychological linguistics, our proposed BFP-dependent text generation model can be widely used for individualization in machine translation, image captioning, dialogue generation and so on.
Tasks Dialogue Generation, Language Modelling, Machine Translation, Text Classification, Text Generation
Published 2019-06-15
URL https://arxiv.org/abs/1906.09324v1
PDF https://arxiv.org/pdf/1906.09324v1.pdf
PWC https://paperswithcode.com/paper/automatic-conditional-generation-of
Repo
Framework
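
A minimal sketch of the conditioning mechanism described above, under the assumption that the Big Five Personality vector is simply concatenated to each token embedding before the LSTM. Vocabulary and layer sizes are placeholders.

```python
# Minimal sketch (my assumption of the wiring): a Big Five Personality vector is
# concatenated to every token embedding before the LSTM, conditioning generation
# on the personality profile. Sizes are placeholders.
import torch
import torch.nn as nn

class BFPGenerator(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, bfp_dim=5, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim + bfp_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, bfp):                        # tokens: (B, T), bfp: (B, 5)
        x = self.embed(tokens)                              # (B, T, embed_dim)
        bfp = bfp.unsqueeze(1).expand(-1, x.size(1), -1)    # repeat the BFP vector at every step
        h, _ = self.lstm(torch.cat([x, bfp], dim=-1))
        return self.out(h)                                  # next-token logits
```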

Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot

Title Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot
Authors Ron Ziv, Alex Dikopoltsev, Tom Zahavy, Ittai Rubinstein, Pavel Sidorenko, Oren Cohen, Mordechai Segev
Abstract We propose a simple all-in-line single-shot scheme for the diagnostics of ultrashort laser pulses, consisting of a multi-mode fiber, a nonlinear crystal and a CCD camera. The system records a 2D spatial intensity pattern, from which the pulse shape (amplitude and phase) is recovered through a fast deep learning algorithm. We explore this scheme in simulations and demonstrate the recovery of ultrashort pulses, robustness to noise in the measurements, and robustness to inaccuracies in the parameters of the system components. Our technique mitigates the need for the commonly used iterative optimization reconstruction methods, which are usually slow and hampered by the presence of noise. These features make our concept system advantageous for real-time probing of ultrafast processes and for noisy conditions. Moreover, this work exemplifies that, using deep learning, we can unlock new types of systems for pulse recovery.
Tasks
Published 2019-11-23
URL https://arxiv.org/abs/1911.10326v1
PDF https://arxiv.org/pdf/1911.10326v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-reconstruction-of-ultrashort
Repo
Framework
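
A minimal sketch of the learning setup described above: a small CNN maps the recorded 2D intensity pattern to a sampled amplitude and phase of the pulse. The architecture and sample count are placeholders, not the network used in the paper.

```python
# Minimal sketch (layer sizes and depths are placeholders): a small CNN that maps
# the recorded 2D intensity pattern to a sampled pulse amplitude and phase.
import torch
import torch.nn as nn

class PulseNet(nn.Module):
    def __init__(self, n_samples=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.head = nn.Linear(32 * 4 * 4, 2 * n_samples)   # amplitude and phase per time sample

    def forward(self, pattern):                 # pattern: (B, 1, H, W) measured intensity
        amp, phase = self.head(self.features(pattern)).chunk(2, dim=-1)
        return amp, phase
```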

Machine Learning in Least-Squares Monte Carlo Proxy Modeling of Life Insurance Companies

Title Machine Learning in Least-Squares Monte Carlo Proxy Modeling of Life Insurance Companies
Authors Anne-Sophie Krah, Zoran Nikolić, Ralf Korn
Abstract Under the Solvency II regime, life insurance companies are asked to derive their solvency capital requirements from the full loss distributions over the coming year. Since the industry is currently far from being endowed with sufficient computational capacities to fully simulate these distributions, the insurers have to rely on suitable approximation techniques such as the least-squares Monte Carlo (LSMC) method. The key idea of LSMC is to run only a few wisely selected simulations and to process their output further to obtain a risk-dependent proxy function of the loss. In this paper, we present and analyze various adaptive machine learning approaches that can take over the proxy modeling task. The studied approaches range from ordinary and generalized least-squares regression variants over GLM and GAM methods to MARS and kernel regression routines. We justify the combinability of their regression ingredients in a theoretical discourse. Further, we illustrate the approaches in slightly disguised real-world experiments and perform comprehensive out-of-sample tests.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02182v1
PDF https://arxiv.org/pdf/1909.02182v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-in-least-squares-monte-carlo
Repo
Framework
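
A minimal sketch of the LSMC proxy idea at the heart of the approaches above: fit an ordinary least-squares polynomial in the risk factors to a small set of simulated losses, then evaluate the fitted proxy instead of running full nested simulations. The scenarios and losses below are synthetic stand-ins.

```python
# Minimal sketch of a least-squares Monte Carlo proxy: an OLS polynomial in the
# risk factors, fitted to a few simulated losses and reused as a cheap surrogate.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

risk_factors = np.random.uniform(-1, 1, size=(500, 3))    # wisely selected outer scenarios (stand-in)
losses = np.random.randn(500)                               # inner-simulation loss estimates (stand-in)

design = PolynomialFeatures(degree=2).fit_transform(risk_factors)
proxy = LinearRegression().fit(design, losses)              # the ordinary least-squares proxy function

new_scenarios = PolynomialFeatures(degree=2).fit_transform(np.random.uniform(-1, 1, size=(10, 3)))
print(proxy.predict(new_scenarios))                          # cheap loss evaluation on new scenarios
```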

Potential adversarial samples for white-box attacks

Title Potential adversarial samples for white-box attacks
Authors Amir Nazemi, Paul Fieguth
Abstract Deep convolutional neural networks can be highly vulnerable to small perturbations of their inputs, potentially a major issue or limitation on system robustness when using deep networks as classifiers. In this paper we propose a low-cost method to explore marginal sample data near trained classifier decision boundaries, thus identifying potential adversarial samples. By finding such adversarial samples it is possible to reduce the search space of adversarial attack algorithms while keeping a reasonable successful perturbation rate. In our developed strategy, the potential adversarial samples represent only 61% of the test data, but in fact cover more than 82% of the adversarial samples produced by iFGSM and 92% of the adversarial samples successfully perturbed by DeepFool on CIFAR10.
Tasks Adversarial Attack
Published 2019-12-13
URL https://arxiv.org/abs/1912.06409v1
PDF https://arxiv.org/pdf/1912.06409v1.pdf
PWC https://paperswithcode.com/paper/potential-adversarial-samples-for-white-box
Repo
Framework
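
One plausible criterion for the "marginal samples" mentioned above (not necessarily the paper's exact rule): flag test samples whose top two class scores are close, i.e. samples near a decision boundary, and restrict the adversarial search to them.

```python
# Minimal sketch (one plausible criterion, not necessarily the paper's): samples
# with a small gap between the top two class scores sit near a decision boundary
# and are prioritized as potential adversarial samples.
import numpy as np

def marginal_samples(logits, threshold=0.5):
    # logits: (N, C) classifier outputs for the test set.
    top2 = np.sort(logits, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]           # gap between best and runner-up class
    return margin < threshold                   # candidates worth attacking first

logits = np.random.randn(1000, 10)              # stand-in classifier scores
print("fraction of candidates:", marginal_samples(logits).mean())
```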

Performance Monitoring for End-to-End Speech Recognition

Title Performance Monitoring for End-to-End Speech Recognition
Authors Ruizhi Li, Gregory Sell, Hynek Hermansky
Abstract Measuring the performance of an automatic speech recognition (ASR) system without ground truth could be beneficial in many scenarios, especially with data from unseen domains, where performance can be highly inconsistent. For conventional ASR systems, several performance monitoring (PM) techniques have been well developed to monitor performance by looking at tri-phone posteriors or pre-softmax activations from the neural network acoustic model. However, strategies for monitoring more recently developed end-to-end ASR systems have not yet been explored, and that is the focus of this paper. We adapt previous PM measures (entropy, M-measure and auto-encoder) and apply our proposed RNN predictor in the end-to-end setting. These measures utilize the decoder output layer and attention probability vectors, and their predictive power is measured with simple linear models. Our findings suggest that decoder-level features are more feasible and informative than attention-level probabilities for PM measures, and that the M-measure on the decoder posteriors achieves the best overall predictive performance, with an average prediction error of 8.8%. Entropy measures and RNN-based prediction also show competitive predictability, especially for unseen conditions.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2019-04-09
URL http://arxiv.org/abs/1904.04896v1
PDF http://arxiv.org/pdf/1904.04896v1.pdf
PWC https://paperswithcode.com/paper/performance-monitoring-for-end-to-end-speech
Repo
Framework
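
A minimal sketch of an entropy-based PM measure of the kind evaluated above: compute the mean entropy of the decoder's per-step output posteriors for each utterance and relate it to the error rate with a simple linear model. The posteriors and error rates below are synthetic stand-ins.

```python
# Minimal sketch of an entropy PM measure on decoder posteriors, with a linear
# model relating the measure to word error rate. All data here are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

def mean_entropy(posteriors):
    # posteriors: (T, V) per-step decoder distributions over the output vocabulary.
    p = np.clip(posteriors, 1e-12, 1.0)
    return float(np.mean(-(p * np.log(p)).sum(axis=1)))

utterance_measures = np.array([[mean_entropy(np.random.dirichlet(np.ones(100), size=50))]
                               for _ in range(200)])        # one PM value per utterance
word_error_rates = np.random.rand(200)                       # stand-in ground-truth WERs
predictor = LinearRegression().fit(utterance_measures, word_error_rates)
```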