April 3, 2020

Paper Group ANR 49

Large Scale Many-Objective Optimization Driven by Distributional Adversarial Networks


Title	Large Scale Many-Objective Optimization Driven by Distributional Adversarial Networks
Authors	Zhenyu Liang, Yunfan Li, Zhongwei Wan
Abstract	Estimation of distribution algorithms (EDAs) are a class of evolutionary algorithms (EAs) for stochastic optimization that establish a probability model to describe the distribution of solutions and randomly sample that model to create offspring, optimizing both the model and the population. The Reference Vector Guided Evolutionary Algorithm (RVEA), based on the EDA framework, performs better on many-objective optimization problems (MaOPs). Besides, using generative adversarial networks instead of crossover and mutation to generate offspring solutions is also a state-of-the-art idea in EAs. In this paper, we propose a novel algorithm based on the RVEA [1] framework that uses Distributional Adversarial Networks (DAN) [2] to generate new offspring. DAN uses a new distributional framework for adversarial training of neural networks and operates on genuine samples rather than a single point; this framework leads to more stable training and markedly better mode coverage than single-point-sample methods. Thereby, DAN can quickly generate offspring with high convergence with respect to the same distribution of data. In addition, we use Large-Scale Multi-Objective Optimization Based on A Competitive Swarm Optimizer (LMOCSO) [3], which adopts a new two-stage strategy to update positions, in order to significantly increase the search efficiency for finding optimal solutions in a huge decision space. The proposed algorithm will be tested on 9 benchmark problems from the large-scale multi-objective problems (LSMOP) suite. To measure performance, we will compare the proposed algorithm with some state-of-the-art EAs, e.g., RM-MEDA [4], MO-CMA [10], and NSGA-II.
Tasks	Stochastic Optimization
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07013v1
PDF	https://arxiv.org/pdf/2003.07013v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-many-objective-optimization
Repo
Framework

Text-based inference of moral sentiment change


Title	Text-based inference of moral sentiment change
Authors	Jing Yi Xie, Renato Ferreira Pinto Jr., Graeme Hirst, Yang Xu
Abstract	We present a text-based framework for investigating moral sentiment change of the public via longitudinal corpora. Our framework is based on the premise that language use can inform people’s moral perception toward right or wrong, and we build our methodology by exploring moral biases learned from diachronic word embeddings. We demonstrate how a parameter-free model supports inference of historical shifts in moral sentiment toward concepts such as slavery and democracy over centuries at three incremental levels: moral relevance, moral polarity, and fine-grained moral dimensions. We apply this methodology to visualizing moral time courses of individual concepts and analyzing the relations between psycholinguistic variables and rates of moral sentiment change at scale. Our work offers opportunities for applying natural language processing toward characterizing moral sentiment change in society.
Tasks	Word Embeddings
Published	2020-01-20
URL	https://arxiv.org/abs/2001.07209v1
PDF	https://arxiv.org/pdf/2001.07209v1.pdf
PWC	https://paperswithcode.com/paper/text-based-inference-of-moral-sentiment-1
Repo
Framework

Conjugate-gradient-based Adam for stochastic optimization and its application to deep learning


Title	Conjugate-gradient-based Adam for stochastic optimization and its application to deep learning
Authors	Yu Kobayashi, Hideaki Iiduka
Abstract	This paper proposes a conjugate-gradient-based Adam algorithm blending Adam with nonlinear conjugate gradient methods and shows its convergence analysis. Numerical experiments on text classification and image classification show that the proposed algorithm can train deep neural network models in fewer epochs than the existing adaptive stochastic optimization algorithms can.
Tasks	Image Classification, Stochastic Optimization, Text Classification
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00231v2
PDF	https://arxiv.org/pdf/2003.00231v2.pdf
PWC	https://paperswithcode.com/paper/conjugate-gradient-based-adam-for-stochastic
Repo
Framework

Non-Asymptotic Bounds for Zeroth-Order Stochastic Optimization


Title	Non-Asymptotic Bounds for Zeroth-Order Stochastic Optimization
Authors	Nirav Bhavsar, Prashanth L A
Abstract	We consider the problem of optimizing an objective function with and without convexity in a simulation-optimization context, where only stochastic zeroth-order information is available. We consider two techniques for estimating gradient/Hessian, namely simultaneous perturbation (SP) and Gaussian smoothing (GS). We introduce an optimization oracle to capture a setting where the function measurements have an estimation error that can be controlled. Our oracle is appealing in several practical contexts where the objective has to be estimated from i.i.d. samples, and increasing the number of samples reduces the estimation error. In the stochastic non-convex optimization context, we analyze the zeroth-order variant of the randomized stochastic gradient (RSG) and quasi-Newton (RSQN) algorithms with a biased gradient/Hessian oracle, and with its variant involving an estimation error component. In particular, we provide non-asymptotic bounds on the performance of both algorithms, and our results provide a guideline for choosing the batch size for estimation, so that the overall error bound matches with the one obtained when there is no estimation error. Next, in the stochastic convex optimization setting, we provide non-asymptotic bounds that hold in expectation for the last iterate of a stochastic gradient descent (SGD) algorithm, and our bound for the GS variant of SGD matches the bound for SGD with unbiased gradient information. We perform simulation experiments on synthetic as well as real-world datasets, and the empirical results validate the theoretical findings.
Tasks	Stochastic Optimization
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11440v1
PDF	https://arxiv.org/pdf/2002.11440v1.pdf
PWC	https://paperswithcode.com/paper/non-asymptotic-bounds-for-zeroth-order
Repo
Framework
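
The Gaussian smoothing (GS) estimator mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' code: the function names `gs_gradient` and `sphere`, the smoothing parameter `mu`, and the sample count are assumptions chosen for illustration, and the one-sided difference used here is only one common form of the estimator. It builds a gradient estimate from function values alone by averaging finite differences along random Gaussian directions.

```python
import numpy as np

def gs_gradient(f, x, mu=1e-3, num_samples=1000, rng=None):
    """Estimate grad f(x) from function values only (Gaussian smoothing)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    fx = f(x)
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)        # random Gaussian direction
        grad += (f(x + mu * u) - fx) / mu * u   # one-sided difference along u
    return grad / num_samples

def sphere(x):
    """Hypothetical test objective with known gradient 2x."""
    return float(np.sum(x ** 2))

x0 = np.array([1.0, -2.0, 0.5])
g_hat = gs_gradient(sphere, x0, num_samples=20000, rng=np.random.default_rng(0))
print(g_hat)  # should be close to 2 * x0, up to sampling noise
```

Increasing `num_samples` reduces the variance of the estimate, which loosely mirrors the batch-size trade-off the abstract describes.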

Human-to-Robot Attention Transfer for Robot Execution Failure Avoidance Using Stacked Neural Networks


Title	Human-to-Robot Attention Transfer for Robot Execution Failure Avoidance Using Stacked Neural Networks
Authors	Boyi Song, Yuntao Peng, Ruijiao Luo, Rui Liu
Abstract	Due to world dynamics and hardware uncertainty, robots inevitably fail in task executions, leading to undesired or even dangerous executions. To avoid failures for improved robot performance, it is critical to identify and correct robot abnormal executions in an early stage. However, limited by reasoning capability and knowledge level, it is challenging for a robot to self-diagnose and correct its abnormal behaviors. To solve this problem, a novel method, human-to-robot attention transfer (H2R-AT), is proposed to seek help from a human. H2R-AT is developed based on a novel stacked neural networks model, transferring human attention embedded in verbal reminders to robot attention embedded in robot visual perceiving. With the attention transfer from a human, a robot understands what and where human concerns are to identify and correct its abnormal executions. To validate the effectiveness of H2R-AT, two representative task scenarios, “serve water for a human in a kitchen” and “pick up a defective gear in a factory” with abnormal robot executions, were designed in an open-access simulation platform V-REP; 252 volunteers were recruited to provide about 12000 verbal reminders to learn and test the attention transfer model H2R-AT. With an accuracy of 73.68% in transferring attention and accuracy of 66.86% in avoiding robot execution failures, the effectiveness of H2R-AT was validated.
Tasks
Published	2020-02-11
URL	https://arxiv.org/abs/2002.04242v1
PDF	https://arxiv.org/pdf/2002.04242v1.pdf
PWC	https://paperswithcode.com/paper/human-to-robot-attention-transfer-for-robot
Repo
Framework

A deep-learning view of chemical space designed to facilitate drug discovery


Title	A deep-learning view of chemical space designed to facilitate drug discovery
Authors	Paul Maragakis, Hunter Nisonoff, Brian Cole, David E. Shaw
Abstract	Drug discovery projects entail cycles of design, synthesis, and testing that yield a series of chemically related small molecules whose properties, such as binding affinity to a given target protein, are progressively tailored to a particular drug discovery goal. The use of deep learning technologies could augment the typical practice of using human intuition in the design cycle, and thereby expedite drug discovery projects. Here we present DESMILES, a deep neural network model that advances the state of the art in machine learning approaches to molecular design. We applied DESMILES to a previously published benchmark that assesses the ability of a method to modify input molecules to inhibit the dopamine receptor D2, and DESMILES yielded a 77% lower failure rate compared to state-of-the-art models. To explain the ability of DESMILES to hone molecular properties, we visualize a layer of the DESMILES network, and further demonstrate this ability by using DESMILES to tailor the same molecules used in the D2 benchmark test to dock more potently against seven different receptors.
Tasks	Drug Discovery
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02948v1
PDF	https://arxiv.org/pdf/2002.02948v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-view-of-chemical-space
Repo
Framework

Optimizing Geometry Compression using Quantum Annealing


Title	Optimizing Geometry Compression using Quantum Annealing
Authors	Sebastian Feld, Markus Friedrich, Claudia Linnhoff-Popien
Abstract	The compression of geometry data is an important aspect of bandwidth-efficient data transfer for distributed 3d computer vision applications. We propose a quantum-enabled lossy 3d point cloud compression pipeline based on the constructive solid geometry (CSG) model representation. Key parts of the pipeline are mapped to NP-complete problems for which an efficient Ising formulation suitable for the execution on a Quantum Annealer exists. We describe existing Ising formulations for the maximum clique search problem and the smallest exact cover problem, both of which are important building blocks of the proposed compression pipeline. Additionally, we discuss the properties of the overall pipeline regarding result optimality and described Ising formulations.
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13253v1
PDF	https://arxiv.org/pdf/2003.13253v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-geometry-compression-using-quantum
Repo
Framework

Semi-supervised Disentanglement with Independent Vector Variational Autoencoders


Title	Semi-supervised Disentanglement with Independent Vector Variational Autoencoders
Authors	Bo-Kyeong Kim, Sungjin Park, Geonmin Kim, Soo-Young Lee
Abstract	We aim to separate the generative factors of data into two latent vectors in a variational autoencoder. One vector captures class factors relevant to target classification tasks, while the other vector captures style factors relevant to the remaining information. To learn the discrete class features, we introduce supervision using a small amount of labeled data, which can simply yet effectively reduce the effort required for hyperparameter tuning performed in existing unsupervised methods. Furthermore, we introduce a learning objective to encourage statistical independence between the vectors. We show that (i) this vector independence term exists within the result obtained on decomposing the evidence lower bound with multiple latent vectors, and (ii) encouraging such independence along with reducing the total correlation within the vectors enhances disentanglement performance. Experiments conducted on several image datasets demonstrate that the disentanglement achieved via our method can improve classification performance and generation controllability.
Tasks
Published	2020-03-14
URL	https://arxiv.org/abs/2003.06581v1
PDF	https://arxiv.org/pdf/2003.06581v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-disentanglement-with
Repo
Framework

Convergence analysis of particle swarm optimization using stochastic Lyapunov functions and quantifier elimination


Title	Convergence analysis of particle swarm optimization using stochastic Lyapunov functions and quantifier elimination
Authors	Maximilian Gerwien, Rick Voßwinkel, Hendrik Richter
Abstract	This paper adds to the discussion about theoretical aspects of particle swarm stability by proposing to employ stochastic Lyapunov functions and to determine the convergence set by quantifier elimination. We present a computational procedure and show that this approach leads to reevaluation and extension of previously known stability regions for PSO using a Lyapunov approach under stagnation assumptions.
Tasks
Published	2020-02-05
URL	https://arxiv.org/abs/2002.01673v1
PDF	https://arxiv.org/pdf/2002.01673v1.pdf
PWC	https://paperswithcode.com/paper/convergence-analysis-of-particle-swarm
Repo
Framework

Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions


Title	Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions
Authors	Toni Karvonen, George Wynne, Filip Tronarp, Chris J. Oates, Simo Särkkä
Abstract	Despite the ubiquity of the Gaussian process regression model, few theoretical results are available that account for the fact that parameters of the covariance kernel typically need to be estimated from the dataset. This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset. Specifically, we consider the scenario where the scale parameter of a Sobolev kernel (such as a Matérn kernel) is estimated by maximum likelihood. We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model in the sense that the model can become “slowly” overconfident at worst, regardless of the difference between the smoothness of the data-generating function and that expected by the model. The analysis is based on a combination of techniques from nonparametric regression and scattered data interpolation. Empirical results are provided in support of the theoretical findings.
Tasks
Published	2020-01-29
URL	https://arxiv.org/abs/2001.10965v2
PDF	https://arxiv.org/pdf/2001.10965v2.pdf
PWC	https://paperswithcode.com/paper/maximum-likelihood-estimation-and-uncertainty
Repo
Framework
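
To make the scale-parameter estimate in the abstract above concrete, here is a minimal sketch, not taken from the paper. It assumes a zero-mean Gaussian process with covariance sigma^2 * k0 and noiseless data, in which case the maximum likelihood estimate of sigma^2 has the standard closed form y^T K0^{-1} y / n. The Matérn-3/2 kernel, the fixed length-scale, the small `jitter` term, and the names `matern32` and `scale_mle` are all illustrative assumptions.

```python
import numpy as np

def matern32(x1, x2, lengthscale=0.5):
    """Unit-scale Matern-3/2 kernel on 1-D inputs."""
    r = np.abs(x1[:, None] - x2[None, :]) / lengthscale
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def scale_mle(x, y, lengthscale=0.5, jitter=1e-10):
    """Closed-form MLE of sigma^2 for k = sigma^2 * k0 with noiseless data."""
    K0 = matern32(x, x, lengthscale) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K0)       # K0 = L L^T
    alpha = np.linalg.solve(L, y)    # alpha = L^{-1} y, so alpha.alpha = y^T K0^{-1} y
    return float(alpha @ alpha) / len(x)

x = np.linspace(0.0, 1.0, 15)
y = np.sin(2.0 * np.pi * x)          # a deterministic function, no noise
sigma2_hat = scale_mle(x, y)
print(sigma2_hat)
```

Because the estimate is quadratic in `y`, rescaling the data by a factor c rescales `sigma2_hat` by c squared, which is a quick sanity check on any implementation.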

Estimation of Z-Thickness and XY-Anisotropy of Electron Microscopy Images using Gaussian Processes


Title	Estimation of Z-Thickness and XY-Anisotropy of Electron Microscopy Images using Gaussian Processes
Authors	Thanuja D. Ambegoda, Julien N. P. Martel, Jozef Adamcik, Matthew Cook, Richard H. R. Hahnloser
Abstract	Serial section electron microscopy (ssEM) is a widely used technique for obtaining volumetric information of biological tissues at nanometer scale. However, accurate 3D reconstructions of identified cellular structures and volumetric quantifications require precise estimates of section thickness and anisotropy (or stretching) along the XY imaging plane. In fact, many image processing algorithms simply assume isotropy within the imaging plane. To ameliorate this problem, we present a method for estimating thickness and stretching of electron microscopy sections using non-parametric Bayesian regression of image statistics. We verify our thickness and stretching estimates using direct measurements obtained by atomic force microscopy (AFM) and show that our method has a lower estimation error compared to a recent indirect thickness estimation method as well as a relative Z coordinate estimation method. Furthermore, we have made the first dataset of ssSEM images with directly measured section thickness values publicly available for the evaluation of indirect thickness estimation methods.
Tasks	Gaussian Processes
Published	2020-02-01
URL	https://arxiv.org/abs/2002.00228v2
PDF	https://arxiv.org/pdf/2002.00228v2.pdf
PWC	https://paperswithcode.com/paper/estimation-of-z-thickness-and-xy-anisotropy
Repo
Framework

Dimensionality Reduction and Motion Clustering during Activities of Daily Living: 3, 4, and 7 Degree-of-Freedom Arm Movements


Title	Dimensionality Reduction and Motion Clustering during Activities of Daily Living: 3, 4, and 7 Degree-of-Freedom Arm Movements
Authors	Yuri Gloumakov, Adam J. Spiers, Aaron M. Dollar
Abstract	The wide variety of motions performed by the human arm during daily tasks makes it desirable to find representative subsets to reduce the dimensionality of these movements for a variety of applications, including the design and control of robotic and prosthetic devices. This paper presents a novel method and the results of an extensive human subjects study to obtain representative arm joint angle trajectories that span naturalistic motions during Activities of Daily Living (ADLs). In particular, we seek to identify sets of useful motion trajectories of the upper limb that are functions of a single variable, allowing, for instance, an entire prosthetic or robotic arm to be controlled with a single input from a user, along with a means to select between motions for different tasks. Data driven approaches are used to obtain clusters as well as representative motion averages for the full-arm 7 degree of freedom (DOF), elbow-wrist 4 DOF, and wrist-only 3 DOF motions. The proposed method makes use of well-known techniques such as dynamic time warping (DTW) to obtain a divergence measure between motion segments, DTW barycenter averaging (DBA) to obtain averages, Ward’s distance criterion to build hierarchical trees, batch-DTW to simultaneously align multiple motion data, and functional principal component analysis (fPCA) to evaluate cluster variability. The clusters that emerge associate various recorded motions into primarily hand start and end location for the full-arm system, motion direction for the wrist-only system, and an intermediate between the two qualities for the elbow-wrist system. The proposed clustering methodology is justified by comparing results against alternative approaches.
Tasks	Dimensionality Reduction
Published	2020-02-17
URL	https://arxiv.org/abs/2003.02641v1
PDF	https://arxiv.org/pdf/2003.02641v1.pdf
PWC	https://paperswithcode.com/paper/dimensionality-reduction-and-motion
Repo
Framework
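
The dynamic time warping (DTW) divergence named in the abstract above can be sketched with the classic dynamic-programming recurrence. This is a textbook version under stated assumptions, not the authors' pipeline: the function name `dtw_distance`, the Euclidean per-frame cost, and the unconstrained warping path are choices made for illustration. Trajectories are arrays of shape (time, joints), so a 7-DOF arm motion would have seven columns.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two trajectories with Euclidean frame cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]               # treat a 1-D sequence as (time, 1)
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# The same motion sampled at different speeds has zero DTW distance.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

A pairwise matrix of such distances is the kind of input that hierarchical clustering with Ward's criterion would consume.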

Activation Density driven Energy-Efficient Pruning in Training


Title	Activation Density driven Energy-Efficient Pruning in Training
Authors	Timothy Foldy-Porto, Priyadarshini Panda
Abstract	The process of neural network pruning with suitable fine-tuning and retraining can yield networks with considerably fewer parameters than the original with comparable degrees of accuracy. Typically, pruning methods require large, pre-trained networks as a starting point from which they perform a time-intensive iterative pruning and retraining algorithm. We propose a novel pruning in-training method that prunes a network in real time during training, reducing the overall training time to achieve an optimal compressed network. To do so, we introduce an activation density based analysis that identifies the optimal relative sizing or compression for each layer of the network. Our method removes the need for pre-training and is architecture agnostic, allowing it to be employed on a wide variety of systems. For VGG-19 and ResNet18 on CIFAR-10, CIFAR-100, and TinyImageNet, we obtain exceedingly sparse networks (up to 200x reduction in parameters and >60x reduction in inference compute operations in the best case) with comparable accuracies (up to 2%-3% loss with respect to the baseline network). By reducing the network size periodically during training, we achieve total training times that are shorter than those of previously proposed pruning methods. Furthermore, training compressed networks at different epochs with our proposed method yields considerable reduction in training compute complexity (1.6x-3.2x lower) at near iso-accuracy as compared to a baseline network trained entirely from scratch.
Tasks	Network Pruning
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02949v1
PDF	https://arxiv.org/pdf/2002.02949v1.pdf
PWC	https://paperswithcode.com/paper/activation-density-driven-energy-efficient
Repo
Framework
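
As a rough illustration of the activation-density idea in the abstract above, the sketch below measures the fraction of non-zero post-ReLU activations in a layer and uses it to propose a smaller layer width. Both the density definition and the sizing rule (width times density, rounded up) are assumptions made for this example, as are the names `activation_density` and `suggested_width`; the abstract does not spell out the authors' exact procedure.

```python
import math
import numpy as np

def activation_density(activations):
    """Fraction of non-zero entries in a layer's activations."""
    a = np.asarray(activations)
    return np.count_nonzero(a) / a.size

def suggested_width(current_width, density):
    """Hypothetical sizing rule: shrink the layer in proportion to its density."""
    return max(1, math.ceil(current_width * density))

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 10))    # a batch of 32 inputs
W = rng.standard_normal((10, 64))    # one dense layer with 64 units
h = np.maximum(x @ W, 0.0)           # ReLU activations
density = activation_density(h)
new_width = suggested_width(64, density)
print(density, new_width)
```

In a training loop, such a measurement would be repeated periodically so that layer sizes can be revised while the network is still learning.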

PointINS: Point-based Instance Segmentation


Title	PointINS: Point-based Instance Segmentation
Authors	Lu Qi, Xiangyu Zhang, Yingcong Chen, Yukang Chen, Jian Sun, Jiaya Jia
Abstract	A single-point feature has shown its effectiveness in object detection. However, for instance segmentation, it does not lead to satisfactory results. The reasons are twofold. Firstly, it has limited representation capacity. Secondly, it could be misaligned with potential instances. To address the above issues, we propose a new point-based framework, namely PointINS, to segment instances from single points. The core module of our framework is instance-aware convolution, including the instance-agnostic feature and instance-aware weights. The instance-agnostic feature for each Point-of-Interest (PoI) serves as a template for potential instance masks. In this way, instance-aware features are computed by convolving this template with instance-aware weights for the subsequent mask prediction. Given the independence of instance-aware convolution, PointINS is general and practical as a one-stage detector for anchor-based and anchor-free frameworks. In our extensive experiments, we show the effectiveness of our framework on RetinaNet and FCOS. With a ResNet101 backbone, PointINS achieves 38.3 mask mAP on the challenging COCO dataset, outperforming its competitors by a large margin. The code will be made publicly available.
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06148v1
PDF	https://arxiv.org/pdf/2003.06148v1.pdf
PWC	https://paperswithcode.com/paper/pointins-point-based-instance-segmentation
Repo
Framework

Pruning CNN’s with linear filter ensembles


Title	Pruning CNN’s with linear filter ensembles
Authors	Csanád Sándor, Szabolcs Pável, Lehel Csató
Abstract	Despite the promising results of convolutional neural networks (CNNs), their application on devices with limited resources is still a big challenge; this is mainly due to the huge memory and computation requirements of the CNN. To counter the limitation imposed by the network size, we use pruning to reduce the network size and – implicitly – the number of floating point operations (FLOPs). Contrary to the filter norm method – used in “conventional” network pruning – based on the assumption that a smaller norm implies “less importance” to its associated component, we develop a novel filter importance norm that is based on the change in the empirical loss caused by the presence or removal of a component from the network architecture. Since there are too many individual possibilities for filter configuration, we repeatedly sample from these architectural components and measure the system performance in the respective state of components being active or disabled. The result is a collection of filter ensembles – filter masks – and associated performance values. We rank the filters based on a linear and additive model and remove the least important ones such that the drop in network accuracy is minimal. We evaluate our method on a fully connected network, as well as on the ResNet architecture trained on the CIFAR-10 dataset. Using our pruning method, we managed to remove 60% of the parameters and 64% of the FLOPs from the ResNet with an accuracy drop of less than 0.6%.
Tasks	Network Pruning
Published	2020-01-22
URL	https://arxiv.org/abs/2001.08142v2
PDF	https://arxiv.org/pdf/2001.08142v2.pdf
PWC	https://paperswithcode.com/paper/pruning-cnns-with-linear-filter-ensembles
Repo
Framework
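
The sampling-and-ranking procedure described in the abstract above can be sketched as follows. This is a toy reconstruction under assumptions, not the authors' implementation: random binary masks stand in for the filter ensembles, a made-up additive `toy_loss` replaces a real network evaluation, and an ordinary least-squares fit serves as the linear additive model. The names `rank_filters`, `toy_importance`, and `toy_loss` are hypothetical.

```python
import numpy as np

def rank_filters(masks, losses):
    """Fit loss ~ masks @ w + b and rank filters from least to most important."""
    masks = np.asarray(masks, dtype=float)
    X = np.hstack([masks, np.ones((masks.shape[0], 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(X, np.asarray(losses, dtype=float), rcond=None)
    importance = -coef[:-1]   # an active filter that lowers the loss is important
    return np.argsort(importance), importance

# Toy stand-in for a network: disabling filter i adds toy_importance[i] to the loss.
toy_importance = np.array([0.0, 0.5, 0.1, 2.0, 1.0])

def toy_loss(mask):
    return float(np.sum(toy_importance * (1 - mask)))

rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(200, 5))    # 1 = filter active, 0 = disabled
losses = [toy_loss(m) for m in masks]
order, importance = rank_filters(masks, losses)
print(order)        # filters listed from least to most important
```

Pruning would then remove filters from the front of `order` until the accuracy budget is reached; with a real network, each loss value would come from evaluating the masked model on held-out data.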