October 17, 2019

Paper Group ANR 713

Convergence of Cubic Regularization for Nonconvex Optimization under KL Property


Title	Convergence of Cubic Regularization for Nonconvex Optimization under KL Property
Authors	Yi Zhou, Zhe Wang, Yingbin Liang
Abstract	Cubic-regularized Newton’s method (CR) is a popular algorithm that is guaranteed to produce a second-order stationary solution for nonconvex optimization problems. However, the existing understanding of the convergence rate of CR is conditioned on special types of geometrical properties of the objective function. In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Lojasiewicz (KL) property of nonconvex objective functions. Specifically, we characterize the asymptotic convergence rate of various types of optimality measures for CR, including the function value gap, variable distance gap, gradient norm, and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KL property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KL property.
Tasks
Published	2018-08-22
URL	http://arxiv.org/abs/1808.07382v1
PDF	http://arxiv.org/pdf/1808.07382v1.pdf
PWC	https://paperswithcode.com/paper/convergence-of-cubic-regularization-for
Repo
Framework
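
For orientation, here is a minimal one-dimensional sketch of the cubic-regularized Newton step that the abstract refers to. It is not the authors' code: the function names, the test function f(x) = x^4/4 - x^2/2, and the constant M = 12 are all assumptions chosen for illustration. Each step minimizes the model g*s + h*s^2/2 + (M/6)*|s|^3, which has a closed form in one dimension.

```python
import math


def cubic_step_1d(g, h, M):
    """Global minimizer of g*s + 0.5*h*s**2 + (M/6)*|s|**3 over s (M > 0)."""
    if g == 0.0 and h >= 0.0:
        return 0.0  # already second-order stationary for this model
    # Step length r >= 0 solves (M/2)*r**2 + h*r - |g| = 0.
    r = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    # Move against the gradient; if g == 0 and h < 0, either direction
    # decreases the model, so +1 is picked arbitrarily.
    direction = -math.copysign(1.0, g) if g != 0.0 else 1.0
    return direction * r


def cubic_newton_1d(grad, hess, x0, M, iters=100, tol=1e-12):
    """Iterate cubic-regularized steps until gradient ~ 0 and curvature >= 0."""
    x = x0
    for _ in range(iters):
        g, h = grad(x), hess(x)
        if abs(g) < tol and h >= 0.0:
            break
        x += cubic_step_1d(g, h, M)
    return x


# Hypothetical example: f(x) = x**4/4 - x**2/2 has a local maximum at x = 0.
# Starting exactly there, plain gradient descent would not move, while the
# cubic step uses the negative curvature to leave the point.
x_star = cubic_newton_1d(lambda x: x**3 - x, lambda x: 3 * x**2 - 1, x0=0.0, M=12.0)
```

In higher dimensions the subproblem has no such closed form and is usually solved iteratively; the sketch only illustrates why the method can reach second-order stationary points.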

Recycled ADMM: Improve Privacy and Accuracy with Less Computation in Distributed Algorithms


Title	Recycled ADMM: Improve Privacy and Accuracy with Less Computation in Distributed Algorithms
Authors	Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu
Abstract	The alternating direction method of multipliers (ADMM) is a powerful method for solving decentralized convex optimization problems. In distributed settings, each node performs computation with its local data, and the local results are exchanged among neighboring nodes in an iterative fashion. During this iterative process, leakage of private data arises and can accumulate significantly over many iterations, making it difficult to balance the privacy-utility tradeoff. In this study we propose Recycled ADMM (R-ADMM), where a linear approximation is applied to every even iteration, with its solution calculated directly using only results from the previous, odd iteration. It turns out that under such a scheme, half of the updates incur no privacy loss and require much less computation compared to conventional ADMM. We obtain a sufficient condition for the convergence of R-ADMM and provide a privacy analysis based on objective perturbation.
Tasks
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03197v1
PDF	http://arxiv.org/pdf/1810.03197v1.pdf
PWC	https://paperswithcode.com/paper/recycled-admm-improve-privacy-and-accuracy
Repo
Framework

Estimator of Prediction Error Based on Approximate Message Passing for Penalized Linear Regression


Title	Estimator of Prediction Error Based on Approximate Message Passing for Penalized Linear Regression
Authors	Ayaka Sakata
Abstract	We propose an estimator of prediction error using an approximate message passing (AMP) algorithm that can be applied to a broad range of sparse penalties. Following Stein’s lemma, the estimator of the generalized degrees of freedom, which is a key quantity for the construction of the estimator of the prediction error, is calculated at the AMP fixed point. The resulting form of the AMP-based estimator does not depend on the penalty function, and its value can be further improved by considering the correlation between predictors. The proposed estimator is asymptotically unbiased when the components of the predictors and response variables are independently generated according to a Gaussian distribution. We examine the behaviour of the estimator for real data under nonconvex sparse penalties, where Akaike’s information criterion does not correspond to an unbiased estimator of the prediction error. The model selected by the proposed estimator is close to that which minimizes the true prediction error.
Tasks
Published	2018-02-20
URL	http://arxiv.org/abs/1802.06939v2
PDF	http://arxiv.org/pdf/1802.06939v2.pdf
PWC	https://paperswithcode.com/paper/estimator-of-prediction-error-based-on
Repo
Framework

3D Convolution on RGB-D Point Clouds for Accurate Model-free Object Pose Estimation


Title	3D Convolution on RGB-D Point Clouds for Accurate Model-free Object Pose Estimation
Authors	Zhongang Cai, Cunjun Yu, Quang-Cuong Pham
Abstract	The conventional pose estimation of a 3D object usually requires the knowledge of the 3D model of the object. Even with the recent development in convolutional neural networks (CNNs), a 3D model is often necessary in the final estimation. In this paper, we propose a two-stage pipeline that takes in raw colored point cloud data and estimates an object’s translation and rotation by running 3D convolutions on voxels. The pipeline is simple yet highly accurate: translation error is reduced to the voxel resolution (around 1 cm) and rotation error is around 5 degrees. The pipeline is also put to actual robotic grasping tests where it achieves above 90% success rate for test objects. Another innovation is that a motion capture system is used to automatically label the point cloud samples which makes it possible to rapidly collect a large amount of highly accurate real data for training the neural networks.
Tasks	Motion Capture, Pose Estimation, Robotic Grasping
Published	2018-12-29
URL	http://arxiv.org/abs/1812.11284v1
PDF	http://arxiv.org/pdf/1812.11284v1.pdf
PWC	https://paperswithcode.com/paper/3d-convolution-on-rgb-d-point-clouds-for
Repo
Framework

Neural Music Synthesis for Flexible Timbre Control


Title	Neural Music Synthesis for Flexible Timbre Control
Authors	Jong Wook Kim, Rachel Bittner, Aparna Kumar, Juan Pablo Bello
Abstract	The recent success of raw audio waveform synthesis models like WaveNet motivates a new approach for music synthesis, in which the entire process — creating audio samples from a score and instrument information — is modeled using generative neural networks. This paper describes a neural music synthesis model with flexible timbre controls, which consists of a recurrent neural network conditioned on a learned instrument embedding followed by a WaveNet vocoder. The learned embedding space successfully captures the diverse variations in timbres within a large dataset and enables timbre control and morphing by interpolating between instruments in the embedding space. The synthesis quality is evaluated both numerically and perceptually, and an interactive web demo is presented.
Tasks
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00223v1
PDF	http://arxiv.org/pdf/1811.00223v1.pdf
PWC	https://paperswithcode.com/paper/neural-music-synthesis-for-flexible-timbre
Repo
Framework

Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Network with Rotation Ensemble Module


Title	Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Network with Rotation Ensemble Module
Authors	Dongwon Park, Yonghyeok Seo, Se Young Chun
Abstract	Rotation invariance has been an important topic in computer vision tasks. Ideally, robot grasp detection should be rotation-invariant. However, rotation invariance in robotic grasp detection has only recently been studied using rotation anchor boxes, which are often time-consuming and unreliable for multiple objects. In this paper, we propose a rotation ensemble module (REM) for robotic grasp detection using convolutions that rotate network weights. Our proposed REM was able to outperform current state-of-the-art methods by achieving up to 99.2% (image-wise) and 98.6% (object-wise) accuracies on the Cornell dataset with real-time computation (50 frames per second). Our proposed method was also able to yield reliable grasps for multiple objects and up to a 93.8% success rate for the real-time robotic grasping task with a 4-axis robot arm for small novel objects, which was significantly higher than the baseline methods by 11-56%.
Tasks	Face Detection, Robotic Grasping
Published	2018-12-19
URL	https://arxiv.org/abs/1812.07762v3
PDF	https://arxiv.org/pdf/1812.07762v3.pdf
PWC	https://paperswithcode.com/paper/rotation-ensemble-module-for-detecting
Repo
Framework

Recent Advances in Convolutional Neural Network Acceleration


Title	Recent Advances in Convolutional Neural Network Acceleration
Authors	Qianru Zhang, Meng Zhang, Tinghuan Chen, Zhifei Sun, Yuzhe Ma, Bei Yu
Abstract	In recent years, convolutional neural networks (CNNs) have shown great performance in various fields such as image classification, pattern recognition, and multi-media compression. Two of their characteristic properties, local connectivity and weight sharing, can reduce the number of parameters and increase processing speed during training and inference. However, as the dimension of data becomes higher and the CNN architecture becomes more complicated, the end-to-end approach or the combined manner of CNN is computationally intensive, which becomes a limitation to CNN’s further implementation. Therefore, it is necessary and urgent to implement CNN in a faster way. In this paper, we first summarize the acceleration methods that contribute to, but are not limited to, CNNs by reviewing a broad variety of research papers. We propose a taxonomy in terms of three levels, i.e., structure level, algorithm level, and implementation level, for acceleration methods. We also analyze the acceleration methods in terms of CNN architecture compression, algorithm optimization, and hardware-based improvement. Finally, we discuss different perspectives on these acceleration and optimization methods within each level. The discussion shows that the methods in each level still have large exploration space. By incorporating such a wide range of disciplines, we expect to provide a comprehensive reference for researchers who are interested in CNN acceleration.
Tasks	Image Classification
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08596v1
PDF	http://arxiv.org/pdf/1807.08596v1.pdf
PWC	https://paperswithcode.com/paper/recent-advances-in-convolutional-neural
Repo
Framework

Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration


Title	Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration
Authors	Anirban Nag, Ali Shafiee, Rajeev Balasubramonian, Vivek Srikumar, Naveen Muralimanohar
Abstract	Many recent works have designed accelerators for Convolutional Neural Networks (CNNs). While digital accelerators have relied on near data processing, analog accelerators have further reduced data movement by performing in-situ computation. Recent works take advantage of highly parallel analog in-situ computation in memristor crossbars to accelerate the many vector-matrix multiplication operations in CNNs. However, these in-situ accelerators have two significant short-comings that we address in this work. First, the ADCs account for a large fraction of chip power and area. Second, these accelerators adopt a homogeneous design where every resource is provisioned for the worst case. By addressing both problems, the new architecture, Newton, moves closer to achieving optimal energy-per-neuron for crossbar accelerators. We introduce multiple new techniques that apply at different levels of the tile hierarchy. Two of the techniques leverage heterogeneity: one adapts ADC precision based on the requirements of every sub-computation (with zero impact on accuracy), and the other designs tiles customized for convolutions or classifiers. Two other techniques rely on divide-and-conquer numeric algorithms to reduce computations and ADC pressure. Finally, we place constraints on how a workload is mapped to tiles, thus helping reduce resource provisioning in tiles. For a wide range of CNN dataflows and structures, Newton achieves a 77% decrease in power, 51% improvement in energy efficiency, and 2.2x higher throughput/area, relative to the state-of-the-art ISAAC accelerator.
Tasks
Published	2018-03-10
URL	http://arxiv.org/abs/1803.06913v1
PDF	http://arxiv.org/pdf/1803.06913v1.pdf
PWC	https://paperswithcode.com/paper/newton-gravitating-towards-the-physical
Repo
Framework

Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling


Title	Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling
Authors	Shinichi Shirakawa, Yasushi Iwata, Youhei Akimoto
Abstract	Deep neural networks (DNNs) are powerful machine learning models and have succeeded in various artificial intelligence tasks. Although various architectures and modules for DNNs have been proposed, selecting and designing the appropriate network structure for a target problem is a challenging task. In this paper, we propose a method to simultaneously optimize the network structure and weight parameters during neural network training. We consider a probability distribution that generates network structures, and optimize the parameters of the distribution instead of directly optimizing the network structure. The proposed method can be applied to various network structure optimization problems under the same framework. We apply the proposed method to several structure optimization problems such as selection of layers, selection of unit types, and selection of connections using the MNIST, CIFAR-10, and CIFAR-100 datasets. The experimental results show that the proposed method can find appropriate and competitive network structures.
Tasks
Published	2018-01-23
URL	http://arxiv.org/abs/1801.07650v1
PDF	http://arxiv.org/pdf/1801.07650v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-optimization-of-neural-network
Repo
Framework

Quantitative Projection Coverage for Testing ML-enabled Autonomous Systems


Title	Quantitative Projection Coverage for Testing ML-enabled Autonomous Systems
Authors	Chih-Hong Cheng, Chung-Hao Huang, Hirotoshi Yasuoka
Abstract	Systematically testing models learned from neural networks remains a crucial unsolved barrier to successfully justifying safety for autonomous vehicles engineered using a data-driven approach. We propose quantitative k-projection coverage as a metric to mediate combinatorial explosion while guiding the data sampling process. By assuming that domain experts propose largely independent environment conditions and by associating elements in each condition with weights, the product of these conditions forms scenarios, and one may interpret weights associated with each equivalence class as relative importance. Achieving full k-projection coverage requires that the data set, when projected onto the hyperplane formed by arbitrarily selected k conditions, covers each class with a number of data points no less than the associated weight. For the general case where scenario composition is constrained by rules, precisely computing k-projection coverage remains in NP. In terms of finding minimum test cases to achieve full coverage, we present theoretical complexity for important sub-cases and an encoding to 0-1 integer programming. We have implemented a research prototype that generates test cases for a visual object detection unit in automated driving, demonstrating the technological feasibility of our proposed coverage criterion.
Tasks	Autonomous Vehicles
Published	2018-05-11
URL	http://arxiv.org/abs/1805.04333v1
PDF	http://arxiv.org/pdf/1805.04333v1.pdf
PWC	https://paperswithcode.com/paper/quantitative-projection-coverage-for-testing
Repo
Framework
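
The coverage criterion described in the abstract can be sketched as follows. This is an illustrative reading, not the authors' prototype: the function name, the data layout, and in particular two modelling choices are assumptions. First, the required count of a projected class is taken to be the product of its elements' weights. Second, the returned ratio credits each class with at most its required count. The paper may define both differently.

```python
from collections import Counter
from itertools import combinations, product
from math import prod


def k_projection_coverage(conditions, weights, data, k):
    """Fraction of required data points present over all k-condition projections.

    conditions: dict mapping condition name -> list of possible elements
    weights:    dict mapping condition name -> {element: positive int weight}
    data:       list of samples, each a dict mapping condition name -> element
    """
    covered = 0
    required_total = 0
    for subset in combinations(sorted(conditions), k):
        # Project every sample onto the selected k conditions and count classes.
        counts = Counter(tuple(sample[c] for c in subset) for sample in data)
        for cls in product(*(conditions[c] for c in subset)):
            # Assumption: required count = product of the element weights.
            required = prod(weights[c][v] for c, v in zip(subset, cls))
            required_total += required
            covered += min(counts[cls], required)
    return covered / required_total


# Hypothetical scenario description with two conditions.
conditions = {"weather": ["sunny", "rain"], "road": ["city", "highway"]}
weights = {"weather": {"sunny": 1, "rain": 2}, "road": {"city": 1, "highway": 1}}
data = [
    {"weather": "sunny", "road": "city"},
    {"weather": "rain", "road": "highway"},
]
coverage_k1 = k_projection_coverage(conditions, weights, data, k=1)
coverage_k2 = k_projection_coverage(conditions, weights, data, k=2)
```

Under these assumptions, full k-projection coverage corresponds to a return value of 1.0. The number of projections grows with the number of k-subsets of conditions rather than with the full product of all conditions, which is how projection mitigates the combinatorial explosion mentioned in the abstract.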

Crowd disagreement about medical images is informative


Title	Crowd disagreement about medical images is informative
Authors	Veronika Cheplygina, Josien P. W. Pluim
Abstract	Classifiers for medical image analysis are often trained with a single consensus label, based on combining labels given by experts or crowds. However, disagreement between annotators may be informative, and thus removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma or not, based on crowd annotations of visual characteristics of that lesion. We compare using the mean annotations, illustrating consensus, to standard deviations and other distribution moments, illustrating disagreement. We show that the mean annotations perform best, but that the disagreement measures are still informative. We also make the crowd annotations used in this paper available at https://figshare.com/s/5cbbce14647b66286544.
Tasks
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08174v2
PDF	http://arxiv.org/pdf/1806.08174v2.pdf
PWC	https://paperswithcode.com/paper/crowd-disagreement-about-medical-images-is
Repo
Framework
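
As a small illustration of the consensus and disagreement features mentioned in the abstract, the sketch below computes the mean and the standard deviation of crowd scores for one lesion and one visual characteristic. The function name and the example scores are hypothetical, and the population standard deviation is an assumed choice; the paper also considers other distribution moments that are not shown here.

```python
from statistics import mean, pstdev


def crowd_features(scores):
    """Summarize one lesion's crowd scores for a single visual characteristic.

    scores: list of numeric annotations, one per annotator.
    Returns the consensus (mean) and a disagreement measure (population std).
    """
    return {"mean": mean(scores), "std": pstdev(scores)}


# Two hypothetical lesions with the same consensus but different disagreement.
unanimous = crowd_features([1, 1, 1, 1])
split = crowd_features([0, 2, 0, 2])
```

Both lesions have the same mean score, so a classifier trained on the consensus alone cannot tell them apart, while the standard deviation separates them.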

On Lipschitz Bounds of General Convolutional Neural Networks


Title	On Lipschitz Bounds of General Convolutional Neural Networks
Authors	Dongmian Zou, Radu Balan, Maneesh Singh
Abstract	Many convolutional neural networks (CNNs) have a feed-forward structure. In this paper, a linear program that estimates the Lipschitz bound of such CNNs is proposed. Several CNNs, including the scattering networks, the AlexNet and the GoogleNet, are studied numerically and compared to the theoretical bounds. Next, concentration inequalities of the output distribution to a stationary random input signal expressed in terms of the Lipschitz bound are established. The Lipschitz bound is further used to establish a nonlinear discriminant analysis designed to measure the separation between features of different classes.
Tasks
Published	2018-08-04
URL	http://arxiv.org/abs/1808.01415v1
PDF	http://arxiv.org/pdf/1808.01415v1.pdf
PWC	https://paperswithcode.com/paper/on-lipschitz-bounds-of-general-convolutional
Repo
Framework

Decentralized Clustering on Compressed Data without Prior Knowledge of the Number of Clusters


Title	Decentralized Clustering on Compressed Data without Prior Knowledge of the Number of Clusters
Authors	Elsa Dupraz, Dominique Pastor, François-Xavier Socheleau
Abstract	In sensor networks, it is not always practical to set up a fusion center. Therefore, there is a need for fully decentralized clustering algorithms. Decentralized clustering algorithms should minimize the amount of data exchanged between sensors in order to reduce sensor energy consumption. In this respect, we propose one centralized and one decentralized clustering algorithm that work on compressed data without prior knowledge of the number of clusters. In the standard K-means clustering algorithm, the number of clusters is estimated by repeating the algorithm several times, which dramatically increases the amount of exchanged data, while our algorithm can estimate this number in one run. The proposed clustering algorithms derive from a theoretical framework establishing that, under asymptotic conditions, the cluster centroids are the only fixed points of a cost function we introduce. This cost function depends on a weight function which we choose as the p-value of a Wald hypothesis test. This p-value measures the plausibility that a given measurement vector belongs to a given cluster. Experimental results show that our two algorithms are competitive in terms of clustering performance with respect to K-means and DB-Scan, while lowering the amount of data exchanged between sensors by a factor of at least 2.
Tasks
Published	2018-07-12
URL	http://arxiv.org/abs/1807.04566v1
PDF	http://arxiv.org/pdf/1807.04566v1.pdf
PWC	https://paperswithcode.com/paper/decentralized-clustering-on-compressed-data
Repo
Framework

CaosDB - Research Data Management for Complex, Changing, and Automated Research Workflows


Title	CaosDB - Research Data Management for Complex, Changing, and Automated Research Workflows
Authors	Timm Fitschen, Alexander Schlemmer, Daniel Hornung, Henrik tom Wörden, Ulrich Parlitz, Stefan Luther
Abstract	Here we present CaosDB, a Research Data Management System (RDMS) designed to ensure seamless integration of inhomogeneous data sources and repositories of legacy data. Its primary purpose is the management of data from biomedical sciences, both from simulations and experiments during the complete research data lifecycle. An RDMS for this domain faces particular challenges: Research data arise in huge amounts, from a wide variety of sources, and traverse a highly branched path of further processing. To be accepted by its users, an RDMS must be built around workflows of the scientists and practices and thus support changes in workflow and data structure. Nevertheless it should encourage and support the development and observation of standards and furthermore facilitate the automation of data acquisition and processing with specialized software. The storage data model of an RDMS must reflect these complexities with appropriate semantics and ontologies while offering simple methods for finding, retrieving, and understanding relevant data. We show how CaosDB responds to these challenges and give an overview of the CaosDB Server, its data model and its easy-to-learn CaosDB Query Language. We briefly discuss the status of the implementation, how we currently use CaosDB, and how we plan to use and extend it.
Tasks
Published	2018-01-23
URL	http://arxiv.org/abs/1801.07653v2
PDF	http://arxiv.org/pdf/1801.07653v2.pdf
PWC	https://paperswithcode.com/paper/caosdb-research-data-management-for-complex
Repo
Framework

Path of Vowel Raising in Chengdu Dialect of Mandarin


Title	Path of Vowel Raising in Chengdu Dialect of Mandarin
Authors	Hai Hu, Yiwen Zhang
Abstract	He and Rao (2013) reported a raising phenomenon of /a/ in /Xan/ (X being a consonant or a vowel) in the Chengdu dialect of Mandarin, i.e. /a/ is realized as [epsilon] for young speakers but [ae] for older speakers, but they offered no acoustic analysis. We designed an acoustic study that examined the realization of /Xan/ in speakers of different age (old vs. young) and gender (male vs. female) groups, where X represents three conditions: 1) unaspirated consonants: C ([p], [t], [k]), 2) aspirated consonants: Ch ([ph], [th], [kh]), and 3) high vowels: V ([i], [y], [u]). 17 native speakers were asked to read /Xan/ characters, and the F1 values were extracted for comparison. Our results confirmed the raising effect in He and Rao (2013), i.e., young speakers realize /a/ as [epsilon] in /an/, whereas older speakers for the most part realize it as [ae]. Also, female speakers raise more than male speakers within the same age group. Interestingly, within the /Van/ condition, older speakers do raise /a/ in /ian/ and /yan/. We interpret this as follows: /a/ first assimilates to its preceding front high vowels /i/ and /y/ for older speakers, and this raising then becomes phonologized in younger speakers in all conditions, including /Chan/ and /Can/. This shows a possible trajectory of the ongoing sound change in the Chengdu dialect.
Tasks
Published	2018-03-11
URL	http://arxiv.org/abs/1803.03887v1
PDF	http://arxiv.org/pdf/1803.03887v1.pdf
PWC	https://paperswithcode.com/paper/path-of-vowel-raising-in-chengdu-dialect-of
Repo
Framework