January 26, 2020

Paper Group ANR 1490

On the Convergence of Perturbed Distributed Asynchronous Stochastic Gradient Descent to Second Order Stationary Points in Non-convex Optimization. Learned In Speech Recognition: Contextual Acoustic Word Embeddings. Automatic brain tissue segmentation in fetal MRI using convolutional neural networks. Enforcing constraints for time series prediction …

On the Convergence of Perturbed Distributed Asynchronous Stochastic Gradient Descent to Second Order Stationary Points in Non-convex Optimization


Title	On the Convergence of Perturbed Distributed Asynchronous Stochastic Gradient Descent to Second Order Stationary Points in Non-convex Optimization
Authors	Lifu Wang, Bo Shen, Ning Zhao
Abstract	In this paper, the second order convergence of non-convex optimization in the asynchronous stochastic gradient descent (ASGD) algorithm is studied systematically. We investigate the behavior of ASGD near and away from saddle points. In contrast to general stochastic gradient descent (SGD), we show that ASGD might return even after it has escaped a saddle point, yet after staying near a strict saddle point for a long enough time ($O(T)$), ASGD will finally move away from it. An inequality is given to describe the process by which ASGD escapes saddle points. Using a novel Razumikhin-Lyapunov method, we show the exponential instability of the perturbed gradient dynamics near strict saddle points and give a more detailed estimate of how the time delay parameter $T$ influences the speed of escape. In particular, we consider the optimization of smooth nonconvex functions, and propose a perturbed asynchronous stochastic gradient descent algorithm with a guarantee of convergence to second order stationary points with high probability in $O(1/\epsilon^4)$ iterations. To the best of our knowledge, this is the first work on the second order convergence of asynchronous algorithms.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06000v3
PDF	https://arxiv.org/pdf/1910.06000v3.pdf
PWC	https://paperswithcode.com/paper/on-the-convergence-of-perturbed-distributed
Repo
Framework
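The abstract above describes gradient steps that use delayed (stale) gradients plus a random perturbation near a strict saddle. The following is a toy sketch of that idea only, not the paper's algorithm: the test function f(x, y) = x^2 - y^2, the function name `perturbed_delayed_gd`, and all parameter values (`lr`, `delay`, `noise`, `steps`) are assumptions chosen for illustration.

```python
import random
from collections import deque


def grad(point):
    """Gradient of the toy function f(x, y) = x^2 - y^2 (strict saddle at the origin)."""
    x, y = point
    return (2.0 * x, -2.0 * y)


def perturbed_delayed_gd(start, lr=0.1, delay=2, noise=1e-3, steps=200, seed=0):
    """Gradient descent that uses a gradient `delay` steps old and adds Gaussian noise.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # history[0] is the iterate from `delay` steps ago, history[-1] the current one.
    history = deque([start] * (delay + 1), maxlen=delay + 1)
    point = start
    for _ in range(steps):
        gx, gy = grad(history[0])  # stale gradient, as in an asynchronous update
        point = (
            point[0] - lr * gx + rng.gauss(0.0, noise),
            point[1] - lr * gy + rng.gauss(0.0, noise),
        )
        history.append(point)
    return point


x, y = perturbed_delayed_gd((0.0, 0.0))
print(abs(x), abs(y))
```

Started exactly at the saddle, the iterate stays small along the stable x direction while the perturbation lets it drift away along the unstable y direction, which is the qualitative behavior the abstract refers to as escaping a strict saddle point.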

Learned In Speech Recognition: Contextual Acoustic Word Embeddings


Title	Learned In Speech Recognition: Contextual Acoustic Word Embeddings
Authors	Shruti Palaskar, Vikas Raunak, Florian Metze
Abstract	End-to-end acoustic-to-word speech recognition models have recently gained popularity because they are easy to train, scale well to large amounts of training data, and do not require a lexicon. In addition, word models may also be easier to integrate with downstream tasks such as spoken language understanding, because inference (search) is much simplified compared to phoneme, character or any other sort of sub-word units. In this paper, we describe methods to construct contextual acoustic word embeddings directly from a supervised sequence-to-sequence acoustic-to-word speech recognition model using the learned attention distribution. On a suite of 16 standard sentence evaluation tasks, our embeddings show competitive performance against a word2vec model trained on the speech transcriptions. In addition, we evaluate these embeddings on a spoken language understanding task, and observe that our embeddings match the performance of text-based embeddings in a pipeline of first performing speech recognition and then constructing word embeddings from transcriptions.
Tasks	Speech Recognition, Spoken Language Understanding, Word Embeddings
Published	2019-02-18
URL	http://arxiv.org/abs/1902.06833v1
PDF	http://arxiv.org/pdf/1902.06833v1.pdf
PWC	https://paperswithcode.com/paper/learned-in-speech-recognition-contextual
Repo
Framework

Automatic brain tissue segmentation in fetal MRI using convolutional neural networks


Title	Automatic brain tissue segmentation in fetal MRI using convolutional neural networks
Authors	N. Khalili, N. Lessmann, E. Turk, N. Claessens, R. de Heus, T. Kolk, M. A. Viergever, M. J. N. L. Benders, I. Isgum
Abstract	MR images of fetuses allow clinicians to detect brain abnormalities in an early stage of development. The cornerstone of volumetric and morphologic analysis in fetal MRI is segmentation of the fetal brain into different tissue classes. Manual segmentation is cumbersome and time consuming, hence automatic segmentation could substantially simplify the procedure. However, automatic brain tissue segmentation in these scans is challenging owing to artifacts including intensity inhomogeneity, caused in particular by spontaneous fetal movements during the scan. Unlike methods that estimate the bias field to remove intensity inhomogeneity as a preprocessing step to segmentation, we propose to perform segmentation using a convolutional neural network that exploits images with synthetically introduced intensity inhomogeneity as data augmentation. The method first uses a CNN to extract the intracranial volume. Thereafter, another CNN with the same architecture is employed to segment the extracted volume into seven brain tissue classes: cerebellum, basal ganglia and thalami, ventricular cerebrospinal fluid, white matter, brain stem, cortical gray matter and extracerebral cerebrospinal fluid. To make the method applicable to slices showing intensity inhomogeneity artifacts, the training data was augmented by applying a combination of linear gradients with random offsets and orientations to image slices without artifacts.
Tasks	Data Augmentation
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04713v1
PDF	https://arxiv.org/pdf/1906.04713v1.pdf
PWC	https://paperswithcode.com/paper/automatic-brain-tissue-segmentation-in-fetal
Repo
Framework
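The augmentation described in the abstract above (linear gradients with random offsets and orientations applied to artifact-free slices) could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the abstract does not say whether the gradient is applied multiplicatively or additively, so a multiplicative ramp is assumed here, and the names `add_linear_gradient`, `augment`, `max_strength`, and `n_gradients` are hypothetical.

```python
import math
import random


def add_linear_gradient(image, rng, max_strength=0.5):
    """Multiply a 2D slice (list of rows) by a linear intensity ramp.

    The ramp has a random orientation and a random offset. The multiplicative
    form and the `max_strength` parameter are assumptions for illustration.
    """
    height, width = len(image), len(image[0])
    angle = rng.uniform(0.0, 2.0 * math.pi)    # random orientation
    offset = rng.uniform(-0.5, 0.5)            # random offset along the ramp
    strength = rng.uniform(0.0, max_strength)  # how strong the inhomogeneity is
    dx, dy = math.cos(angle), math.sin(angle)
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            # Pixel coordinates normalised to [-0.5, 0.5].
            x = c / max(width - 1, 1) - 0.5
            y = r / max(height - 1, 1) - 0.5
            ramp = x * dx + y * dy + offset
            row.append(image[r][c] * (1.0 + strength * ramp))
        out.append(row)
    return out


def augment(image, rng, n_gradients=2):
    """Apply a combination of several random linear gradients to one slice."""
    for _ in range(n_gradients):
        image = add_linear_gradient(image, rng)
    return image


rng = random.Random(0)
flat_slice = [[1.0] * 4 for _ in range(3)]
augmented = augment(flat_slice, rng)
```

Each call produces a differently shaded copy of the same slice, so a network trained on such copies sees intensity inhomogeneity without any bias field estimation step.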

Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning


Title	Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning
Authors	Panos Stinis
Abstract	We assume that we are given a time series of data from a dynamical system and our task is to learn the flow map of the dynamical system. We present a collection of results on how to enforce constraints coming from the dynamical system in order to accelerate the training of deep neural networks to represent the flow map of the system as well as increase their predictive ability. In particular, we provide ways to enforce constraints during training for all three major modes of learning, namely supervised, unsupervised and reinforcement learning. In general, the dynamic constraints need to include terms which are analogous to memory terms in model reduction formalisms. Such memory terms act as a restoring force which corrects the errors committed by the learned flow map during prediction. For supervised learning, the constraints are added to the objective function. For the case of unsupervised learning, in particular generative adversarial networks, the constraints are introduced by augmenting the input of the discriminator. Finally, for the case of reinforcement learning and in particular actor-critic methods, the constraints are added to the reward function. In addition, for the reinforcement learning case, we present a novel approach based on homotopy of the action-value function in order to stabilize and accelerate training. We use numerical results for the Lorenz system to illustrate the various constructions.
Tasks	Time Series, Time Series Prediction
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07501v1
PDF	https://arxiv.org/pdf/1905.07501v1.pdf
PWC	https://paperswithcode.com/paper/enforcing-constraints-for-time-series
Repo
Framework

PLUME: Polyhedral Learning Using Mixture of Experts


Title	PLUME: Polyhedral Learning Using Mixture of Experts
Authors	Kulin Shah, P. S. Sastry, Naresh Manwani
Abstract	In this paper, we propose a novel mixture-of-experts architecture for learning polyhedral classifiers. We learn the parameters of the classifier using an expectation maximization algorithm. We derive the generalization bounds of the proposed approach. Through an extensive simulation study, we show that the proposed method performs comparably to other state-of-the-art approaches.
Tasks
Published	2019-04-22
URL	http://arxiv.org/abs/1904.09948v1
PDF	http://arxiv.org/pdf/1904.09948v1.pdf
PWC	https://paperswithcode.com/paper/plume-polyhedral-learning-using-mixture-of
Repo
Framework

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering


Title	SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering
Authors	Zhenpeng Chen, Yanbin Cao, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu
Abstract	Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers’ emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks, and the misunderstanding of technical jargon has been shown to be the main reason. Researchers therefore have to use labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, the scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address this problem, we turn to the easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. Together with the labeled data, they are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach achieves significant improvement on representative benchmark datasets. Through further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue domain-specific resources, but to try to transform knowledge from the open domain through ubiquitous signals such as emojis.
Tasks	Representation Learning, Sentiment Analysis
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02202v1
PDF	https://arxiv.org/pdf/1907.02202v1.pdf
PWC	https://paperswithcode.com/paper/sentimoji-an-emoji-powered-learning-approach
Repo
Framework

Artist and style exposure bias in collaborative filtering based music recommendations


Title	Artist and style exposure bias in collaborative filtering based music recommendations
Authors	Andres Ferraro, Dmitry Bogdanov, Xavier Serra, Jason Yoon
Abstract	Algorithms have an increasing influence on the music that we consume, and understanding their behavior is fundamental to making sure they give fair exposure to all artists across different styles. In this on-going work we contribute to this research direction by analyzing the impact of collaborative filtering recommendations from the perspective of the artist and music style exposure given by the system. We first analyze the distribution of the recommendations considering the exposure of different styles or genres and compare it to the users’ listening behavior. This comparison suggests that the system is reinforcing the popularity of the items. Then, we simulate the effect of the system in the long term with a feedback loop. From this simulation we can see how the system gives less opportunity to the majority of artists, concentrating the users on fewer items. The results of our analysis demonstrate the need for a better evaluation methodology for current music recommendation algorithms, one not limited to user-focused relevance metrics.
Tasks
Published	2019-11-12
URL	https://arxiv.org/abs/1911.04827v1
PDF	https://arxiv.org/pdf/1911.04827v1.pdf
PWC	https://paperswithcode.com/paper/artist-and-style-exposure-bias-in
Repo
Framework
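The feedback-loop effect mentioned in the abstract above can be illustrated with a deliberately crude simulation. Note the substitution: instead of collaborative filtering, this sketch uses a plain "recommend the currently most-played items" rule, and the names `simulate_feedback_loop`, `top_share`, and all numbers are hypothetical. It only shows how feeding recommendations back into play counts can concentrate exposure, not the authors' experimental setup.

```python
def simulate_feedback_loop(play_counts, top_k, rounds, plays_per_round):
    """Repeatedly recommend the top_k most-played items and add their plays back.

    A popularity-based stand-in for a recommender, assumed for illustration.
    """
    counts = list(play_counts)
    for _ in range(rounds):
        ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
        for item in ranked[:top_k]:
            counts[item] += plays_per_round  # recommended items get consumed
    return counts


def top_share(counts, top_k):
    """Fraction of all plays that go to the top_k most-played items."""
    return sum(sorted(counts, reverse=True)[:top_k]) / sum(counts)


initial = [5, 4, 3, 2, 1]
final = simulate_feedback_loop(initial, top_k=2, rounds=3, plays_per_round=10)
print(top_share(initial, 2), top_share(final, 2))
```

In this toy run the two initially most popular items absorb every recommended play, so their share of total plays rises from round to round while the remaining items never gain exposure.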

Solve fraud detection problem by using graph based learning methods


Title	Solve fraud detection problem by using graph based learning methods
Authors	Loc Tran, Tuan Tran, Linh Tran, An Mai
Abstract	Detecting fraudulent credit card transactions is an important problem in machine learning. Detecting these transactions helps reduce the significant losses of credit card holders and banks. To detect them, data scientists normally employ unsupervised and supervised learning techniques. In this paper, we employ graph p-Laplacian based semi-supervised learning methods combined with undersampling techniques such as Cluster Centroids to solve the credit card fraud detection problem. Experimental results show that the graph p-Laplacian semi-supervised learning methods outperform the current state-of-the-art graph Laplacian based semi-supervised learning method (p=2).
Tasks	Fraud Detection
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11708v1
PDF	https://arxiv.org/pdf/1908.11708v1.pdf
PWC	https://paperswithcode.com/paper/solve-fraud-detection-problem-by-using-graph
Repo
Framework

Automated Discovery of Business Process Simulation Models from Event Logs


Title	Automated Discovery of Business Process Simulation Models from Event Logs
Authors	Manuel Camargo, Marlon Dumas, Oscar González-Rojas
Abstract	Business process simulation is a versatile technique to estimate the performance of a process under multiple scenarios. This, in turn, allows analysts to compare alternative options to improve a business process. A common roadblock for business process simulation is that constructing accurate simulation models is cumbersome and error-prone. Modern information systems store detailed execution logs of the business processes they support. Previous work has shown that these logs can be used to discover simulation models. However, existing methods for log-based discovery of simulation models do not seek to optimize the accuracy of the resulting models. Instead they leave it to the user to manually tune the simulation model to achieve the desired level of accuracy. This article presents an accuracy-optimized method to discover business process simulation models from execution logs. The method decomposes the problem into a series of steps with associated configuration parameters. A hyper-parameter optimization method is used to search through the space of possible configurations so as to maximize the similarity between the behavior of the simulation model and the behavior observed in the log. The method has been implemented as a tool and evaluated using logs from different domains.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05404v3
PDF	https://arxiv.org/pdf/1910.05404v3.pdf
PWC	https://paperswithcode.com/paper/automated-discovery-of-business-process
Repo
Framework

Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking


Title	Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking
Authors	Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan
Abstract	In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and face model adaptation on-the-fly, in unconstrained scenarios with heavy occlusions and arbitrary facial expression variations. Specifically, we introduce a statistical 3D morphable model that flexibly describes the distribution of points on the surface of the face model, with an efficient switchable online adaptation that gradually captures the identity of the tracked subject and rapidly constructs a suitable face model when the subject changes. Moreover, unlike prior art that employed ICP-based facial pose estimation, to improve robustness to occlusions, we propose a ray visibility constraint that regularizes the pose based on the face model’s visibility with respect to the input point cloud. Ablation studies and experimental results on Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective and outperforms competing state-of-the-art depth-based methods.
Tasks	Pose Estimation, Pose Tracking
Published	2019-05-06
URL	https://arxiv.org/abs/1905.02114v1
PDF	https://arxiv.org/pdf/1905.02114v1.pdf
PWC	https://paperswithcode.com/paper/visibility-constrained-generative-model-for
Repo
Framework

Cascaded Detail-Preserving Networks for Super-Resolution of Document Images


Title	Cascaded Detail-Preserving Networks for Super-Resolution of Document Images
Authors	Zhichao Fu, Yu Kong, Yingbin Zheng, Hao Ye, Wenxin Hu, Jing Yang, Liang He
Abstract	The accuracy of OCR is usually affected by the quality of the input document image, and different kinds of marred document images hamper the OCR results. Among these scenarios, the low-resolution image is a common and challenging case. In this paper, we propose cascaded networks for document image super-resolution. Our model is composed of Detail-Preserving Networks with small magnification. The loss function with perceptual terms is designed to simultaneously preserve the original patterns and enhance the edges of the characters. These networks are trained with the same architecture and different parameters and then assembled into a pipeline model with a larger magnification. The low-resolution images are upscaled gradually by passing through each Detail-Preserving Network until the final high-resolution images are obtained. Through extensive experiments on two scanned document image datasets, we demonstrate that the proposed approach outperforms recent state-of-the-art image super-resolution methods, and that combining it with a standard OCR system leads to significant improvements in the recognition results.
Tasks	Image Super-Resolution, Optical Character Recognition, Super-Resolution
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10714v1
PDF	https://arxiv.org/pdf/1911.10714v1.pdf
PWC	https://paperswithcode.com/paper/cascaded-detail-preserving-networks-for-super
Repo
Framework

Similarity Grouping-Guided Neural Network Modeling for Maritime Time Series Prediction


Title	Similarity Grouping-Guided Neural Network Modeling for Maritime Time Series Prediction
Authors	Yan Li, Ryan Wen Liu, Zhao Liu, Jingxian Liu
Abstract	Reliable and accurate prediction of time series plays a crucial role in the maritime industry, in areas such as economic investment, transportation planning, and port planning and design. The dynamic growth of maritime time series is predominantly complex, nonlinear and non-stationary. To guarantee high-quality prediction performance, we propose to first adopt the empirical mode decomposition (EMD) and ensemble EMD (EEMD) methods to decompose the original time series into high- and low-frequency components. The low-frequency components can be easily predicted directly through traditional neural network (NN) methods. It is more difficult to predict high-frequency components because of their weak mathematical regularity. To take advantage of the inherent self-similarities within high-frequency components, these components are divided into several small continuous (overlapping) segments. The grouped segments with high similarities are then selected to form more suitable training datasets for traditional NN methods. This regrouping strategy can help enhance the prediction accuracy of high-frequency components. The final prediction result is obtained by integrating the predicted high- and low-frequency components. Our proposed three-step prediction framework benefits from the time series decomposition and the grouping of similar segments. Experiments on both port cargo throughput and vessel traffic flow illustrate its superior performance in terms of prediction accuracy and robustness.
Tasks	Time Series, Time Series Prediction
Published	2019-05-13
URL	https://arxiv.org/abs/1905.04872v1
PDF	https://arxiv.org/pdf/1905.04872v1.pdf
PWC	https://paperswithcode.com/paper/similarity-grouping-guided-neural-network
Repo
Framework
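The segment-grouping step described in the abstract above (cutting a high-frequency component into overlapping segments and keeping the most similar ones as training data) might look roughly like this. It is a minimal sketch: the abstract does not name a similarity measure, so Euclidean distance is assumed, and the function names and parameters (`length`, `step`, `k`) are hypothetical.

```python
import math


def overlapping_segments(series, length, step):
    """Cut a series into fixed-length windows that overlap when step < length."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, step)]


def euclidean(a, b):
    """Euclidean distance between two equal-length segments (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(segments, query, k):
    """Return the k segments closest to the query segment."""
    return sorted(segments, key=lambda s: euclidean(s, query))[:k]


component = [1, 2, 3, 4, 5]
segments = overlapping_segments(component, length=3, step=1)
training_group = most_similar(segments, query=[3, 4, 5], k=2)
print(training_group)
```

The selected group would then serve as the training set for a predictor of that component, in place of the full, less homogeneous collection of segments.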

Quantitative Weak Convergence for Discrete Stochastic Processes


Title	Quantitative Weak Convergence for Discrete Stochastic Processes
Authors	Xiang Cheng, Peter L. Bartlett, Michael I. Jordan
Abstract	In this paper, we establish quantitative convergence in $W_2$ for a family of Langevin-like stochastic processes that includes stochastic gradient descent and related gradient-based algorithms. Under certain regularity assumptions, we show that the iterates of these stochastic processes converge to an invariant distribution at a rate of $\tilde{O}\left(1/\sqrt{k}\right)$ where $k$ is the number of steps; this rate is provably tight up to log factors. Our result reduces to a quantitative form of the classical Central Limit Theorem in the special case when the potential is quadratic.
Tasks
Published	2019-02-03
URL	https://arxiv.org/abs/1902.00832v2
PDF	https://arxiv.org/pdf/1902.00832v2.pdf
PWC	https://paperswithcode.com/paper/quantitative-central-limit-theorems-for
Repo
Framework

Road is Enough! Extrinsic Calibration of Non-overlapping Stereo Camera and LiDAR using Road Information


Title	Road is Enough! Extrinsic Calibration of Non-overlapping Stereo Camera and LiDAR using Road Information
Authors	Jinyong Jeong, Lucas Y. Cho, Ayoung Kim
Abstract	This paper presents a framework for the targetless extrinsic calibration of stereo cameras and Light Detection and Ranging (LiDAR) sensors with a non-overlapping Field of View (FOV). In order to solve the extrinsic calibration problem under such a challenging configuration, the proposed solution exploits road markings as static and robust features among the various dynamic objects that are present in urban environments. First, this study utilizes road markings that are commonly captured by the two sensor modalities to select informative images for estimating the extrinsic parameters. In order to accomplish stable optimization, multiple cost functions are defined, including Normalized Information Distance (NID), edge alignment, and plane fitting costs. As a result, a smooth cost curve is formed for global optimization, preventing convergence to a local optimum. We further evaluate each cost function by examining parameter sensitivity near the optimal point. Another key characteristic of extrinsic calibration, repeatability, is analyzed by running the proposed method multiple times with randomly perturbed initial points.
Tasks	Calibration
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10586v2
PDF	http://arxiv.org/pdf/1902.10586v2.pdf
PWC	https://paperswithcode.com/paper/road-is-enough-extrinsic-calibration-of-non
Repo
Framework

DomainSiam: Domain-Aware Siamese Network for Visual Object Tracking


Title	DomainSiam: Domain-Aware Siamese Network for Visual Object Tracking
Authors	Mohamed H. Abdelpakey, Mohamed S. Shehata
Abstract	Visual object tracking is a fundamental task in the field of computer vision. Siamese trackers have achieved state-of-the-art performance on recent benchmarks. However, Siamese trackers do not fully utilize semantic and objectness information from pre-trained networks that have been trained on the image classification task. Furthermore, the pre-trained Siamese architecture is sparsely activated by the category label, which leads to unnecessary calculations and overfitting. In this paper, we propose to learn a Domain-Aware Siamese network that fully utilizes semantic and objectness information while remaining class-agnostic, using a ridge regression network. Moreover, to reduce the sparsity problem, we solve the ridge regression problem with a differentiable weighted-dynamic loss function. Our tracker, dubbed DomainSiam, improves the feature learning in the training phase and generalization capability to other domains. Extensive experiments are performed on five tracking benchmarks including OTB2013 and OTB2015 for a validation set; as well as the VOT2017, VOT2018, LaSOT, TrackingNet, and GOT10k for a testing set. DomainSiam achieves state-of-the-art performance on these benchmarks while running at 53 FPS.
Tasks	Image Classification, Object Tracking, Visual Object Tracking
Published	2019-08-21
URL	https://arxiv.org/abs/1908.07905v1
PDF	https://arxiv.org/pdf/1908.07905v1.pdf
PWC	https://paperswithcode.com/paper/190807905
Repo
Framework