October 19, 2019

Paper Group ANR 367

Robust Structured Multi-task Multi-view Sparse Tracking


Title	Robust Structured Multi-task Multi-view Sparse Tracking
Authors	Mohammadreza Javanmardi, Xiaojun Qi
Abstract	Sparse representation is a viable solution to visual tracking. In this paper, we propose a structured multi-task multi-view tracking (SMTMVT) method, which exploits the sparse appearance model in the particle filter framework to track targets under different challenges. Specifically, we extract features of the target candidates from different views and sparsely represent them by a linear combination of templates of different views. Unlike conventional sparse trackers, SMTMVT not only jointly considers the relationship between different tasks and different views but also retains the structures among different views in a robust multi-task multi-view formulation. We introduce a numerical algorithm based on the proximal gradient method to quickly and effectively find the sparsity by dividing the optimization problem into two subproblems with closed-form solutions. Both qualitative and quantitative evaluations on the benchmark of challenging image sequences demonstrate the superior performance of the proposed tracker against various state-of-the-art trackers.
Tasks	Visual Tracking
Published	2018-06-06
URL	http://arxiv.org/abs/1806.01985v1
PDF	http://arxiv.org/pdf/1806.01985v1.pdf
PWC	https://paperswithcode.com/paper/robust-structured-multi-task-multi-view
Repo
Framework
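
The abstract mentions a proximal gradient method whose subproblems have closed-form solutions. As a rough illustration of that general idea only, the sketch below solves a plain single-task lasso problem, 0.5 * ||D c - y||^2 + lam * ||c||_1, where the closed-form step is element-wise soft-thresholding. This is not the SMTMVT formulation; the names D, y, lam and step are assumptions introduced here, and step is assumed to be no larger than the reciprocal of the largest eigenvalue of D^T D.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied element-wise (closed form)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def ista(D, y, lam, step, iters=200):
    """Proximal gradient (ISTA) for 0.5 * ||D c - y||^2 + lam * ||c||_1.

    D is a list of rows whose columns play the role of templates (assumption),
    y is the observed feature vector, and c holds the sparse coefficients.
    """
    n_rows, n_cols = len(D), len(D[0])
    c = [0.0] * n_cols
    for _ in range(iters):
        # Residual of the smooth term: r = D c - y
        r = [sum(D[i][j] * c[j] for j in range(n_cols)) - y[i]
             for i in range(n_rows)]
        # Gradient of the smooth term: D^T r
        grad = [sum(D[i][j] * r[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step, then the closed-form proximal step
        c = soft_threshold([c[j] - step * grad[j] for j in range(n_cols)],
                           step * lam)
    return c
```

With an identity dictionary the result reduces to soft-thresholding the observation itself, which makes a convenient sanity check for the update.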

Single Bitmap Block Truncation Coding of Color Images Using Hill Climbing Algorithm


Title	Single Bitmap Block Truncation Coding of Color Images Using Hill Climbing Algorithm
Authors	Lige Zhang, Xiaolin Qin, Qing Li, Haoyue Peng, Yu Hou
Abstract	Recently, the use of digital images in various fields has been increasing rapidly. To increase the number of images that can be stored and to transmit them faster, it is necessary to reduce the size of these images. Single bitmap block truncation coding (SBBTC) schemes are compression techniques that generate a common bitmap to quantize the R, G and B planes of a color image. As one of the traditional SBBTC schemes, the weighted plane (W-plane) method is known for its simplicity and low time consumption. However, the W-plane method performs poorly in visual quality. This paper proposes an improved SBBTC scheme based on the W-plane method using parallel computing and a hill climbing algorithm. In simulations, the proposed scheme outperforms the reference schemes in both visual quality and time consumption.
Tasks
Published	2018-07-13
URL	http://arxiv.org/abs/1807.04960v1
PDF	http://arxiv.org/pdf/1807.04960v1.pdf
PWC	https://paperswithcode.com/paper/single-bitmap-block-truncation-coding-of
Repo
Framework
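
To make the single-bitmap idea concrete, here is a minimal sketch of a W-plane style encoder for one block. It assumes the commonly described variant in which the weighted plane is the per-pixel average of R, G and B, the shared bitmap comes from thresholding that plane at its block mean, and each channel keeps two quantization levels. The function names and the block representation are hypothetical, and the paper's hill climbing and parallel computing refinements are not included.

```python
def wplane_encode(block):
    """Encode one block given as a list of (r, g, b) tuples.

    Returns the shared bitmap and, per channel, a (low, high) level pair.
    """
    # Weighted plane: simple average of the three channels (assumption)
    w = [(r + g + b) / 3.0 for r, g, b in block]
    threshold = sum(w) / len(w)
    bitmap = [1 if v >= threshold else 0 for v in w]

    levels = []
    for ch in range(3):
        hi = [p[ch] for p, bit in zip(block, bitmap) if bit == 1]
        lo = [p[ch] for p, bit in zip(block, bitmap) if bit == 0]
        hi_mean = sum(hi) / len(hi)  # never empty: some pixel is >= the mean
        lo_mean = sum(lo) / len(lo) if lo else hi_mean  # flat block fallback
        levels.append((round(lo_mean), round(hi_mean)))
    return bitmap, levels


def wplane_decode(bitmap, levels):
    """Rebuild the block: every channel reuses the same bitmap."""
    return [tuple(levels[ch][bit] for ch in range(3)) for bit in bitmap]
```

Only one bitmap is stored per block instead of one per color plane, which is where the extra saving over channel-wise block truncation coding comes from.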

Learning to track on-the-fly using a particle filter with annealed-weighted QPSO modeled after a singular Dirac delta potential


Title	Learning to track on-the-fly using a particle filter with annealed-weighted QPSO modeled after a singular Dirac delta potential
Authors	Saptarshi Sengupta, Richard Alan Peters II
Abstract	This paper proposes an evolutionary Particle Filter with a memory guided proposal step size update and an improved, fully-connected Quantum-behaved Particle Swarm Optimization (QPSO) resampling scheme for visual tracking applications. The proposal update step uses importance weights proportional to velocities encountered in recent memory to limit the swarm movement within probable regions of interest. The QPSO resampling scheme uses a fitness weighted mean best update to bias the swarm towards the fittest section of particles while also employing a simulated annealing operator to avoid subpar fine-tuning during the later iterations. By moving particles closer to high likelihood landscapes of the posterior distribution using such constructs, the sample impoverishment problem that plagues the Particle Filter is mitigated to a great extent. Experimental results using benchmark sequences indicate that the proposed method outperforms competitive candidate trackers such as the Particle Filter and the traditional Particle Swarm Optimization based Particle Filter on a suite of tracker performance indices.
Tasks	Visual Tracking
Published	2018-06-04
URL	http://arxiv.org/abs/1806.01396v1
PDF	http://arxiv.org/pdf/1806.01396v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-track-on-the-fly-using-a-particle
Repo
Framework

Analysis of Multilingual Sequence-to-Sequence speech recognition systems


Title	Analysis of Multilingual Sequence-to-Sequence speech recognition systems
Authors	Martin Karafiát, Murali Karthick Baskar, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, Jan “Honza” Černocký
Abstract	This paper investigates the applications of various multilingual approaches developed in conventional hidden Markov model (HMM) systems to sequence-to-sequence (seq2seq) automatic speech recognition (ASR). On a set composed of Babel data, we first show the effectiveness of multi-lingual training with stacked bottle-neck (SBN) features. Then we explore various architectures and training strategies of multi-lingual seq2seq models based on CTC-attention networks including combinations of output layer, CTC and/or attention component re-training. We also investigate the effectiveness of language-transfer learning in a very low resource scenario when the target language is not included in the original multi-lingual training data. Interestingly, we found multilingual features superior to multilingual models, and this finding suggests that we can efficiently combine the benefits of the HMM system with the seq2seq system through these multilingual feature techniques.
Tasks	Sequence-To-Sequence Speech Recognition, Speech Recognition, Transfer Learning
Published	2018-11-07
URL	http://arxiv.org/abs/1811.03451v1
PDF	http://arxiv.org/pdf/1811.03451v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-multilingual-sequence-to-sequence
Repo
Framework

Prediction of Facebook Post Metrics using Machine Learning


Title	Prediction of Facebook Post Metrics using Machine Learning
Authors	Emmanuel Sam, Sergey Yarushev, Sebastián Basterrech, Alexey Averkin
Abstract	In this short paper, we evaluate the performance of three well-known Machine Learning techniques for predicting the impact of a post on Facebook. Social media have a huge influence on social behaviour. Therefore, developing an automatic model for predicting the impact of posts on social media can be useful to society. In this article, we analyze the efficiency of three popular techniques for predicting post impact: Support Vector Regression (SVR), Echo State Network (ESN) and Adaptive Network-based Fuzzy Inference System (ANFIS). The evaluation was done over a public and well-known benchmark dataset.
Tasks
Published	2018-05-15
URL	http://arxiv.org/abs/1805.05579v1
PDF	http://arxiv.org/pdf/1805.05579v1.pdf
PWC	https://paperswithcode.com/paper/prediction-of-facebook-post-metrics-using
Repo
Framework

PAC-learning is Undecidable


Title	PAC-learning is Undecidable
Authors	Sairaam Venkatraman, S Balasubramanian, R Raghunatha Sarma
Abstract	The problem of attempting to learn the mapping between data and labels is the crux of any machine learning task. It is, therefore, of interest to the machine learning community on practical as well as theoretical counts to consider the existence of a test or criterion for deciding the feasibility of attempting to learn. We investigate the existence of such a criterion in the setting of PAC-learning, basing the feasibility solely on whether the mapping to be learnt lends itself to approximation by a given class of hypothesis functions. We show that no such criterion exists, exposing a fundamental limitation in the decidability of learning. In other words, we prove that testing for PAC-learnability is undecidable in the Turing sense. We also briefly discuss some of the probable implications of this result to the current practice of machine learning.
Tasks
Published	2018-08-20
URL	https://arxiv.org/abs/1808.06324v2
PDF	https://arxiv.org/pdf/1808.06324v2.pdf
PWC	https://paperswithcode.com/paper/pac-learning-is-undecidable
Repo
Framework

Non-Parametric Calibration of Probabilistic Regression


Title	Non-Parametric Calibration of Probabilistic Regression
Authors	Hao Song, Meelis Kull, Peter Flach
Abstract	The task of calibration is to retrospectively adjust the outputs from a machine learning model to provide better probability estimates on the target variable. While calibration has been investigated thoroughly in classification, it has not yet been well-established for regression tasks. This paper considers the problem of calibrating a probabilistic regression model to improve the estimated probability densities over the real-valued targets. We propose to calibrate a regression model through the cumulative probability density, which can be derived from calibrating a multi-class classifier. We provide three non-parametric approaches to solve the problem, two of which provide empirical estimates and the third providing smooth density estimates. The proposed approaches are experimentally evaluated to show their ability to improve the performance of regression models on the predictive likelihood.
Tasks	Calibration
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07690v1
PDF	http://arxiv.org/pdf/1806.07690v1.pdf
PWC	https://paperswithcode.com/paper/non-parametric-calibration-of-probabilistic
Repo
Framework

Un-normalized hypergraph p-Laplacian based semi-supervised learning methods


Title	Un-normalized hypergraph p-Laplacian based semi-supervised learning methods
Authors	Loc Hoang Tran, Linh Hoang Tran
Abstract	Most network-based machine learning methods assume that the labels of two adjacent samples in the network are likely to be the same. However, considering only the pairwise relationships between samples is incomplete: it misses the information carried by a group of samples that show very similar patterns and tend to have similar labels. A natural way to overcome this information loss is to represent the feature dataset of samples as a hypergraph. Thus, in this paper, we present un-normalized hypergraph p-Laplacian semi-supervised learning methods. These methods are applied to the zoo dataset and the tiny version of the 20 newsgroups dataset. Experimental results show that the accuracy of these un-normalized hypergraph p-Laplacian based semi-supervised learning methods is significantly greater than that of the un-normalized hypergraph Laplacian based semi-supervised learning method (the current state-of-the-art hypergraph Laplacian based semi-supervised learning method for classification, which corresponds to p=2).
Tasks
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02986v3
PDF	http://arxiv.org/pdf/1811.02986v3.pdf
PWC	https://paperswithcode.com/paper/un-normalized-hypergraph-p-laplacian-based
Repo
Framework

Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling


Title	Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling
Authors	Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiat, Shinji Watanabe, Takaaki Hori
Abstract	The sequence-to-sequence (seq2seq) approach for low-resource ASR is a relatively new direction in speech research. The approach has the benefit of training models without a lexicon or alignments. However, this poses a new problem of requiring more data compared to conventional DNN-HMM systems. In this work, we attempt to use data from 10 BABEL languages to build a multi-lingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach. We also explore different architectures for improving the prior multilingual seq2seq model. The paper also discusses the effect of integrating a recurrent neural network language model (RNNLM) with a seq2seq model during decoding. Experimental results show that the transfer learning approach from the multilingual model yields substantial gains over monolingual models across all 4 BABEL languages. Incorporating an RNNLM also brings significant improvements in terms of %WER, and achieves recognition performance comparable to models trained with twice as much training data.
Tasks	Language Modelling, Sequence-To-Sequence Speech Recognition, Speech Recognition, Transfer Learning
Published	2018-10-04
URL	http://arxiv.org/abs/1810.03459v1
PDF	http://arxiv.org/pdf/1810.03459v1.pdf
PWC	https://paperswithcode.com/paper/multilingual-sequence-to-sequence-speech
Repo
Framework

Contrast-Oriented Deep Neural Networks for Salient Object Detection


Title	Contrast-Oriented Deep Neural Networks for Salient Object Detection
Authors	Guanbin Li, Yizhou Yu
Abstract	Deep convolutional neural networks have become a key element in the recent breakthrough of salient object detection. However, existing CNN-based methods are based on either patch-wise (region-wise) training and inference or fully convolutional networks. Methods in the former category are generally time-consuming due to severe storage and computational redundancies among overlapping patches. To overcome this deficiency, methods in the second category attempt to directly map a raw input image to a predicted dense saliency map in a single network forward pass. Though being very efficient, it is arduous for these methods to detect salient objects of different scales or salient regions with weak semantic information. In this paper, we develop hybrid contrast-oriented deep neural networks to overcome the aforementioned limitations. Each of our deep networks is composed of two complementary components, including a fully convolutional stream for dense prediction and a segment-level spatial pooling stream for sparse saliency inference. We further propose an attentional module that learns weight maps for fusing the two saliency predictions from these two streams. A tailored alternate scheme is designed to train these deep networks by fine-tuning pre-trained baseline models. Finally, a customized fully connected CRF model incorporating a salient contour feature embedding can be optionally applied as a post-processing step to improve spatial coherence and contour positioning in the fused result from these two streams. Extensive experiments on six benchmark datasets demonstrate that our proposed model can significantly outperform the state of the art in terms of all popular evaluation metrics.
Tasks	Object Detection, Salient Object Detection
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11395v1
PDF	http://arxiv.org/pdf/1803.11395v1.pdf
PWC	https://paperswithcode.com/paper/contrast-oriented-deep-neural-networks-for
Repo
Framework

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground


Title	Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
Authors	Deng-Ping Fan, Ming-Ming Cheng, Jiang-Jiang Liu, Shang-Hua Gao, Qibin Hou, Ali Borji
Abstract	We provide a comprehensive evaluation of salient object detection (SOD) models. Our analysis identifies a serious design bias of existing SOD datasets which assumes that each image contains at least one clearly outstanding salient object in low clutter. The design bias has led to a saturated high performance for state-of-the-art SOD models when evaluated on existing datasets. The models, however, still perform far from being satisfactory when applied to real-world daily scenes. Based on our analyses, we first identify 7 crucial aspects that a comprehensive and balanced dataset should fulfill. Then, we propose a new high quality dataset and update the previous saliency benchmark. Specifically, our SOC (Salient Objects in Clutter) dataset includes images with salient and non-salient objects from daily object categories. Beyond object category annotations, each salient image is accompanied by attributes that reflect common challenges in real-world scenes. Finally, we report attribute-based performance assessment on our dataset.
Tasks	Object Detection, Salient Object Detection
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06091v2
PDF	http://arxiv.org/pdf/1803.06091v2.pdf
PWC	https://paperswithcode.com/paper/salient-objects-in-clutter-bringing-salient
Repo
Framework

Salient Object Detection by Lossless Feature Reflection


Title	Salient Object Detection by Lossless Feature Reflection
Authors	Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen
Abstract	Salient object detection, which aims to identify and locate the most salient pixels or regions in images, has been attracting more and more interest due to its various real-world applications. However, this vision task is quite challenging, especially under complex image scenes. Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection. Specifically, we design a symmetrical fully convolutional network (SFCN) to learn complementary saliency features under the guidance of lossless feature reflection. The location information of salient objects, together with contextual and semantic information, is jointly utilized to supervise the proposed network for more accurate saliency predictions. In addition, to overcome the blurry boundary problem, we propose a new structural loss function to learn clear object boundaries and spatially consistent saliency. The coarse prediction results are effectively refined by this structural information for performance improvements. Extensive experiments on seven saliency detection datasets demonstrate that our approach achieves consistently superior performance and outperforms the very recent state-of-the-art methods.
Tasks	Object Detection, Saliency Detection, Salient Object Detection
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06527v2
PDF	http://arxiv.org/pdf/1802.06527v2.pdf
PWC	https://paperswithcode.com/paper/salient-object-detection-by-lossless-feature
Repo
Framework

Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model


Title	Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model
Authors	Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract	A sequence-to-sequence model is a neural network module for mapping two sequences of different lengths. The sequence-to-sequence model has three core modules: encoder, decoder, and attention. Attention is the bridge that connects the encoder and decoder modules and improves model performance in many tasks. In this paper, we propose two ideas to improve sequence-to-sequence model performance by enhancing the attention module. First, we maintain the history of the location and the expected context from several previous time-steps. Second, we apply multiscale convolution from several previous attention vectors to the current decoder state. We utilized our proposed framework for sequence-to-sequence speech recognition and text-to-speech systems. The results reveal that our proposed extension could improve performance significantly compared to a standard attention baseline.
Tasks	Sequence-To-Sequence Speech Recognition, Speech Recognition
Published	2018-07-22
URL	http://arxiv.org/abs/1807.08280v1
PDF	http://arxiv.org/pdf/1807.08280v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-alignment-and-contextual-history
Repo
Framework

A Note on Inexact Condition for Cubic Regularized Newton’s Method


Title	A Note on Inexact Condition for Cubic Regularized Newton’s Method
Authors	Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan
Abstract	This note considers the inexact cubic-regularized Newton’s method (CR), which has been shown by Cartis et al. (2011a) to achieve the same order-level convergence rate to a second-order stationary point as the exact CR (Nesterov and Polyak, 2006). However, the inexactness condition of Cartis et al. (2011a) is not implementable because it depends on future iterates. This note fixes this issue by proving the same convergence rate for nonconvex optimization under an inexact adaptive condition that depends only on the current iterate. Our proof controls the sufficient decrease of the function value over the total iterations rather than at each iteration as in previous studies, which can be of independent interest in other contexts.
Tasks
Published	2018-08-22
URL	http://arxiv.org/abs/1808.07384v1
PDF	http://arxiv.org/pdf/1808.07384v1.pdf
PWC	https://paperswithcode.com/paper/a-note-on-inexact-condition-for-cubic
Repo
Framework
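
For orientation, the cubic-regularized Newton step that this abstract refers to is usually written as the subproblem below. The notation is an assumption introduced here rather than taken from the note: g_k and H_k stand for the gradient and Hessian of the objective at x_k (or approximations of them in the inexact variant), and M > 0 is the cubic regularization parameter.

```latex
s_k \in \arg\min_{s \in \mathbb{R}^d} \;
  g_k^{\top} s + \frac{1}{2}\, s^{\top} H_k\, s + \frac{M}{6}\, \|s\|^{3},
\qquad
x_{k+1} = x_k + s_k
```

Because s_k = x_{k+1} - x_k is only known once the subproblem has been solved, any accuracy requirement on H_k that is stated in terms of s_k refers to a future iterate, which is the implementability problem the abstract describes. The exact form of the adaptive condition proposed in the note is not given in the abstract and is not reproduced here.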

A Gaussian Process perspective on Convolutional Neural Networks


Title	A Gaussian Process perspective on Convolutional Neural Networks
Authors	Anastasia Borovykh
Abstract	In this paper we cast the well-known convolutional neural network in a Gaussian process perspective. In this way we hope to gain additional insights into the performance of convolutional networks, in particular understand under what circumstances they tend to perform well and what assumptions are implicitly made in the network. While for fully-connected networks the properties of convergence to Gaussian processes have been studied extensively, little is known about situations in which the output from a convolutional network approaches a multivariate normal distribution.
Tasks	Gaussian Processes
Published	2018-10-25
URL	http://arxiv.org/abs/1810.10798v2
PDF	http://arxiv.org/pdf/1810.10798v2.pdf
PWC	https://paperswithcode.com/paper/a-gaussian-process-perspective-on
Repo
Framework