July 27, 2019

Paper Group ANR 511

Language Independent Single Document Image Super-Resolution using CNN for improved recognition. Fine-Gray competing risks model with high-dimensional covariates: estimation and Inference. Sample complexity of population recovery. Bayesian Belief Updating of Spatiotemporal Seizure Dynamics. A General Model for Robust Tensor Factorization with Unknow …

Language Independent Single Document Image Super-Resolution using CNN for improved recognition


Title	Language Independent Single Document Image Super-Resolution using CNN for improved recognition
Authors	Ram Krishna Pandey, A G Ramakrishnan
Abstract	Recognition of document images has important applications in restoring old and classical texts. The problem involves quality improvement before passing the image to a properly trained OCR to get accurate recognition of the text. Image enhancement and quality improvement constitute important steps, as subsequent recognition depends upon the quality of the input image. There are scenarios when high resolution images are not available, and our experiments show that the OCR accuracy reduces significantly with decrease in the spatial resolution of document images. Thus the only option is to improve the resolution of such document images. The goal is to construct a high resolution image, given a single low resolution binary image, which constitutes the problem of single image super-resolution. Most of the previous work in super-resolution deals with natural images, which have more information content than document images. Here, we use a Convolutional Neural Network to learn the mapping between low and the corresponding high resolution images. We experiment with different numbers of layers, parameter settings and non-linear functions to build a fast end-to-end framework for document image super-resolution. Our proposed model shows a PSNR improvement of about 4 dB on 75 dpi Tamil images, resulting in a 3% improvement of word level accuracy by the OCR. It takes less time than the recent sparse based natural image super-resolution technique, making it useful for real-time document recognition applications.
Tasks	Image Enhancement, Image Super-Resolution, Optical Character Recognition, Super-Resolution
Published	2017-01-30
URL	http://arxiv.org/abs/1701.08835v1
PDF	http://arxiv.org/pdf/1701.08835v1.pdf
PWC	https://paperswithcode.com/paper/language-independent-single-document-image
Repo
Framework
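
The abstract reports its gain in PSNR (peak signal-to-noise ratio). As a reminder of what that figure measures, here is a minimal sketch of the standard PSNR formula for 8-bit grayscale images. The function name `psnr` and the nested-list image representation are assumptions made for illustration; this is not the authors' evaluation code.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")

    # Accumulate the squared pixel error over the whole image.
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 4 dB corresponds to reducing the MSE by a factor of 10^0.4, roughly 2.5.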

Fine-Gray competing risks model with high-dimensional covariates: estimation and Inference


Title	Fine-Gray competing risks model with high-dimensional covariates: estimation and Inference
Authors	Jue Hou, Jelena Bradic, Ronghui Xu
Abstract	The purpose of this paper is to construct confidence intervals for the regression coefficients in the Fine-Gray model for competing risks data with random censoring, where the number of covariates can be larger than the sample size. Despite strong motivation from biomedical applications, a high-dimensional Fine-Gray model has attracted relatively little attention among the methodological or theoretical literature. We fill in this gap by developing confidence intervals based on a one-step bias-correction for a regularized estimation. We develop a theoretical framework for the partial likelihood, which does not have independent and identically distributed entries and therefore presents many technical challenges. We also study the approximation error from the weighting scheme under random censoring for competing risks and establish new concentration results for time-dependent processes. In addition to the theoretical results and algorithms, we present extensive numerical experiments and an application to a study of non-cancer mortality among prostate cancer patients using the linked Medicare-SEER data.
Tasks
Published	2017-07-29
URL	http://arxiv.org/abs/1707.09561v2
PDF	http://arxiv.org/pdf/1707.09561v2.pdf
PWC	https://paperswithcode.com/paper/fine-gray-competing-risks-model-with-high
Repo
Framework

Sample complexity of population recovery


Title	Sample complexity of population recovery
Authors	Yury Polyanskiy, Ananda Theertha Suresh, Yihong Wu
Abstract	The problem of population recovery refers to estimating a distribution based on incomplete or corrupted samples. Consider a random poll of sample size $n$ conducted on a population of individuals, where each pollee is asked to answer $d$ binary questions. We consider one of the two polling impediments: (a) in lossy population recovery, a pollee may skip each question with probability $\epsilon$, (b) in noisy population recovery, a pollee may lie on each question with probability $\epsilon$. Given $n$ lossy or noisy samples, the goal is to estimate the probabilities of all $2^d$ binary vectors simultaneously within accuracy $\delta$ with high probability. This paper settles the sample complexity of population recovery. For the lossy model, the optimal sample complexity is $\tilde\Theta(\delta^{-2\max\{\frac{\epsilon}{1-\epsilon},1\}})$, improving the state of the art by Moitra and Saks in several ways: a lower bound is established, the upper bound is improved and the result depends at most on the logarithm of the dimension. Surprisingly, the sample complexity undergoes a phase transition from parametric to nonparametric rate when $\epsilon$ exceeds $1/2$. For noisy population recovery, the sharp sample complexity turns out to be more sensitive to dimension and scales as $\exp(\Theta(d^{1/3} \log^{2/3}(1/\delta)))$ except for the trivial cases of $\epsilon=0,1/2$ or $1$. For both models, our estimators simply compute the empirical mean of a certain function, which is found by pre-solving a linear program (LP). Curiously, the dual LP can be understood as Le Cam’s method for lower-bounding the minimax risk, thus establishing the statistical optimality of the proposed estimators. The value of the LP is determined by complex-analytic methods.
Tasks
Published	2017-02-18
URL	http://arxiv.org/abs/1702.05574v2
PDF	http://arxiv.org/pdf/1702.05574v2.pdf
PWC	https://paperswithcode.com/paper/sample-complexity-of-population-recovery
Repo
Framework
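
The phase transition mentioned in the abstract can be read directly off the exponent of the lossy-model rate. The sketch below only evaluates that exponent, $2\max\{\epsilon/(1-\epsilon), 1\}$, as a function of the skip probability; the function name `lossy_exponent` is an assumption for illustration, and logarithmic factors hidden by the tilde notation are ignored.

```python
def lossy_exponent(eps):
    """Exponent a such that the lossy sample complexity scales as delta**(-a).

    Taken from the rate quoted in the abstract, up to logarithmic factors.
    """
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return 2.0 * max(eps / (1.0 - eps), 1.0)


if __name__ == "__main__":
    for eps in (0.1, 0.5, 0.75, 0.9):
        print(f"eps = {eps}: exponent = {lossy_exponent(eps):.2f}")
```

For any skip probability up to 1/2 the exponent stays at 2, the parametric rate; beyond 1/2 it grows without bound as the skip probability approaches 1.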

Bayesian Belief Updating of Spatiotemporal Seizure Dynamics


Title	Bayesian Belief Updating of Spatiotemporal Seizure Dynamics
Authors	Gerald K Cooray, Richard Rosch, Torsten Baldeweg, Louis Lemieux, Karl Friston, Biswa Sengupta
Abstract	Epileptic seizure activity shows complicated dynamics in both space and time. To understand the evolution and propagation of seizures, spatially extended sets of data need to be analysed. We have previously described an efficient filtering scheme using variational Laplace that can be used in the Dynamic Causal Modelling (DCM) framework [Friston, 2003] to estimate the temporal dynamics of seizures recorded using either invasive or non-invasive electrical recordings (EEG/ECoG). Spatiotemporal dynamics are modelled using a partial differential equation – in contrast to the ordinary differential equation used in our previous work on temporal estimation of seizure dynamics [Cooray, 2016]. We provide the requisite theoretical background for the method and test the ensuing scheme on simulated seizure activity data and empirical invasive ECoG data. The method provides a framework to assimilate the spatial and temporal dynamics of seizure activity, an aspect of great physiological and clinical importance.
Tasks	EEG
Published	2017-05-20
URL	http://arxiv.org/abs/1705.07278v2
PDF	http://arxiv.org/pdf/1705.07278v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-belief-updating-of-spatiotemporal
Repo
Framework

A General Model for Robust Tensor Factorization with Unknown Noise


Title	A General Model for Robust Tensor Factorization with Unknown Noise
Authors	Xi’ai Chen, Zhi Han, Yao Wang, Qian Zhao, Deyu Meng, Lin Lin, Yandong Tang
Abstract	Because of the limitations of matrix factorization, such as losing spatial structure information, the concept of low-rank tensor factorization (LRTF) has been applied for the recovery of a low dimensional subspace from high dimensional visual data. The low-rank tensor recovery is generally achieved by minimizing the loss function between the observed data and the factorization representation. The loss function is designed in various forms under different noise distribution assumptions, like the $L_1$ norm for a Laplacian distribution and the $L_2$ norm for a Gaussian distribution. However, these often fail on real data corrupted by noise with an unknown distribution. In this paper, we propose a generalized weighted low-rank tensor factorization method (GWLRTF) integrated with the idea of noise modelling. This procedure treats the target data as a high-order tensor directly and models the noise by a Mixture of Gaussians, which is called MoG GWLRTF. The parameters in the model are estimated under the EM framework and through a newly developed algorithm of weighted low-rank tensor factorization. We provide two versions of the algorithm with different tensor factorization operations, i.e., CP factorization and Tucker factorization. Extensive experiments indicate the respective advantages of these two versions in different applications and also demonstrate the effectiveness of MoG GWLRTF compared with other competing methods.
Tasks
Published	2017-05-18
URL	http://arxiv.org/abs/1705.06755v1
PDF	http://arxiv.org/pdf/1705.06755v1.pdf
PWC	https://paperswithcode.com/paper/a-general-model-for-robust-tensor
Repo
Framework

The 2D Tree Sliding Window Discrete Fourier Transform


Title	The 2D Tree Sliding Window Discrete Fourier Transform
Authors	Lee F. Richardson, William F. Eddy
Abstract	We present a new algorithm for the 2D Sliding Window Discrete Fourier Transform (SWDFT). Our algorithm avoids repeating calculations in overlapping windows by storing them in a tree data-structure based on the ideas of the Cooley-Tukey Fast Fourier Transform (FFT). For an $N_0 \times N_1$ array and $n_0 \times n_1$ windows, our algorithm takes $O(N_0 N_1 n_0 n_1)$ operations. We provide a C implementation of our algorithm for the Radix-2 case, compare ours with existing algorithms, and show how our algorithm easily extends to higher dimensions.
Tasks
Published	2017-07-25
URL	http://arxiv.org/abs/1707.08213v2
PDF	http://arxiv.org/pdf/1707.08213v2.pdf
PWC	https://paperswithcode.com/paper/the-2d-tree-sliding-window-discrete-fourier
Repo
Framework
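
To make concrete what a 2D SWDFT computes, here is a naive reference sketch that recomputes a direct DFT from scratch in every window position. It is explicitly not the tree algorithm from the paper: it repeats exactly the overlapping work that algorithm avoids, and is useful only as a correctness baseline on small inputs. The function names `dft2` and `naive_swdft2` are assumptions for illustration.

```python
import cmath


def dft2(window):
    """Direct 2D DFT of a small window given as a list of rows."""
    n0, n1 = len(window), len(window[0])
    out = [[0j] * n1 for _ in range(n0)]
    for k0 in range(n0):
        for k1 in range(n1):
            total = 0j
            for j0 in range(n0):
                for j1 in range(n1):
                    angle = -2.0 * cmath.pi * (k0 * j0 / n0 + k1 * j1 / n1)
                    total += window[j0][j1] * cmath.exp(1j * angle)
            out[k0][k1] = total
    return out


def naive_swdft2(array, n0, n1):
    """Map each window position (p0, p1) to the DFT of its n0 x n1 window."""
    big_n0, big_n1 = len(array), len(array[0])
    result = {}
    for p0 in range(big_n0 - n0 + 1):
        for p1 in range(big_n1 - n1 + 1):
            # Slice out the window whose top-left corner is (p0, p1).
            window = [row[p1:p1 + n1] for row in array[p0:p0 + n0]]
            result[(p0, p1)] = dft2(window)
    return result
```

With a direct DFT, each of the $(N_0 - n_0 + 1)(N_1 - n_1 + 1)$ windows costs on the order of $(n_0 n_1)^2$ operations, which is why reusing work across overlapping windows matters.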

Recent Advances in Zero-shot Recognition


Title	Recent Advances in Zero-shot Recognition
Authors	Yanwei Fu, Tao Xiang, Yu-Gang Jiang, Xiangyang Xue, Leonid Sigal, Shaogang Gong
Abstract	With the recent renaissance of deep convolution neural networks, encouraging breakthroughs have been achieved on the supervised recognition tasks, where each class has sufficient and fully annotated training data. However, scaling the recognition to a large number of classes with few or no training samples for each class remains an unsolved problem. One approach to scaling up the recognition is to develop models capable of recognizing unseen categories without any training instances, or zero-shot recognition/learning. This article provides a comprehensive review of existing zero-shot recognition techniques covering various aspects ranging from representations of models, and from datasets and evaluation settings. We also overview related recognition tasks including one-shot and open set recognition, which can be used as natural extensions of zero-shot recognition when a limited number of class samples becomes available or when zero-shot recognition is implemented in a real-world setting. Importantly, we highlight the limitations of existing approaches and point out future research directions in this existing new research area.
Tasks	Open Set Learning, Zero-Shot Learning
Published	2017-10-13
URL	http://arxiv.org/abs/1710.04837v1
PDF	http://arxiv.org/pdf/1710.04837v1.pdf
PWC	https://paperswithcode.com/paper/recent-advances-in-zero-shot-recognition
Repo
Framework

Reinforcement Mechanism Design for e-commerce


Title	Reinforcement Mechanism Design for e-commerce
Authors	Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang
Abstract	We study the problem of allocating impressions to sellers in e-commerce websites, such as Amazon, eBay or Taobao, aiming to maximize the total revenue generated by the platform. We employ a general framework of reinforcement mechanism design, which uses deep reinforcement learning to design efficient algorithms, taking the strategic behaviour of the sellers into account. Specifically, we model the impression allocation problem as a Markov decision process, where the states encode the history of impressions, prices, transactions and generated revenue and the actions are the possible impression allocations in each round. To tackle the problem of continuity and high-dimensionality of states and actions, we adopt the ideas of the DDPG algorithm to design an actor-critic policy gradient algorithm which takes advantage of the problem domain in order to achieve convergence and stability. We evaluate our proposed algorithm, coined IA(GRU), by comparing it against DDPG, as well as several natural heuristics, under different rationality models for the sellers - we assume that sellers follow well-known no-regret type strategies which may vary in their degree of sophistication. We find that IA(GRU) outperforms all algorithms in terms of the total revenue.
Tasks
Published	2017-08-25
URL	http://arxiv.org/abs/1708.07607v3
PDF	http://arxiv.org/pdf/1708.07607v3.pdf
PWC	https://paperswithcode.com/paper/reinforcement-mechanism-design-for-e-commerce
Repo
Framework

Real-Time Illegal Parking Detection System Based on Deep Learning


Title	Real-Time Illegal Parking Detection System Based on Deep Learning
Authors	Xuemei Xie, Chenye Wang, Shu Chen, Guangming Shi, Zhifu Zhao
Abstract	Illegal parking has become an increasingly serious problem. Current methods for detecting illegally parked vehicles are based on background segmentation. However, this approach is not robust and is sensitive to the environment. Building on deep learning, this paper proposes a novel illegal vehicle parking detection system. Illegal vehicles captured by camera are first located and classified by the Single Shot MultiBox Detector (SSD) algorithm. To improve the performance, we propose to optimize SSD by adjusting the aspect ratios of the default boxes to better fit our dataset. After that, tracking and movement analysis are used to identify illegally parked vehicles in the region of interest (ROI). Experiments show that the system can achieve 99% accuracy and real-time (25 FPS) detection with strong robustness in complex environments.
Tasks
Published	2017-10-05
URL	http://arxiv.org/abs/1710.02546v1
PDF	http://arxiv.org/pdf/1710.02546v1.pdf
PWC	https://paperswithcode.com/paper/real-time-illegal-parking-detection-system
Repo
Framework
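
The abstract mentions tuning the aspect ratios of SSD's default boxes. In the original SSD formulation, a default box at scale s with aspect ratio a has width s·sqrt(a) and height s/sqrt(a), so changing a reshapes the box while keeping its area fixed. The sketch below illustrates only that relation; the function name `default_box_sizes` and the example ratios are assumptions, since the abstract does not state which values the authors chose.

```python
import math


def default_box_sizes(scale, aspect_ratios):
    """Return (width, height) pairs, relative to image size, for one scale."""
    boxes = []
    for ratio in aspect_ratios:
        if ratio <= 0:
            raise ValueError("aspect ratios must be positive")
        root = math.sqrt(ratio)
        # Width grows and height shrinks with the ratio; the area stays scale**2.
        boxes.append((scale * root, scale / root))
    return boxes


if __name__ == "__main__":
    # Hypothetical ratios favouring wide boxes, e.g. for side views of vehicles.
    for width, height in default_box_sizes(0.2, [1.0, 2.0, 4.0]):
        print(f"w = {width:.3f}, h = {height:.3f}")
```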

Machine-Translation History and Evolution: Survey for Arabic-English Translations


Title	Machine-Translation History and Evolution: Survey for Arabic-English Translations
Authors	Nabeel T. Alsohybe, Neama Abdulaziz Dahan, Fadl Mutaher Ba-Alwi
Abstract	As a result of rapid changes in information and communication technology (ICT), the world has become a small village where people from all over the world connect with each other in dialogue and communication via the Internet. Communication has also become a daily routine activity due to the new globalization, in which companies and even universities operate globally across country borders. As a result, translation has become a necessary activity in this connected world. ICT has made it possible for a student in one country to take a course or even a degree from a different country, anytime and anywhere. The resulting communication still needs a language as a means of helping the receiver understand the contents of the message sent. People need an automated translation application because human translators are not always easy to find, and human translation is very expensive compared to automated translation. Several studies describe the electronic process of Machine-Translation. In this paper, the authors study some of this previous research and explore some of the tools needed for Machine-Translation. This research contributes to the Machine-Translation area by giving future researchers a summary of the groups of Machine-Translation research and by shedding light on the importance of the translation mechanism.
Tasks	Machine Translation
Published	2017-09-14
URL	http://arxiv.org/abs/1709.04685v1
PDF	http://arxiv.org/pdf/1709.04685v1.pdf
PWC	https://paperswithcode.com/paper/machine-translation-history-and-evolution
Repo
Framework

Change Detection under Global Viewpoint Uncertainty


Title	Change Detection under Global Viewpoint Uncertainty
Authors	Murase Tomoya, Tanaka Kanji
Abstract	This paper addresses the problem of change detection from a novel perspective of long-term map learning. We are particularly interested in designing an approach that can scale to large maps and that can function under global uncertainty in the viewpoint (i.e., GPS-denied situations). Our approach, which utilizes a compact bag-of-words (BoW) scene model, makes several contributions to the problem: 1) Two kinds of prior information are extracted from the view sequence map and used for change detection. Further, we propose a novel type of prior, called motion prior, to predict the relative motions of stationary objects and to detect anomalous ego-motion. The proposed prior is also useful for distinguishing stationary from non-stationary objects. 2) A small set of good reference images (e.g., 10) is efficiently retrieved from the view sequence map by employing the recently developed Bag-of-Local-Convolutional-Features (BoLCF) scene model. 3) Change detection is reformulated as a scene retrieval over these reference images to find changed objects using a novel spatial Bag-of-Words (SBoW) scene model. Evaluations of the individual techniques and of their combinations on challenging, highly dynamic scenes from the publicly available Malaga dataset verify their efficacy.
Tasks	Motion Detection
Published	2017-03-01
URL	http://arxiv.org/abs/1703.00552v1
PDF	http://arxiv.org/pdf/1703.00552v1.pdf
PWC	https://paperswithcode.com/paper/change-detection-under-global-viewpoint
Repo
Framework

Psychological and Personality Profiles of Political Extremists


Title	Psychological and Personality Profiles of Political Extremists
Authors	Meysam Alizadeh, Ingmar Weber, Claudio Cioffi-Revilla, Santo Fortunato, Michael Macy
Abstract	Global recruitment into radical Islamic movements has spurred renewed interest in the appeal of political extremism. Is the appeal a rational response to material conditions or is it the expression of psychological and personality disorders associated with aggressive behavior, intolerance, conspiratorial imagination, and paranoia? Empirical answers using surveys have been limited by lack of access to extremist groups, while field studies have lacked psychological measures and failed to compare extremists with contrast groups. We revisit the debate over the appeal of extremism in the U.S. context by comparing publicly available Twitter messages written by over 355,000 political extremist followers with messages written by non-extremist U.S. users. Analysis of text-based psychological indicators supports the moral foundation theory which identifies emotion as a critical factor in determining political orientation of individuals. Extremist followers also differ from others in four of the Big Five personality traits.
Tasks
Published	2017-04-01
URL	http://arxiv.org/abs/1704.00119v1
PDF	http://arxiv.org/pdf/1704.00119v1.pdf
PWC	https://paperswithcode.com/paper/psychological-and-personality-profiles-of
Repo
Framework

Object Detection in Videos with Tubelet Proposal Networks


Title	Object Detection in Videos with Tubelet Proposal Networks
Authors	Kai Kang, Hongsheng Li, Tong Xiao, Wanli Ouyang, Junjie Yan, Xihui Liu, Xiaogang Wang
Abstract	Object detection in videos has drawn increasing attention recently with the introduction of the large-scale ImageNet VID dataset. Different from object detection in static images, temporal information in videos is vital for object detection. To fully utilize temporal information, state-of-the-art methods are based on spatiotemporal tubelets, which are essentially sequences of associated bounding boxes across time. However, the existing methods have major limitations in generating tubelets in terms of quality and efficiency. Motion-based methods are able to obtain dense tubelets efficiently, but the lengths are generally only several frames, which is not optimal for incorporating long-term temporal information. Appearance-based methods, usually involving generic object tracking, could generate long tubelets, but are usually computationally expensive. In this work, we propose a framework for object detection in videos, which consists of a novel tubelet proposal network to efficiently generate spatiotemporal proposals, and a Long Short-term Memory (LSTM) network that incorporates temporal information from tubelet proposals for achieving high object detection accuracy in videos. Experiments on the large-scale ImageNet VID dataset demonstrate the effectiveness of the proposed framework for object detection in videos.
Tasks	Object Detection, Object Tracking
Published	2017-02-21
URL	http://arxiv.org/abs/1702.06355v2
PDF	http://arxiv.org/pdf/1702.06355v2.pdf
PWC	https://paperswithcode.com/paper/object-detection-in-videos-with-tubelet
Repo
Framework

Graph Partitioning with Acyclicity Constraints


Title	Graph Partitioning with Acyclicity Constraints
Authors	Orlando Moreira, Merten Popp, Christian Schulz
Abstract	Graphs are widely used to model execution dependencies in applications. In particular, the NP-complete problem of partitioning a graph under constraints receives enormous attention from researchers because of its applicability in multiprocessor scheduling. We identified the additional constraint of acyclic dependencies between blocks when mapping computer vision and imaging applications to a heterogeneous embedded multiprocessor. Existing algorithms and heuristics do not address this requirement and deliver results that are not applicable to our use case. In this work, we show that this more constrained version of the graph partitioning problem is NP-complete and present heuristics that achieve a close approximation of the optimal solution found by an exhaustive search for small problem instances and much better scalability for larger instances. In addition, we can show a positive impact on the schedule of a real imaging application that improves communication volume and execution time.
Tasks	graph partitioning
Published	2017-04-03
URL	http://arxiv.org/abs/1704.00705v1
PDF	http://arxiv.org/pdf/1704.00705v1.pdf
PWC	https://paperswithcode.com/paper/graph-partitioning-with-acyclicity
Repo
Framework
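
The acyclicity constraint described in the abstract can be stated operationally: after assigning each node to a block, the graph whose vertices are blocks and whose edges are the inter-block dependencies must have no cycle. The sketch below checks that condition for a given assignment using Kahn's topological sort. It is only a validity check, not the partitioning heuristic from the paper, and the names `partition_is_acyclic` and `block_of` are assumptions for illustration.

```python
from collections import defaultdict, deque


def partition_is_acyclic(edges, block_of):
    """Return True if the block-level (quotient) graph has no cycle.

    edges    -- iterable of (u, v) dependency pairs between nodes
    block_of -- dict mapping every node to its block identifier
    """
    blocks = set(block_of.values())

    # Build the quotient graph: keep only edges that cross block boundaries.
    successors = defaultdict(set)
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:
            successors[bu].add(bv)

    indegree = {b: 0 for b in blocks}
    for targets in successors.values():
        for c in targets:
            indegree[c] += 1

    # Kahn's algorithm: repeatedly remove blocks with no incoming edges.
    queue = deque(b for b in blocks if indegree[b] == 0)
    visited = 0
    while queue:
        b = queue.popleft()
        visited += 1
        for c in successors.get(b, ()):
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)

    # Every block is removed exactly when the quotient graph is acyclic.
    return visited == len(blocks)
```

Note that an acyclic input graph does not guarantee an acyclic partition: with dependencies a→b, b→c and a→c, putting a and c in one block and b in another creates a two-block cycle.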

Object Referring in Visual Scene with Spoken Language


Title	Object Referring in Visual Scene with Spoken Language
Authors	Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool
Abstract	Object referring has important applications, especially for human-machine interaction. While having received great attention, the task is mainly attacked with written language (text) as input rather than spoken language (speech), which is more natural. This paper investigates Object Referring with Spoken Language (ORSpoken) by presenting two datasets and one novel approach. Objects are annotated with their locations in images, text descriptions and speech descriptions. This makes the datasets ideal for multi-modality learning. The approach is developed by carefully breaking the ORSpoken problem down into three sub-problems and introducing task-specific vision-language interactions at the corresponding levels. Experiments show that our method outperforms competing methods consistently and significantly. The approach is also evaluated in the presence of audio noise, showing the efficacy of the proposed vision-language interaction methods in counteracting background noise.
Tasks
Published	2017-11-10
URL	http://arxiv.org/abs/1711.03800v2
PDF	http://arxiv.org/pdf/1711.03800v2.pdf
PWC	https://paperswithcode.com/paper/object-referring-in-visual-scene-with-spoken
Repo
Framework