Paper Group ANR 511
Language Independent Single Document Image Super-Resolution using CNN for improved recognition
Title | Language Independent Single Document Image Super-Resolution using CNN for improved recognition |
Authors | Ram Krishna Pandey, A G Ramakrishnan |
Abstract | Recognition of document images has important applications in restoring old and classical texts. The problem involves quality improvement before passing the image to a properly trained OCR to get accurate recognition of the text. Image enhancement and quality improvement are important steps because subsequent recognition depends on the quality of the input image. There are scenarios in which high-resolution images are not available, and our experiments show that OCR accuracy drops significantly as the spatial resolution of document images decreases. Thus the only option is to improve the resolution of such document images. The goal is to construct a high-resolution image given a single low-resolution binary image, which constitutes the problem of single image super-resolution. Most previous work in super-resolution deals with natural images, which have more information content than document images. Here, we use a convolutional neural network to learn the mapping between low-resolution images and the corresponding high-resolution images. We experiment with different numbers of layers, parameter settings and non-linear functions to build a fast end-to-end framework for document image super-resolution. Our proposed model shows a PSNR improvement of about 4 dB on 75 dpi Tamil images, resulting in a 3% improvement in word-level OCR accuracy. It takes less time than the recent sparse-based natural image super-resolution technique, making it useful for real-time document recognition applications. |
Tasks | Image Enhancement, Image Super-Resolution, Optical Character Recognition, Super-Resolution |
Published | 2017-01-30 |
URL | http://arxiv.org/abs/1701.08835v1 |
http://arxiv.org/pdf/1701.08835v1.pdf | |
PWC | https://paperswithcode.com/paper/language-independent-single-document-image |
Repo | |
Framework | |
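The abstract above reports its gain as PSNR in dB. As a reading aid, here is a minimal sketch of the standard PSNR definition, assuming 8-bit images stored as nested lists; the function name `psnr` and the `peak` parameter are illustrative and not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel values; `peak` is the largest possible
    pixel value (255 is an assumption that fits 8-bit images).
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of about 4 dB corresponds to reducing the mean squared error by a factor of roughly 10^0.4, or about 2.5.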
Fine-Gray competing risks model with high-dimensional covariates: estimation and Inference
Title | Fine-Gray competing risks model with high-dimensional covariates: estimation and Inference |
Authors | Jue Hou, Jelena Bradic, Ronghui Xu |
Abstract | The purpose of this paper is to construct confidence intervals for the regression coefficients in the Fine-Gray model for competing risks data with random censoring, where the number of covariates can be larger than the sample size. Despite strong motivation from biomedical applications, a high-dimensional Fine-Gray model has attracted relatively little attention among the methodological or theoretical literature. We fill in this gap by developing confidence intervals based on a one-step bias-correction for a regularized estimation. We develop a theoretical framework for the partial likelihood, which does not have independent and identically distributed entries and therefore presents many technical challenges. We also study the approximation error from the weighting scheme under random censoring for competing risks and establish new concentration results for time-dependent processes. In addition to the theoretical results and algorithms, we present extensive numerical experiments and an application to a study of non-cancer mortality among prostate cancer patients using the linked Medicare-SEER data. |
Tasks | |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09561v2 |
http://arxiv.org/pdf/1707.09561v2.pdf | |
PWC | https://paperswithcode.com/paper/fine-gray-competing-risks-model-with-high |
Repo | |
Framework | |
Sample complexity of population recovery
Title | Sample complexity of population recovery |
Authors | Yury Polyanskiy, Ananda Theertha Suresh, Yihong Wu |
Abstract | The problem of population recovery refers to estimating a distribution based on incomplete or corrupted samples. Consider a random poll of sample size $n$ conducted on a population of individuals, where each pollee is asked to answer $d$ binary questions. We consider one of the two polling impediments: (a) in lossy population recovery, a pollee may skip each question with probability $\epsilon$, (b) in noisy population recovery, a pollee may lie on each question with probability $\epsilon$. Given $n$ lossy or noisy samples, the goal is to estimate the probabilities of all $2^d$ binary vectors simultaneously within accuracy $\delta$ with high probability. This paper settles the sample complexity of population recovery. For the lossy model, the optimal sample complexity is $\tilde\Theta(\delta^{-2\max\{\frac{\epsilon}{1-\epsilon},1\}})$, improving the state of the art by Moitra and Saks in several ways: a lower bound is established, the upper bound is improved and the result depends at most on the logarithm of the dimension. Surprisingly, the sample complexity undergoes a phase transition from parametric to nonparametric rate when $\epsilon$ exceeds $1/2$. For noisy population recovery, the sharp sample complexity turns out to be more sensitive to dimension and scales as $\exp(\Theta(d^{1/3} \log^{2/3}(1/\delta)))$ except for the trivial cases of $\epsilon=0,1/2$ or $1$. For both models, our estimators simply compute the empirical mean of a certain function, which is found by pre-solving a linear program (LP). Curiously, the dual LP can be understood as Le Cam’s method for lower-bounding the minimax risk, thus establishing the statistical optimality of the proposed estimators. The value of the LP is determined by complex-analytic methods. |
Tasks | |
Published | 2017-02-18 |
URL | http://arxiv.org/abs/1702.05574v2 |
http://arxiv.org/pdf/1702.05574v2.pdf | |
PWC | https://paperswithcode.com/paper/sample-complexity-of-population-recovery |
Repo | |
Framework | |
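To make the two polling impediments and the lossy rate concrete, the following sketch simulates one lossy and one noisy response and evaluates the exponent of $1/\delta$ in the lossy bound. It is an illustration of the definitions in the abstract only, not the paper's LP-based estimator; all function names are hypothetical.

```python
import random


def lossy_sample(answers, eps, rng):
    """Lossy model: each answer is skipped (None) independently with probability eps."""
    return [None if rng.random() < eps else bit for bit in answers]


def noisy_sample(answers, eps, rng):
    """Noisy model: each binary answer is flipped independently with probability eps."""
    return [bit ^ 1 if rng.random() < eps else bit for bit in answers]


def lossy_rate_exponent(eps):
    """Exponent of 1/delta in the lossy bound: 2 * max(eps / (1 - eps), 1)."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return 2.0 * max(eps / (1.0 - eps), 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    print(lossy_sample([1, 0, 1, 1], 0.3, rng))
    print(noisy_sample([1, 0, 1, 1], 0.3, rng))
    print(lossy_rate_exponent(0.25), lossy_rate_exponent(0.75))
```

The exponent stays at 2 (the parametric rate) while $\epsilon \le 1/2$ and grows once $\epsilon$ exceeds $1/2$, because $\epsilon/(1-\epsilon) > 1$ exactly when $\epsilon > 1/2$. This is the phase transition the abstract describes.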
Bayesian Belief Updating of Spatiotemporal Seizure Dynamics
Title | Bayesian Belief Updating of Spatiotemporal Seizure Dynamics |
Authors | Gerald K Cooray, Richard Rosch, Torsten Baldeweg, Louis Lemieux, Karl Friston, Biswa Sengupta |
Abstract | Epileptic seizure activity shows complicated dynamics in both space and time. To understand the evolution and propagation of seizures, spatially extended sets of data need to be analysed. We have previously described an efficient filtering scheme using variational Laplace that can be used in the Dynamic Causal Modelling (DCM) framework [Friston, 2003] to estimate the temporal dynamics of seizures recorded using either invasive or non-invasive electrical recordings (EEG/ECoG). Spatiotemporal dynamics are modelled using a partial differential equation – in contrast to the ordinary differential equation used in our previous work on temporal estimation of seizure dynamics [Cooray, 2016]. We provide the requisite theoretical background for the method and test the ensuing scheme on simulated seizure activity data and empirical invasive ECoG data. The method provides a framework to assimilate the spatial and temporal dynamics of seizure activity, an aspect of great physiological and clinical importance. |
Tasks | EEG |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07278v2 |
http://arxiv.org/pdf/1705.07278v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-belief-updating-of-spatiotemporal |
Repo | |
Framework | |
A General Model for Robust Tensor Factorization with Unknown Noise
Title | A General Model for Robust Tensor Factorization with Unknown Noise |
Authors | Xi’ai Chen, Zhi Han, Yao Wang, Qian Zhao, Deyu Meng, Lin Lin, Yandong Tang |
Abstract | Because of the limitations of matrix factorization, such as losing spatial structure information, the concept of low-rank tensor factorization (LRTF) has been applied to the recovery of a low-dimensional subspace from high-dimensional visual data. Low-rank tensor recovery is generally achieved by minimizing a loss function between the observed data and the factorization representation. The loss function is designed in various forms under different noise distribution assumptions, such as the $L_1$ norm for a Laplacian distribution and the $L_2$ norm for a Gaussian distribution. However, these losses often fail on real data corrupted by noise with an unknown distribution. In this paper, we propose a generalized weighted low-rank tensor factorization method (GWLRTF) integrated with the idea of noise modelling. This procedure treats the target data directly as a high-order tensor and models the noise by a Mixture of Gaussians, which is called MoG GWLRTF. The parameters in the model are estimated under the EM framework and through a newly developed algorithm for weighted low-rank tensor factorization. We provide two versions of the algorithm with different tensor factorization operations, i.e., CP factorization and Tucker factorization. Extensive experiments indicate the respective advantages of these two versions in different applications and also demonstrate the effectiveness of MoG GWLRTF compared with other competing methods. |
Tasks | |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06755v1 |
http://arxiv.org/pdf/1705.06755v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-model-for-robust-tensor |
Repo | |
Framework | |
The 2D Tree Sliding Window Discrete Fourier Transform
Title | The 2D Tree Sliding Window Discrete Fourier Transform |
Authors | Lee F. Richardson, William F. Eddy |
Abstract | We present a new algorithm for the 2D Sliding Window Discrete Fourier Transform (SWDFT). Our algorithm avoids repeating calculations in overlapping windows by storing them in a tree data-structure based on the ideas of the Cooley-Tukey Fast Fourier Transform (FFT). For an $N_0 \times N_1$ array and $n_0 \times n_1$ windows, our algorithm takes $O(N_0 N_1 n_0 n_1)$ operations. We provide a C implementation of our algorithm for the Radix-2 case, compare ours with existing algorithms, and show how our algorithm easily extends to higher dimensions. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08213v2 |
http://arxiv.org/pdf/1707.08213v2.pdf | |
PWC | https://paperswithcode.com/paper/the-2d-tree-sliding-window-discrete-fourier |
Repo | |
Framework | |
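For orientation, the quantity a 2D SWDFT computes can be written as a brute-force baseline: one full 2D DFT per window position. The sketch below is that naive baseline, not the tree algorithm from the paper, and it recomputes everything in overlapping windows, which is exactly the redundancy the paper aims to avoid. Indexing windows by their top-left corner is an assumption here, and the function names are hypothetical.

```python
import cmath


def dft2(window):
    """Direct 2D DFT of a small rectangular window given as nested lists."""
    n0, n1 = len(window), len(window[0])
    out = [[0j] * n1 for _ in range(n0)]
    for k0 in range(n0):
        for k1 in range(n1):
            total = 0j
            for j0 in range(n0):
                for j1 in range(n1):
                    angle = -2.0 * cmath.pi * (k0 * j0 / n0 + k1 * j1 / n1)
                    total += window[j0][j1] * cmath.exp(1j * angle)
            out[k0][k1] = total
    return out


def swdft2_naive(x, n0, n1):
    """Map each window's top-left corner (p0, p1) to the 2D DFT of that window."""
    big_n0, big_n1 = len(x), len(x[0])
    result = {}
    for p0 in range(big_n0 - n0 + 1):
        for p1 in range(big_n1 - n1 + 1):
            window = [row[p1:p1 + n1] for row in x[p0:p0 + n0]]
            result[(p0, p1)] = dft2(window)
    return result
```

With a direct DFT per window, this baseline costs on the order of $N_0 N_1 n_0^2 n_1^2$ operations, compared with the $O(N_0 N_1 n_0 n_1)$ stated in the abstract for the tree algorithm.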
Recent Advances in Zero-shot Recognition
Title | Recent Advances in Zero-shot Recognition |
Authors | Yanwei Fu, Tao Xiang, Yu-Gang Jiang, Xiangyang Xue, Leonid Sigal, Shaogang Gong |
Abstract | With the recent renaissance of deep convolutional neural networks, encouraging breakthroughs have been achieved on supervised recognition tasks, where each class has sufficient, fully annotated training data. However, scaling recognition to a large number of classes with few or no training samples for each class remains an unsolved problem. One approach to scaling up recognition is to develop models capable of recognizing unseen categories without any training instances, known as zero-shot recognition/learning. This article provides a comprehensive review of existing zero-shot recognition techniques, covering various aspects ranging from model representations to datasets and evaluation settings. We also overview related recognition tasks including one-shot and open set recognition, which can be used as natural extensions of zero-shot recognition when a limited number of class samples becomes available or when zero-shot recognition is implemented in a real-world setting. Importantly, we highlight the limitations of existing approaches and point out future research directions in this existing new research area. |
Tasks | Open Set Learning, Zero-Shot Learning |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.04837v1 |
http://arxiv.org/pdf/1710.04837v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-advances-in-zero-shot-recognition |
Repo | |
Framework | |
Reinforcement Mechanism Design for e-commerce
Title | Reinforcement Mechanism Design for e-commerce |
Authors | Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang |
Abstract | We study the problem of allocating impressions to sellers in e-commerce websites, such as Amazon, eBay or Taobao, aiming to maximize the total revenue generated by the platform. We employ a general framework of reinforcement mechanism design, which uses deep reinforcement learning to design efficient algorithms, taking the strategic behaviour of the sellers into account. Specifically, we model the impression allocation problem as a Markov decision process, where the states encode the history of impressions, prices, transactions and generated revenue and the actions are the possible impression allocations in each round. To tackle the problem of continuity and high-dimensionality of states and actions, we adopt the ideas of the DDPG algorithm to design an actor-critic policy gradient algorithm which takes advantage of the problem domain in order to achieve convergence and stability. We evaluate our proposed algorithm, coined IA(GRU), by comparing it against DDPG, as well as several natural heuristics, under different rationality models for the sellers - we assume that sellers follow well-known no-regret type strategies which may vary in their degree of sophistication. We find that IA(GRU) outperforms all algorithms in terms of the total revenue. |
Tasks | |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07607v3 |
http://arxiv.org/pdf/1708.07607v3.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-mechanism-design-for-e-commerce |
Repo | |
Framework | |
Real-Time Illegal Parking Detection System Based on Deep Learning
Title | Real-Time Illegal Parking Detection System Based on Deep Learning |
Authors | Xuemei Xie, Chenye Wang, Shu Chen, Guangming Shi, Zhifu Zhao |
Abstract | Illegal parking has become an increasingly serious problem. Current methods for detecting illegally parked vehicles are based on background segmentation. However, these methods are not robust and are sensitive to the environment. Benefitting from deep learning, this paper proposes a novel illegal vehicle parking detection system. Vehicles captured by camera are first located and classified by the Single Shot MultiBox Detector (SSD) algorithm. To improve performance, we optimize SSD by adjusting the aspect ratios of the default boxes to better fit our dataset. After that, tracking and movement analysis are used to identify illegally parked vehicles in the region of interest (ROI). Experiments show that the system achieves 99% accuracy and real-time (25 FPS) detection with strong robustness in complex environments. |
Tasks | |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02546v1 |
http://arxiv.org/pdf/1710.02546v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-illegal-parking-detection-system |
Repo | |
Framework | |
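The abstract mentions adjusting the aspect ratios of SSD default boxes. In the original SSD formulation, a default box with scale $s$ and aspect ratio $a$ has width $s\sqrt{a}$ and height $s/\sqrt{a}$, so its area stays at $s^2$ regardless of $a$. The sketch below evaluates that formula; the scale and aspect-ratio values in the usage line are purely illustrative, since the abstract does not state which ratios the authors chose.

```python
import math


def default_box_sizes(scale, aspect_ratios):
    """Return (width, height) pairs for SSD-style default boxes.

    Width is scale * sqrt(a) and height is scale / sqrt(a), both relative
    to the image size, so every box has area scale ** 2.
    """
    sizes = []
    for a in aspect_ratios:
        if a <= 0:
            raise ValueError("aspect ratios must be positive")
        root = math.sqrt(a)
        sizes.append((scale * root, scale / root))
    return sizes


if __name__ == "__main__":
    # Hypothetical values: wider boxes might suit vehicles seen from the side.
    for width, height in default_box_sizes(0.2, [1.0, 2.0, 4.0]):
        print(f"width={width:.3f} height={height:.3f}")
```

Changing the aspect-ratio list changes only the shapes of the candidate boxes, not their area, which is why it is a natural knob for matching the typical shape of objects in a given dataset.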
Machine-Translation History and Evolution: Survey for Arabic-English Translations
Title | Machine-Translation History and Evolution: Survey for Arabic-English Translations |
Authors | Nabeel T. Alsohybe, Neama Abdulaziz Dahan, Fadl Mutaher Ba-Alwi |
Abstract | As a result of rapid changes in information and communication technology (ICT), the world has become a small village where people from all over the world connect with each other in dialogue and communication via the Internet. Communication has also become a daily routine activity because of the new globalization, in which companies and even universities operate globally across national borders. As a result, translation has become a necessary activity in this connected world. ICT has made it possible for a student in one country to take a course, or even a degree, from a different country anytime and anywhere. The resulting communication still needs a language as a means of helping the receiver understand the contents of the sent message. People need automated translation applications because human translators are not always available and human translation is very expensive compared with automated translation. Several studies describe the electronic process of machine translation. In this paper, the authors review some of this previous research and explore some of the tools needed for machine translation. This research contributes to the machine translation area by giving future researchers a summary of the groups of machine translation research and by shedding light on the importance of the translation mechanism. |
Tasks | Machine Translation |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04685v1 |
http://arxiv.org/pdf/1709.04685v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-translation-history-and-evolution |
Repo | |
Framework | |
Change Detection under Global Viewpoint Uncertainty
Title | Change Detection under Global Viewpoint Uncertainty |
Authors | Murase Tomoya, Tanaka Kanji |
Abstract | This paper addresses the problem of change detection from a novel perspective of long-term map learning. We are particularly interested in designing an approach that can scale to large maps and that can function under global uncertainty in the viewpoint (i.e., GPS-denied situations). Our approach, which utilizes a compact bag-of-words (BoW) scene model, makes several contributions to the problem: 1) Two kinds of prior information are extracted from the view sequence map and used for change detection. Further, we propose a novel type of prior, called motion prior, to predict the relative motions of stationary objects and anomaly ego-motion detection. The proposed prior is also useful for distinguishing stationary from non-stationary objects. 2) A small set of good reference images (e.g., 10) are efficiently retrieved from the view sequence map by employing the recently developed Bag-of-Local-Convolutional-Features (BoLCF) scene model. 3) Change detection is reformulated as a scene retrieval over these reference images to find changed objects using a novel spatial Bag-of-Words (SBoW) scene model. Evaluations conducted of individual techniques and also their combinations on a challenging dataset of highly dynamic scenes in the publicly available Malaga dataset verify their efficacy. |
Tasks | Motion Detection |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00552v1 |
http://arxiv.org/pdf/1703.00552v1.pdf | |
PWC | https://paperswithcode.com/paper/change-detection-under-global-viewpoint |
Repo | |
Framework | |
Psychological and Personality Profiles of Political Extremists
Title | Psychological and Personality Profiles of Political Extremists |
Authors | Meysam Alizadeh, Ingmar Weber, Claudio Cioffi-Revilla, Santo Fortunato, Michael Macy |
Abstract | Global recruitment into radical Islamic movements has spurred renewed interest in the appeal of political extremism. Is the appeal a rational response to material conditions or is it the expression of psychological and personality disorders associated with aggressive behavior, intolerance, conspiratorial imagination, and paranoia? Empirical answers using surveys have been limited by lack of access to extremist groups, while field studies have lacked psychological measures and failed to compare extremists with contrast groups. We revisit the debate over the appeal of extremism in the U.S. context by comparing publicly available Twitter messages written by over 355,000 political extremist followers with messages written by non-extremist U.S. users. Analysis of text-based psychological indicators supports the moral foundation theory which identifies emotion as a critical factor in determining political orientation of individuals. Extremist followers also differ from others in four of the Big Five personality traits. |
Tasks | |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00119v1 |
http://arxiv.org/pdf/1704.00119v1.pdf | |
PWC | https://paperswithcode.com/paper/psychological-and-personality-profiles-of |
Repo | |
Framework | |
Object Detection in Videos with Tubelet Proposal Networks
Title | Object Detection in Videos with Tubelet Proposal Networks |
Authors | Kai Kang, Hongsheng Li, Tong Xiao, Wanli Ouyang, Junjie Yan, Xihui Liu, Xiaogang Wang |
Abstract | Object detection in videos has drawn increasing attention recently with the introduction of the large-scale ImageNet VID dataset. Different from object detection in static images, temporal information in videos is vital for object detection. To fully utilize temporal information, state-of-the-art methods are based on spatiotemporal tubelets, which are essentially sequences of associated bounding boxes across time. However, the existing methods have major limitations in generating tubelets in terms of quality and efficiency. Motion-based methods are able to obtain dense tubelets efficiently, but the lengths are generally only several frames, which is not optimal for incorporating long-term temporal information. Appearance-based methods, usually involving generic object tracking, could generate long tubelets, but are usually computationally expensive. In this work, we propose a framework for object detection in videos, which consists of a novel tubelet proposal network to efficiently generate spatiotemporal proposals, and a Long Short-term Memory (LSTM) network that incorporates temporal information from tubelet proposals for achieving high object detection accuracy in videos. Experiments on the large-scale ImageNet VID dataset demonstrate the effectiveness of the proposed framework for object detection in videos. |
Tasks | Object Detection, Object Tracking |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06355v2 |
http://arxiv.org/pdf/1702.06355v2.pdf | |
PWC | https://paperswithcode.com/paper/object-detection-in-videos-with-tubelet |
Repo | |
Framework | |
Graph Partitioning with Acyclicity Constraints
Title | Graph Partitioning with Acyclicity Constraints |
Authors | Orlando Moreira, Merten Popp, Christian Schulz |
Abstract | Graphs are widely used to model execution dependencies in applications. In particular, the NP-complete problem of partitioning a graph under constraints receives enormous attention from researchers because of its applicability in multiprocessor scheduling. We identified the additional constraint of acyclic dependencies between blocks when mapping computer vision and imaging applications to a heterogeneous embedded multiprocessor. Existing algorithms and heuristics do not address this requirement and deliver results that are not applicable to our use case. In this work, we show that this more constrained version of the graph partitioning problem is NP-complete and present heuristics that achieve a close approximation of the optimal solution found by an exhaustive search for small problem instances and much better scalability for larger instances. In addition, we show a positive impact on the schedule of a real imaging application, with improved communication volume and execution time. |
Tasks | graph partitioning |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00705v1 |
http://arxiv.org/pdf/1704.00705v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-partitioning-with-acyclicity |
Repo | |
Framework | |
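The acyclicity constraint from the abstract can be checked mechanically: contract each block to a single node and test whether the resulting quotient graph is still a directed acyclic graph. The sketch below does this with Kahn's topological-sort algorithm. It only verifies a given partition and is not one of the paper's partitioning heuristics; the function and argument names are hypothetical.

```python
from collections import deque


def quotient_is_acyclic(edges, block_of):
    """Return True if the dependencies between blocks contain no cycle.

    `edges` is an iterable of (u, v) pairs meaning u must run before v;
    `block_of` maps every node to the identifier of its block.
    """
    blocks = set(block_of.values())
    successors = {b: set() for b in blocks}
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:  # edges inside a block do not constrain block order
            successors[bu].add(bv)

    indegree = {b: 0 for b in blocks}
    for targets in successors.values():
        for t in targets:
            indegree[t] += 1

    # Kahn's algorithm: repeatedly remove blocks with no pending predecessors.
    ready = deque(b for b in blocks if indegree[b] == 0)
    removed = 0
    while ready:
        b = ready.popleft()
        removed += 1
        for t in successors[b]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    return removed == len(blocks)
```

For the chain a → b → c, placing a and c in one block and b in another creates a cycle between the two blocks, so that partition is rejected even though the original graph is acyclic. Placing a and b together and c alone is accepted.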
Object Referring in Visual Scene with Spoken Language
Title | Object Referring in Visual Scene with Spoken Language |
Authors | Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool |
Abstract | Object referring has important applications, especially for human-machine interaction. While it has received great attention, the task is mainly attacked with written language (text) as input rather than spoken language (speech), which is more natural. This paper investigates Object Referring with Spoken Language (ORSpoken) by presenting two datasets and one novel approach. Objects are annotated with their locations in images, text descriptions and speech descriptions. This makes the datasets ideal for multi-modality learning. The approach is developed by carefully breaking the ORSpoken problem down into three sub-problems and introducing task-specific vision-language interactions at the corresponding levels. Experiments show that our method outperforms competing methods consistently and significantly. The approach is also evaluated in the presence of audio noise, showing the efficacy of the proposed vision-language interaction methods in counteracting background noise. |
Tasks | |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.03800v2 |
http://arxiv.org/pdf/1711.03800v2.pdf | |
PWC | https://paperswithcode.com/paper/object-referring-in-visual-scene-with-spoken |
Repo | |
Framework | |