Paper Group ANR 639
Detecting Curve Text in the Wild: New Dataset and New Solution
Title | Detecting Curve Text in the Wild: New Dataset and New Solution |
Authors | Liu Yuliang, Jin Lianwen, Zhang Shuaitao, Zhang Sheng |
Abstract | Scene text detection has made great progress in recent years. Detection methods have evolved from axis-aligned rectangles to rotated rectangles and further to quadrangles. However, current datasets contain very little curve text, which is widely observed in scene images such as signboards and product names. To draw attention to reading curve text in the wild, in this paper we construct a curve text dataset named CTW1500, which includes over 10k text annotations in 1,500 images (1,000 for training and 500 for testing). Based on this dataset, we propose a pioneering polygon-based curve text detector (CTD) which can directly detect curve text without empirical combination. Moreover, by seamlessly integrating the recurrent transverse and longitudinal offset connection (TLOC), the proposed method can be trained end-to-end to learn the inherent connection among the position offsets. This allows the CTD to explore context information instead of predicting points independently, resulting in smoother and more accurate detection. We also propose two simple but effective post-processing methods named non-polygon suppress (NPS) and polygonal non-maximum suppression (PNMS) to further improve the detection accuracy. Furthermore, the proposed approach is designed in a universal manner, so it can also be trained with rectangular or quadrilateral bounding boxes without extra effort. Experimental results on CTW1500 demonstrate that our method, with only a light backbone, can outperform state-of-the-art methods by a large margin. When evaluated only on the curve or non-curve subset, the CTD + TLOC still achieves the best results. Code is available at https://github.com/Yuliang-Liu/Curve-Text-Detector. |
Tasks | Curved Text Detection, Scene Text Detection |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02170v1 |
http://arxiv.org/pdf/1712.02170v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-curve-text-in-the-wild-new-dataset |
Repo | |
Framework | |
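The abstract above names polygonal non-maximum suppression (PNMS) but does not describe it in detail. As a rough illustration only, the following sketch applies ordinary greedy NMS to polygons, with the overlap estimated by sampling grid cells. The function names, the grid-sampling IoU, and the 0.5 threshold are all assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices of a simple polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_mask(poly, step=1.0):
    """Set of grid cells (ix, iy) whose centres fall inside the polygon."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    ix0, ix1 = math.floor(min(xs) / step), math.ceil(max(xs) / step)
    iy0, iy1 = math.floor(min(ys) / step), math.ceil(max(ys) / step)
    cells = set()
    for ix in range(ix0, ix1):
        for iy in range(iy0, iy1):
            if point_in_polygon((ix + 0.5) * step, (iy + 0.5) * step, poly):
                cells.add((ix, iy))
    return cells


def polygon_iou(a, b, step=1.0):
    """Approximate intersection-over-union of two polygons by cell counting."""
    mask_a, mask_b = polygon_mask(a, step), polygon_mask(b, step)
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def polygon_nms(polys, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring polygons, drop heavy overlaps."""
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(polygon_iou(polys[i], polys[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Because the overlap is computed from point-in-polygon tests, the sketch also works for non-convex outlines such as curved text regions, at the cost of accuracy that depends on the grid step.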
Visual Integration of Data and Model Space in Ensemble Learning
Title | Visual Integration of Data and Model Space in Ensemble Learning |
Authors | Bruno Schneider, Dominik Jäckle, Florian Stoffel, Alexandra Diehl, Johannes Fuchs, Daniel Keim |
Abstract | Ensembles of classifier models typically deliver superior performance and can outperform single classifier models given a dataset and classification task at hand. However, the gain in performance comes with a loss of comprehensibility, making it challenging to understand how each model affects the classification outputs and where the errors come from. We propose a tight visual integration of the data and the model space for exploring and combining classifier models. We introduce a workflow that builds upon the visual integration and enables the effective exploration of classification outputs and models. We then present a use case in which we start with an ensemble automatically selected by a standard ensemble selection algorithm, and show how we can manipulate models and alternative combinations. |
Tasks | |
Published | 2017-10-19 |
URL | http://arxiv.org/abs/1710.07322v1 |
http://arxiv.org/pdf/1710.07322v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-integration-of-data-and-model-space-in |
Repo | |
Framework | |
Photometric stereo for strong specular highlights
Title | Photometric stereo for strong specular highlights |
Authors | Maryam Khanian, Ali Sharifi Boroujerdi, Michael Breuß |
Abstract | Photometric stereo (PS) is a fundamental technique in computer vision known to produce 3-D shape with high accuracy. The setting of PS is defined by using several input images of a static scene taken from one and the same camera position but under varying illumination. The vast majority of studies in this 3-D reconstruction method assume orthographic projection for the camera model. In addition, they mainly consider the Lambertian reflectance model as the way that light scatters at surfaces. So, providing reliable PS results from real world objects still remains a challenging task. We address 3-D reconstruction by PS using a more realistic set of assumptions combining for the first time the complete Blinn-Phong reflectance model and perspective projection. To this end, we will compare two different methods of incorporating the perspective projection into our model. Experiments are performed on both synthetic and real world images. Note that our real-world experiments do not benefit from laboratory conditions. The results show the high potential of our method even for complex real world applications such as medical endoscopy images which may include high amounts of specular highlights. |
Tasks | |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01357v1 |
http://arxiv.org/pdf/1709.01357v1.pdf | |
PWC | https://paperswithcode.com/paper/photometric-stereo-for-strong-specular |
Repo | |
Framework | |
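For orientation, the Blinn-Phong reflectance model mentioned in the abstract above is usually written in the following textbook form. The symbols are assumptions of this note and the paper's own parameterization may differ: N is the unit surface normal, L the unit light direction, V the unit viewing direction, k_a, k_d, k_s the ambient, diffuse and specular coefficients, and α the shininess exponent.

```latex
I = k_a + k_d \max(0, \mathbf{N} \cdot \mathbf{L})
        + k_s \max(0, \mathbf{N} \cdot \mathbf{H})^{\alpha},
\qquad
\mathbf{H} = \frac{\mathbf{L} + \mathbf{V}}{\lVert \mathbf{L} + \mathbf{V} \rVert}
```

Setting k_a = 0 and k_s = 0 recovers the Lambertian model that, according to the abstract, most photometric stereo studies assume.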
leave a trace - A People Tracking System Meets Anomaly Detection
Title | leave a trace - A People Tracking System Meets Anomaly Detection |
Authors | Dominik Rueß, Konstantinos Amplianitis, Niklas Deckers, Michele Adduci, Kristian Manthey, Ralf Reulke |
Abstract | Video surveillance has always had a negative connotation, among other reasons because of the loss of privacy and because it may not automatically increase public safety. If it were able to detect atypical (i.e. dangerous) situations in real time, autonomously and anonymously, this could change. A prerequisite for this is a reliable automatic detection of possibly dangerous situations from video data. This is done classically by object extraction and tracking. From the derived trajectories, we then want to determine dangerous situations by detecting atypical trajectories. However, due to ethical considerations it is better to develop such a system on data without people being threatened or even harmed, and with them knowing that such a tracking system is installed. Another important point is that these situations do not occur very often in real, public CCTV areas and are captured properly even less often. In the artistic project leave a trace the tracked objects, people in an atrium of an institutional building, become actors and thus part of the installation. Visualisation in real time allows interaction by these actors, which in turn creates many atypical interaction situations on which we can develop our situation detection. The data set has evolved over three years and hence is huge. In this article we describe the tracking system and several approaches for the detection of atypical trajectories. |
Tasks | Anomaly Detection |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06557v1 |
http://arxiv.org/pdf/1707.06557v1.pdf | |
PWC | https://paperswithcode.com/paper/leave-a-trace-a-people-tracking-system-meets |
Repo | |
Framework | |
Vector Quantization using the Improved Differential Evolution Algorithm for Image Compression
Title | Vector Quantization using the Improved Differential Evolution Algorithm for Image Compression |
Authors | Sayan Nag |
Abstract | Vector Quantization (VQ) is a popular image compression technique with a simple decoding architecture and a high compression ratio. Codebook design is the most essential part of Vector Quantization. Linde-Buzo-Gray (LBG) is a traditional method for generating a VQ codebook, but it results in lower PSNR values. The codebook affects the quality of image compression, so the choice of an appropriate codebook is a must. Several optimization techniques have been proposed for global codebook generation to enhance the quality of image compression. In this paper, a novel algorithm called IDE-LBG is proposed, which uses an Improved Differential Evolution Algorithm coupled with LBG to generate optimum VQ codebooks. The proposed IDE works better than the traditional DE thanks to modifications in the scaling factor and the boundary control mechanism. The IDE generates better solutions by efficient exploration and exploitation of the search space. The best solution obtained by the IDE is then provided as the initial codebook for the LBG. This approach produces an efficient codebook with less computational time, and the results include excellent PSNR values and superior-quality reconstructed images. It is observed that the proposed IDE-LBG finds better VQ codebooks than IPSO-LBG, BA-LBG and FA-LBG. |
Tasks | Efficient Exploration, Image Compression, Quantization |
Published | 2017-10-15 |
URL | http://arxiv.org/abs/1710.05311v1 |
http://arxiv.org/pdf/1710.05311v1.pdf | |
PWC | https://paperswithcode.com/paper/vector-quantization-using-the-improved |
Repo | |
Framework | |
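The abstract above describes a two-stage pipeline: an optimizer supplies an initial codebook, and LBG then refines it. The sketch below covers only the second stage, as a plain generalized-Lloyd iteration, plus the usual PSNR formula. The differential evolution stage is not reproduced; the initial codebook is simply passed in. All function names are hypothetical and chosen for this example.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(vector, codebook[k]))


def lbg_refine(vectors, codebook, iterations=20):
    """Refine an initial codebook by alternating assignment and centroid steps."""
    codebook = [list(c) for c in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(v, codebook)].append(v)
        for k, members in enumerate(cells):
            if members:  # codewords of empty cells are left unchanged
                dim = len(members[0])
                codebook[k] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


def mean_squared_error(vectors, codebook):
    """Per-component distortion after quantizing every vector."""
    total = sum(squared_distance(v, codebook[nearest(v, codebook)])
                for v in vectors)
    return total / (len(vectors) * len(vectors[0]))


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

In an image setting the vectors would typically be flattened pixel blocks (for example 4x4 patches), and a better starting codebook tends to leave the iteration in a better local optimum, which is the motivation the abstract gives for seeding LBG with an optimizer.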
FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference
Title | FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference |
Authors | Tianyu Wang, Marco Morucci, M. Usaid Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky |
Abstract | A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional categorical datasets. This method, called FLAME (Fast Large-scale Almost Matching Exactly), learns a distance metric for matching using a hold-out training data set. In order to perform matching efficiently for large datasets, FLAME leverages techniques that are natural for query processing in the area of database management, and two implementations of FLAME are provided: the first uses SQL queries and the second uses bit-vector techniques. The algorithm starts by constructing matches of the highest quality (exact matches on all covariates), and successively eliminates variables in order to match exactly on as many variables as possible, while still maintaining interpretable high-quality matches and balance between treatment and control groups. We leverage these high quality matches to estimate conditional average treatment effects (CATEs). Our experiments show that FLAME scales to huge datasets with millions of observations where existing state-of-the-art methods fail, and that it achieves significantly better performance than other matching methods. |
Tasks | Causal Inference |
Published | 2017-07-19 |
URL | https://arxiv.org/abs/1707.06315v7 |
https://arxiv.org/pdf/1707.06315v7.pdf | |
PWC | https://paperswithcode.com/paper/flame-a-fast-large-scale-almost-matching |
Repo | |
Framework | |
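The matching loop described in the abstract above (match exactly on all covariates first, then eliminate covariates one at a time) can be sketched as follows. This is a simplification under stated assumptions: the order in which covariates are dropped is supplied by the caller, whereas the abstract says FLAME learns which variables matter from a hold-out training set; matched units are removed from later rounds; and each group's effect estimate is a plain difference of mean outcomes. The function and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def flame_like_matching(units, covariates, drop_order):
    """Match treated and control units exactly on progressively fewer covariates.

    units: list of dicts with the covariate keys, 'treated' (bool), 'outcome'.
    covariates: covariate names used in the first (fully exact) round.
    drop_order: covariates to eliminate, one after each round (assumed given).
    """
    remaining = list(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    groups = []
    while remaining and active:
        # Bucket the unmatched units by their values on the active covariates.
        buckets = defaultdict(list)
        for i in remaining:
            buckets[tuple(units[i][c] for c in active)].append(i)
        matched = set()
        for values, members in buckets.items():
            treated = [i for i in members if units[i]["treated"]]
            control = [i for i in members if not units[i]["treated"]]
            if treated and control:  # a valid group needs both arms
                effect = (mean(units[i]["outcome"] for i in treated)
                          - mean(units[i]["outcome"] for i in control))
                groups.append({"covariates": tuple(active), "values": values,
                               "units": members, "cate": effect})
                matched.update(members)
        remaining = [i for i in remaining if i not in matched]
        if not to_drop:
            break
        active.remove(to_drop.pop(0))
    return groups
```

Each returned group records which covariates it was matched on, so a group formed in the first round is an exact match on everything, while later groups are exact only on the covariates that were still active.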
Deep 3D Face Identification
Title | Deep 3D Face Identification |
Authors | Donghyun Kim, Matthias Hernandez, Jongmoo Choi, Gerard Medioni |
Abstract | We propose a novel 3D face recognition algorithm using a deep convolutional neural network (DCNN) and a 3D augmentation technique. The performance of 2D face recognition algorithms has significantly increased by leveraging the representational power of deep neural networks and the use of large-scale labeled training data. As opposed to 2D face recognition, training discriminative deep features for 3D face recognition is very difficult due to the lack of large-scale 3D face datasets. In this paper, we show that transfer learning from a CNN trained on 2D face images can effectively work for 3D face recognition by fine-tuning the CNN with a relatively small number of 3D facial scans. We also propose a 3D face augmentation technique which synthesizes a number of different facial expressions from a single 3D face scan. Our proposed method shows excellent recognition results on Bosphorus, BU-3DFE, and 3D-TEC datasets, without using hand-crafted features. The 3D identification using our deep features also scales well for large databases. |
Tasks | Face Identification, Face Recognition, Transfer Learning |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10714v1 |
http://arxiv.org/pdf/1703.10714v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-3d-face-identification |
Repo | |
Framework | |
Mapping the Americanization of English in Space and Time
Title | Mapping the Americanization of English in Space and Time |
Authors | Bruno Gonçalves, Lucía Loureiro-Porto, José J. Ramasco, David Sánchez |
Abstract | As global political preeminence gradually shifted from the United Kingdom to the United States, so did the capacity to culturally influence the rest of the world. In this work, we analyze how the world-wide varieties of written English are evolving. We study both the spatial and temporal variations of vocabulary and spelling of English using a large corpus of geolocated tweets and the Google Books datasets corresponding to books published in the US and the UK. The advantage of our approach is that we can address both standard written language (Google Books) and the more colloquial forms of microblogging messages (Twitter). We find that American English is the dominant form of English outside the UK and that its influence is felt even within the UK borders. Finally, we analyze how this trend has evolved over time and the impact that some cultural events have had in shaping it. |
Tasks | |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00781v2 |
http://arxiv.org/pdf/1707.00781v2.pdf | |
PWC | https://paperswithcode.com/paper/mapping-the-americanization-of-english-in |
Repo | |
Framework | |
Brain Abnormality Detection by Deep Convolutional Neural Network
Title | Brain Abnormality Detection by Deep Convolutional Neural Network |
Authors | Mina Rezaei, Haojin Yang, Christoph Meinel |
Abstract | In this paper, we describe our method for classifying brain magnetic resonance (MR) images into different abnormality classes and a healthy class using a deep neural network. Our method detects high- and low-grade glioma, multiple sclerosis, and Alzheimer's disease as well as healthy cases. Our network architecture has ten learning layers: seven convolutional layers and three fully connected layers. We have achieved a promising result of 95.7% accuracy on the five-category brain image classification task. |
Tasks | Anomaly Detection |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05206v1 |
http://arxiv.org/pdf/1708.05206v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-abnormality-detection-by-deep |
Repo | |
Framework | |
Recurrent Neural Networks for Online Video Popularity Prediction
Title | Recurrent Neural Networks for Online Video Popularity Prediction |
Authors | Tomasz Trzcinski, Pawel Andruszkiewicz, Tomasz Bochenski, Przemyslaw Rokita |
Abstract | In this paper, we address the problem of popularity prediction of online videos shared in social media. We prove that this challenging task can be approached using recently proposed deep neural network architectures. We cast the popularity prediction problem as a classification task and we aim to solve it using only visual cues extracted from videos. To that end, we propose a new method based on a Long-term Recurrent Convolutional Network (LRCN) that incorporates the sequentiality of the information in the model. Results obtained on a dataset of over 37,000 videos published on Facebook show that using our method leads to an improvement of over 30% in prediction performance over the traditional shallow approaches and can provide valuable insights for content creators. |
Tasks | |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06807v1 |
http://arxiv.org/pdf/1707.06807v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-online-video |
Repo | |
Framework | |
Robust Multi-view Pedestrian Tracking Using Neural Networks
Title | Robust Multi-view Pedestrian Tracking Using Neural Networks |
Authors | Md Zahangir Alom, Tarek M. Taha |
Abstract | In this paper, we present a real-time, robust multi-view pedestrian detection and tracking system for video surveillance using neural networks, which can be used in dynamic environments. The proposed system consists of two phases: multi-view pedestrian detection and tracking. First, pedestrian detection uses background subtraction to segment the foreground blob. To extract the foreground region, an adaptive background subtraction method is applied in which each pixel of the input image is modeled as a mixture of Gaussians and an on-line approximation is used to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This method produces a steady, real-time tracker in outdoor environments that consistently deals with changes in lighting conditions and long-term scene changes. Second, tracking is performed in two phases: pedestrian classification and tracking of the individual subject. A sliding window is applied to the foreground binary image to select an input window, which is used to select the input image patches from the actual input frame. A neural network is used for classification with PHOG features. Finally, a Kalman filter is applied to calculate the subsequent step for tracking, which aims at finding the exact position of pedestrians in an input image. The experimental results show that the proposed approach yields promising performance on multi-view pedestrian detection and tracking on different benchmark datasets. |
Tasks | Pedestrian Detection |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06370v1 |
http://arxiv.org/pdf/1704.06370v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-multi-view-pedestrian-tracking-using |
Repo | |
Framework | |
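The abstract above says a Kalman filter is used to compute the next tracking step but does not give its state model. As an illustration, the sketch below assumes a constant-velocity model along one image axis, with the state holding position and velocity and only the position being measured. The class name, the noise settings, and the use of one independent filter per axis are assumptions of this example, not details from the paper.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state [position, velocity]."""

    def __init__(self, position=0.0, velocity=0.0, initial_var=100.0,
                 process_var=0.01, measurement_var=1.0):
        self.position = position
        self.velocity = velocity
        # 2x2 state covariance stored as four scalars.
        self.p00, self.p01 = initial_var, 0.0
        self.p10, self.p11 = 0.0, initial_var
        self.q = process_var      # assumed diagonal process noise
        self.r = measurement_var  # variance of the position measurement

    def predict(self, dt=1.0):
        """Propagate state and covariance with F = [[1, dt], [0, 1]]."""
        self.position += self.velocity * dt
        p00 = (self.p00 + dt * (self.p01 + self.p10)
               + dt * dt * self.p11 + self.q)
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.position

    def update(self, measured_position):
        """Correct the prediction with a position measurement (H = [1, 0])."""
        innovation = measured_position - self.position
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        self.position += k0 * innovation
        self.velocity += k1 * innovation
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.position
```

A pedestrian's image position could be tracked with two such filters, one for x and one for y: call predict() once per frame to obtain the expected location, then update() with the detected location when a detection is available.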
RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud
Title | RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud |
Authors | Jianfeng Chen, Tim Menzies |
Abstract | Cloud computing provides engineers and scientists with a place to run complex computing tasks. Finding a workflow's deployment configuration in a cloud environment is not easy. Traditional workflow scheduling algorithms were based on heuristics (e.g. reliability greedy, cost greedy, cost-time balancing) or, more recently, on meta-heuristic methods such as genetic algorithms. These methods are very slow and not suitable for rescheduling in the dynamic cloud environment. This paper introduces RIOT (Randomized Instance Order Types), a stochastic-based method for workflow scheduling. RIOT groups the tasks in the workflow into virtual machines via a probability model and then uses an effective surrogate-based method to assess a large number of potential schedules. Experiments on dozens of case studies showed that RIOT executes tens of times faster than traditional methods while generating results comparable to those of other methods. |
Tasks | |
Published | 2017-08-27 |
URL | http://arxiv.org/abs/1708.08127v2 |
http://arxiv.org/pdf/1708.08127v2.pdf | |
PWC | https://paperswithcode.com/paper/riot-a-stochastic-based-method-for-workflow |
Repo | |
Framework | |
A Survey of Quantum Learning Theory
Title | A Survey of Quantum Learning Theory |
Authors | Srinivasan Arunachalam, Ronald de Wolf |
Abstract | This paper surveys quantum learning theory: the theoretical aspects of machine learning using quantum computers. We describe the main results known for three models of learning: exact learning from membership queries, and Probably Approximately Correct (PAC) and agnostic learning from classical or quantum examples. |
Tasks | |
Published | 2017-01-24 |
URL | http://arxiv.org/abs/1701.06806v3 |
http://arxiv.org/pdf/1701.06806v3.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-quantum-learning-theory |
Repo | |
Framework | |
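As a reminder of what the PAC model named in the abstract above requires, the standard classical definition can be stated as follows. The notation is an assumption of this note: c is the target concept, D the unknown example distribution, S a sample of m labeled examples, and h_S the hypothesis the learner outputs.

```latex
\Pr_{S \sim D^m}\!\left[ \operatorname{err}_D(h_S) \le \varepsilon \right] \ge 1 - \delta,
\qquad
\operatorname{err}_D(h) = \Pr_{x \sim D}\!\left[ h(x) \ne c(x) \right]
```

The condition must hold for every target concept in the class, every distribution D, and every choice of accuracy ε and confidence δ in (0, 1).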
On the Limits of Learning Representations with Label-Based Supervision
Title | On the Limits of Learning Representations with Label-Based Supervision |
Authors | Jiaming Song, Russell Stewart, Shengjia Zhao, Stefano Ermon |
Abstract | Advances in neural-network-based classifiers have transformed automatic feature learning from a pipe dream of stronger AI into a routine and expected property of practical systems. Since the emergence of AlexNet, every winning submission of the ImageNet challenge has employed end-to-end representation learning, and due to the utility of good representations for transfer learning, representation learning has become an important task distinct from supervised learning. At present, this distinction is inconsequential, as supervised methods are state-of-the-art in learning transferable representations. But recent work has shown that generative models can also be powerful agents of representation learning. Will the representations learned from these generative methods ever rival the quality of those from their supervised competitors? In this work, we argue in the affirmative: from an information-theoretic perspective, generative models have greater potential for representation learning. Based on several experimentally validated assumptions, we show that supervised learning is upper bounded in its capacity for representation learning in ways that certain generative models, such as Generative Adversarial Networks (GANs), are not. We hope that our analysis will provide a rigorous motivation for further exploration of generative representation learning. |
Tasks | Representation Learning, Transfer Learning |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02156v1 |
http://arxiv.org/pdf/1703.02156v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-limits-of-learning-representations |
Repo | |
Framework | |
Android Malware Characterization using Metadata and Machine Learning Techniques
Title | Android Malware Characterization using Metadata and Machine Learning Techniques |
Authors | Ignacio Martín, José Alberto Hernández, Alfonso Muñoz, Antonio Guzmán |
Abstract | Android Malware has emerged as a consequence of the increasing popularity of smartphones and tablets. While most previous work focuses on inherent characteristics of Android apps to detect malware, this study analyses indirect features and meta-data to identify patterns in malware applications. Our experiments show that: (1) the permissions used by an application offer only moderate performance results; (2) other features publicly available at Android Markets are more relevant in detecting malware, such as the application developer and certificate issuer, and (3) compact and efficient classifiers can be constructed for the early detection of malware applications prior to code inspection or sandboxing. |
Tasks | |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04402v1 |
http://arxiv.org/pdf/1712.04402v1.pdf | |
PWC | https://paperswithcode.com/paper/android-malware-characterization-using |
Repo | |
Framework | |