Paper Group ANR 421
Video Key Frame Extraction using Entropy value as Global and Local Feature
Title | Video Key Frame Extraction using Entropy value as Global and Local Feature |
Authors | Siddu P Algur, Vivek R |
Abstract | Key frames play an important role in video annotation. Key-frame extraction is one of the most widely used methods for video abstraction, as it helps process large sets of video data quickly while retaining sufficient content representation. In this paper, a novel approach for key-frame extraction using entropy values is proposed. The proposed approach classifies frames based on entropy values as a global feature and selects a frame from each class as the representative key-frame. It also eliminates redundant frames from the selected key-frames using entropy values as a local feature. An evaluation of the approach on several video clips is presented. Results show that the algorithm is successful in helping annotators automatically identify video key-frames. |
Tasks | |
Published | 2016-05-28 |
URL | http://arxiv.org/abs/1605.08857v1, http://arxiv.org/pdf/1605.08857v1.pdf |
PWC | https://paperswithcode.com/paper/video-key-frame-extraction-using-entropy |
Repo | |
Framework | |
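The abstract above describes grouping frames by an entropy value and keeping one representative per group. A minimal sketch of that idea follows, assuming each frame is a flat list of gray-level values. The equal-width binning rule, the choice of the first frame in each bin, and the names `frame_entropy` and `pick_key_frames` are illustrative assumptions, not the paper's actual classification or redundancy-elimination steps.

```python
import math
from collections import Counter


def frame_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level histogram of one frame."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pick_key_frames(frames, num_bins=4):
    """Assumed rule: split the entropy range into equal-width bins and keep
    the index of the first frame that falls into each non-empty bin."""
    entropies = [frame_entropy(f) for f in frames]
    lo, hi = min(entropies), max(entropies)
    width = (hi - lo) / num_bins or 1.0  # avoid a zero width when all entropies match
    chosen = {}
    for idx, e in enumerate(entropies):
        b = min(int((e - lo) / width), num_bins - 1)
        chosen.setdefault(b, idx)  # keep only the first frame seen in each bin
    return sorted(chosen.values())
```

For example, two flat frames followed by a frame with four distinct gray levels fall into two bins, so `pick_key_frames` keeps one flat frame and the textured one.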
Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems
Title | Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems |
Authors | Zijun Wu, Rolf Moehring, Jianhui Lai |
Abstract | This article analyzes the stochastic runtime of a Cross-Entropy Algorithm on two classes of traveling salesman problems. The algorithm shares main features of the famous Max-Min Ant System with iteration-best reinforcement. For simple instances that have a $\{1,n\}$-valued distance function and a unique optimal solution, we prove a stochastic runtime of $O(n^{6+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3+\epsilon}\ln n)$ with the edge-based random solution generation for an arbitrary $\epsilon\in (0,1)$. These runtimes are very close to the known expected runtime for variants of Max-Min Ant System with best-so-far reinforcement. They are obtained for the stronger notion of stochastic runtime, which means that an optimal solution is obtained in that time with an overwhelming probability, i.e., a probability tending exponentially fast to one with growing problem size. We also inspect more complex instances with $n$ vertices positioned on an $m\times m$ grid. When the $n$ vertices span a convex polygon, we obtain a stochastic runtime of $O(n^{3}m^{5+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{2}m^{5+\epsilon})$ for the edge-based random solution generation. When there are $k = O(1)$ many vertices inside a convex polygon spanned by the other $n-k$ vertices, we obtain a stochastic runtime of $O(n^{4}m^{5+\epsilon}+n^{6k-1}m^{\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3}m^{5+\epsilon}+n^{3k}m^{\epsilon})$ with the edge-based random solution generation. These runtimes are better than the expected runtime for the so-called $(\mu+\lambda)$ EA reported in a recent article, and again obtained for the stronger notion of stochastic runtime. |
Tasks | |
Published | 2016-12-21 |
URL | http://arxiv.org/abs/1612.06962v2, http://arxiv.org/pdf/1612.06962v2.pdf |
PWC | https://paperswithcode.com/paper/stochastic-runtime-analysis-of-a-cross |
Repo | |
Framework | |
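To make the terms in the abstract above concrete, here is a rough sketch of a cross-entropy style search with edge-based solution generation and iteration-best reinforcement. It is not the exact algorithm analyzed in the article: the smoothing rate `rho`, the lower bound `p_min` on edge weights (loosely in the spirit of Max-Min bounds), the sample counts, and all function names are assumptions made for illustration.

```python
import random


def sample_tour(prob, rng):
    """Edge-based generation: start at city 0 and pick each next city with
    probability proportional to the current weight of the connecting edge."""
    tour = [0]
    unvisited = set(range(1, len(prob)))
    while unvisited:
        cur = tour[-1]
        candidates = sorted(unvisited)
        weights = [prob[cur][j] for j in candidates]
        nxt = rng.choices(candidates, weights=weights)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def cross_entropy_tsp(dist, iterations=200, samples=20, rho=0.1, p_min=1e-3, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    prob = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
    best = None
    for _ in range(iterations):
        tours = [sample_tour(prob, rng) for _ in range(samples)]
        it_best = min(tours, key=lambda t: tour_length(t, dist))
        if best is None or tour_length(it_best, dist) < tour_length(best, dist):
            best = it_best
        # Iteration-best reinforcement: shift weight toward the edges of it_best.
        edges = {(it_best[i], it_best[(i + 1) % n]) for i in range(n)}
        edges |= {(b, a) for a, b in edges}
        for i in range(n):
            for j in range(n):
                if i != j:
                    target = 1.0 if (i, j) in edges else 0.0
                    updated = (1 - rho) * prob[i][j] + rho * target
                    prob[i][j] = max(updated, p_min)  # assumed floor keeps every edge reachable
    return best, tour_length(best, dist)
```

On a four-city instance whose distances are all 1 or 4, with a single cycle made of unit edges, the function returns some tour starting at city 0 together with its length; the sketch makes no claim about the runtime bounds proved in the article.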
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Title | Word Representation Models for Morphologically Rich Languages in Neural Machine Translation |
Authors | Ekaterina Vylomova, Trevor Cohn, Xuanli He, Gholamreza Haffari |
Abstract | Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural systems of translation, which discard the identity for rare words, in this paper we propose several architectures for learning word representations from character and morpheme level word decompositions. We incorporate these representations in a novel machine translation model which jointly learns word alignments and translations via a hard attention mechanism. Evaluating on translating from several morphologically rich languages into English, we show consistent improvements over strong baseline methods, of between 1 and 1.5 BLEU points. |
Tasks | Machine Translation |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04217v1, http://arxiv.org/pdf/1606.04217v1.pdf |
PWC | https://paperswithcode.com/paper/word-representation-models-for |
Repo | |
Framework | |
Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks
Title | Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks |
Authors | Peter Ondruska, Ingmar Posner |
Abstract | This paper presents to the best of our knowledge the first end-to-end object tracking approach which directly maps from raw sensor input to object tracks in sensor space without requiring any feature engineering or system identification in the form of plant or sensor models. Specifically, our system accepts a stream of raw sensor data at one end and, in real-time, produces an estimate of the entire environment state at the output including even occluded objects. We achieve this by framing the problem as a deep learning task and exploit sequence models in the form of recurrent neural networks to learn a mapping from sensor measurements to object tracks. In particular, we propose a learning method based on a form of input dropout which allows learning in an unsupervised manner, only based on raw, occluded sensor data without access to ground-truth annotations. We demonstrate our approach using a synthetic dataset designed to mimic the task of tracking objects in 2D laser data – as commonly encountered in robotics applications – and show that it learns to track many dynamic objects despite occlusions and the presence of sensor noise. |
Tasks | Feature Engineering, Object Tracking |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.00991v2, http://arxiv.org/pdf/1602.00991v2.pdf |
PWC | https://paperswithcode.com/paper/deep-tracking-seeing-beyond-seeing-using |
Repo | |
Framework | |
Rich Image Captioning in the Wild
Title | Rich Image Captioning in the Wild |
Authors | Kenneth Tran, Xiaodong He, Lei Zhang, Jian Sun, Cornelia Carapcea, Chris Thrasher, Chris Buehler, Chris Sienkiewicz |
Abstract | We present an image caption system that addresses new challenges of automatically describing images in the wild. The challenges include high caption quality with respect to human judgments, out-of-domain data handling, and the low latency required in many applications. Built on top of a state-of-the-art framework, we developed a deep vision model that detects a broad range of visual concepts, an entity recognition model that identifies celebrities and landmarks, and a confidence model for the caption output. Experimental results show that our caption engine significantly outperforms previous state-of-the-art systems on both an in-domain dataset (i.e., MS COCO) and out-of-domain datasets. |
Tasks | Image Captioning |
Published | 2016-03-30 |
URL | http://arxiv.org/abs/1603.09016v2, http://arxiv.org/pdf/1603.09016v2.pdf |
PWC | https://paperswithcode.com/paper/rich-image-captioning-in-the-wild |
Repo | |
Framework | |
Towards Anthropo-inspired Computational Systems: the $P^3$ Model
Title | Towards Anthropo-inspired Computational Systems: the $P^3$ Model |
Authors | Michael W. Bridges, Salvatore Distefano, Manuel Mazzara, Marat Minlebaev, Max Talanov, Jordi Vallverdú |
Abstract | This paper proposes a model whose aim is to provide a more coherent framework for agent design. We identify three closely related anthropo-centered domains working on separate functional levels. Abstracting from human physiology, psychology, and philosophy, we create the $P^3$ model to be used as a multi-tier approach to deal with complex classes of problems. The three layers identified in this model have been named PhysioComputing, MindComputing, and MetaComputing. Finally, several instantiations of this model are presented, related to different IT areas such as artificial intelligence, distributed computing, and software and service engineering. |
Tasks | |
Published | 2016-06-10 |
URL | http://arxiv.org/abs/1606.03229v1, http://arxiv.org/pdf/1606.03229v1.pdf |
PWC | https://paperswithcode.com/paper/towards-anthropo-inspired-computational |
Repo | |
Framework | |
Power Data Classification: A Hybrid of a Novel Local Time Warping and LSTM
Title | Power Data Classification: A Hybrid of a Novel Local Time Warping and LSTM |
Authors | Yuanlong Li, Han Hu, Yonggang Wen, Jun Zhang |
Abstract | In this paper, for the purpose of data centre energy consumption monitoring and analysis, we propose to detect the running programs in a server by classifying the observed power consumption series. The time series classification problem has been extensively studied, with various distance measurements developed; recently, deep learning based sequence models have also proved promising. In this paper, we propose a novel distance measurement and build a time series classification algorithm hybridizing nearest neighbour and long short term memory (LSTM) neural network. More specifically, we first propose a new distance measurement termed Local Time Warping (LTW), which utilizes a user-specified set for local warping and is designed to be non-commutative and non-dynamic programming. Second, we hybridize the 1NN-LTW and LSTM. In particular, we combine the prediction probability vectors of 1NN-LTW and LSTM to determine the label of the test cases. Finally, using the power consumption data from a real data center, we show that the proposed LTW can improve the classification accuracy of DTW from about 84% to 90%. Our experimental results show that the proposed LTW is competitive on our data set compared with existing DTW variants and that its non-commutative feature is indeed beneficial. We also test a linear version of LTW, and it can significantly outperform existing linear runtime lower bound methods like LB_Keogh. Furthermore, with the hybrid algorithm, for the power series classification task we achieve an accuracy up to about 93%. Our research can inspire more studies on time series distance measurement and the hybrid of the deep learning models with other traditional models. |
Tasks | Time Series, Time Series Classification |
Published | 2016-08-15 |
URL | http://arxiv.org/abs/1608.04171v4, http://arxiv.org/pdf/1608.04171v4.pdf |
PWC | https://paperswithcode.com/paper/power-data-classification-a-hybrid-of-a-novel |
Repo | |
Framework | |
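The abstract above compares LTW against a DTW baseline with 1-nearest-neighbour classification. The listing does not define LTW itself, so the sketch below shows only the standard dynamic-programming DTW distance and a 1NN classifier built on it, as a reference point for what the baseline computes. The function names and the absolute-difference local cost are choices made for this illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_neighbour_label(query, training):
    """1NN classification: return the label of the closest training series.
    `training` is a list of (series, label) pairs."""
    return min(training, key=lambda item: dtw_distance(query, item[0]))[1]
```

Note that this DTW is symmetric in its two arguments, whereas the abstract describes LTW as deliberately non-commutative.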
Compressive Spectral Clustering
Title | Compressive Spectral Clustering |
Authors | Nicolas Tremblay, Gilles Puy, Remi Gribonval, Pierre Vandergheynst |
Abstract | Spectral clustering has become a popular technique due to its high performance in many contexts. It comprises three main steps: create a similarity graph between N objects to cluster, compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object, and run k-means on these features to separate objects into k classes. Each of these three steps becomes computationally intensive for large N and/or k. We propose to speed up the last two steps based on recent results in the emerging field of graph signal processing: graph filtering of random signals, and random sampling of bandlimited graph signals. We prove that our method, with a gain in computation time that can reach several orders of magnitude, is in fact an approximation of spectral clustering, for which we are able to control the error. We test the performance of our method on artificial and real-world network data. |
Tasks | |
Published | 2016-02-05 |
URL | http://arxiv.org/abs/1602.02018v2, http://arxiv.org/pdf/1602.02018v2.pdf |
PWC | https://paperswithcode.com/paper/compressive-spectral-clustering |
Repo | |
Framework | |
Temporal-Needle: A view and appearance invariant video descriptor
Title | Temporal-Needle: A view and appearance invariant video descriptor |
Authors | Michal Yarom, Michal Irani |
Abstract | The ability to detect similar actions across videos can be very useful for real-world applications in many fields. However, this task is still challenging for existing systems, since videos that present the same action can be taken from significantly different viewing directions, performed by different actors against different backgrounds, and recorded under various video qualities. Video descriptors play a significant role in these systems. In this work we propose the “temporal-needle” descriptor, which captures the dynamic behavior while being invariant to viewpoint and appearance. The descriptor is computed using multiple temporal scales of the video and by computing self-similarity for every patch through time in every temporal scale. The descriptor is computed for every pixel in the video. However, to find similar actions across videos, we consider only a small subset of the descriptors: the statistically significant descriptors. This allows us to find good correspondences across videos more efficiently. Using the descriptor, we were able to detect the same behavior across videos in a variety of scenarios. We demonstrate the use of the descriptor in tasks such as temporal and spatial alignment and action detection, and even show its potential in unsupervised video clustering into categories. In this work we handled only videos taken with stationary cameras, but the descriptor can be extended to handle moving cameras as well. |
Tasks | Action Detection |
Published | 2016-12-14 |
URL | http://arxiv.org/abs/1612.04854v1, http://arxiv.org/pdf/1612.04854v1.pdf |
PWC | https://paperswithcode.com/paper/temporal-needle-a-view-and-appearance |
Repo | |
Framework | |
Polyp Detection and Segmentation from Video Capsule Endoscopy: A Review
Title | Polyp Detection and Segmentation from Video Capsule Endoscopy: A Review |
Authors | V. B. Surya Prasath |
Abstract | Video capsule endoscopy (VCE) is used widely nowadays for visualizing the gastrointestinal (GI) tract. Capsule endoscopy exams are usually prescribed as an additional monitoring mechanism and can help in identifying polyps, bleeding, etc. To analyze the large-scale video data produced by VCE exams, automatic image processing, computer vision, and learning algorithms are required. Recently, automatic polyp detection algorithms have been proposed with various degrees of success. Though polyp detection in images from colonoscopy and other traditional endoscopy procedures is becoming a mature field, detecting polyps automatically in VCE is a hard problem due to its unique imaging characteristics. We review different polyp detection approaches for VCE imagery and provide a systematic analysis of the challenges faced by standard image processing and computer vision methods. |
Tasks | |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01915v1, http://arxiv.org/pdf/1609.01915v1.pdf |
PWC | https://paperswithcode.com/paper/polyp-detection-and-segmentation-from-video |
Repo | |
Framework | |
Using CMA-ES for tuning coupled PID controllers within models of combustion engines
Title | Using CMA-ES for tuning coupled PID controllers within models of combustion engines |
Authors | Katerina Henclova |
Abstract | Proportional integral derivative (PID) controllers are important and widely used tools in system control. Tuning the controller gains is a laborious task, especially for complex systems such as combustion engines. To minimize the time an engineer spends tuning the gains in simulation software, we propose to formulate a part of the problem as a black-box optimization task. In this paper, we summarize the properties and practical limitations of tuning the gains in this particular application. We investigate the latest methods of black-box optimization and conclude that the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) with bi-population restart strategy, elitist parent selection and active covariance matrix adaptation is best suited for this task. Details of the algorithm’s experiment-based calibration are explained, as well as the derivation of a suitable objective function. The method’s performance is compared with that of PSO and SHADE. Finally, its usability is verified on six models of real engines. |
Tasks | Calibration |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06741v4, http://arxiv.org/pdf/1609.06741v4.pdf |
PWC | https://paperswithcode.com/paper/using-cma-es-for-tuning-coupled-pid |
Repo | |
Framework | |
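The abstract above treats PID gain tuning as a black-box optimization task. As a small illustration of what such a black box can look like, the sketch below implements a textbook discrete PID controller and a toy objective that scores a gain triple by integrated absolute tracking error. The first-order plant `x' = -x + u`, the step sizes, and the names `PID` and `tracking_cost` are all hypothetical; the paper's engine models and its derived objective function are not reproduced here.

```python
class PID:
    """Textbook discrete PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def tracking_cost(gains, setpoint=1.0, dt=0.05, steps=200):
    """Toy black-box objective (assumed): integrated absolute error while the
    controller drives the plant x' = -x + u toward the setpoint."""
    pid = PID(*gains)
    x, cost = 0.0, 0.0
    for _ in range(steps):
        error = setpoint - x
        u = pid.step(error, dt)
        x += dt * (-x + u)  # explicit Euler step of the toy plant
        cost += abs(error) * dt
    return cost
```

An optimizer such as CMA-ES, PSO, or SHADE would only ever call something like `tracking_cost(gains)` and compare the returned values, which is what makes the formulation black-box.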
MuFuRU: The Multi-Function Recurrent Unit
Title | MuFuRU: The Multi-Function Recurrent Unit |
Authors | Dirk Weissenborn, Tim Rocktäschel |
Abstract | Recurrent neural networks such as the GRU and LSTM found wide adoption in natural language processing and achieve state-of-the-art results for many tasks. These models are characterized by a memory state that can be written to and read from by applying gated composition operations to the current input and the previous state. However, they only cover a small subset of potentially useful compositions. We propose Multi-Function Recurrent Units (MuFuRUs) that allow for arbitrary differentiable functions as composition operations. Furthermore, MuFuRUs allow for an input- and state-dependent choice of these composition operations that is learned. Our experiments demonstrate that the additional functionality helps in different sequence modeling tasks, including the evaluation of propositional logic formulae, language modeling and sentiment analysis. |
Tasks | Language Modelling, Sentiment Analysis |
Published | 2016-06-09 |
URL | http://arxiv.org/abs/1606.03002v1, http://arxiv.org/pdf/1606.03002v1.pdf |
PWC | https://paperswithcode.com/paper/mufuru-the-multi-function-recurrent-unit |
Repo | |
Framework | |
Undecidability of the Lambek calculus with a relevant modality
Title | Undecidability of the Lambek calculus with a relevant modality |
Authors | Max Kanovich, Stepan Kuznetsov, Andre Scedrov |
Abstract | Morrill and Valentin in the paper “Computational coverage of TLG: Nonlinearity” considered an extension of the Lambek calculus enriched by a so-called “exponential” modality. This modality behaves in the “relevant” style, that is, it allows contraction and permutation, but not weakening. Morrill and Valentin stated an open problem whether this system is decidable. Here we show its undecidability. Our result remains valid if we consider the fragment where all division operations have one direction. We also show that the derivability problem in a restricted case, where the modality can be applied only to variables (primitive types), is decidable and belongs to the NP class. |
Tasks | |
Published | 2016-01-23 |
URL | http://arxiv.org/abs/1601.06303v4, http://arxiv.org/pdf/1601.06303v4.pdf |
PWC | https://paperswithcode.com/paper/undecidability-of-the-lambek-calculus-with-a |
Repo | |
Framework | |
NPCs as People, Too: The Extreme AI Personality Engine
Title | NPCs as People, Too: The Extreme AI Personality Engine |
Authors | Jeffrey Georgeson, Christopher Child |
Abstract | PK Dick once asked “Do Androids Dream of Electric Sheep?” In video games, a similar question could be asked of non-player characters: Do NPCs have dreams? Can they live and change as humans do? Can NPCs have personalities, and can these develop through interactions with players, other NPCs, and the world around them? Despite advances in personality AI for games, most NPCs are still undeveloped and undeveloping, reacting with flat affect and predictable routines that make them far less than human–in fact, they become little more than bits of the scenery that give out parcels of information. This need not be the case. Extreme AI, a psychology-based personality engine, creates adaptive NPC personalities. Originally developed as part of the thesis “NPCs as People: Using Databases and Behaviour Trees to Give Non-Player Characters Personality,” Extreme AI is now a fully functioning personality engine using all thirty facets of the Five Factor model of personality and an AI system that is live throughout gameplay. This paper discusses the research leading to Extreme AI; develops the ideas found in that thesis; discusses the development of other personality engines; and provides examples of Extreme AI’s use in two game demos. |
Tasks | |
Published | 2016-09-15 |
URL | http://arxiv.org/abs/1609.04879v1, http://arxiv.org/pdf/1609.04879v1.pdf |
PWC | https://paperswithcode.com/paper/npcs-as-people-too-the-extreme-ai-personality |
Repo | |
Framework | |
Skipping Word: A Character-Sequential Representation based Framework for Question Answering
Title | Skipping Word: A Character-Sequential Representation based Framework for Question Answering |
Authors | Lingxun Meng, Yan Li, Mengyi Liu, Peng Shu |
Abstract | Recent works using artificial neural networks based on distributed word representations have greatly boosted the performance of various natural language learning tasks, especially question answering. However, they also bring some attendant problems, such as corpus selection for embedding learning, dictionary transformation for different learning tasks, etc. In this paper, we propose to model sentences directly as character sequences, and then utilize convolutional neural networks to integrate character embedding learning with point-wise answer selection training. Compared with deep models pre-trained with a word embedding (WE) strategy, our character-sequential representation (CSR) based method has a much simpler procedure and more stable performance across different benchmarks. Extensive experiments on two benchmark answer selection datasets show competitive performance compared with state-of-the-art methods. |
Tasks | Answer Selection, Question Answering |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00565v1, http://arxiv.org/pdf/1609.00565v1.pdf |
PWC | https://paperswithcode.com/paper/skipping-word-a-character-sequential |
Repo | |
Framework | |