May 6, 2019

2953 words 14 mins read

Paper Group ANR 421

Video Key Frame Extraction using Entropy value as Global and Local Feature. Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems. Word Representation Models for Morphologically Rich Languages in Neural Machine Translation. Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks. Rich Image Captioning …

Video Key Frame Extraction using Entropy value as Global and Local Feature

Title Video Key Frame Extraction using Entropy value as Global and Local Feature
Authors Siddu P Algur, Vivek R
Abstract Key frames play an important role in video annotation. Key-frame extraction is one of the widely used methods for video abstraction, as it helps in processing a large set of video data with sufficient content representation in a faster way. In this paper, a novel approach for key-frame extraction using entropy values is proposed. The proposed approach classifies frames based on entropy values as a global feature and selects a frame from each class as a representative key-frame. It also eliminates redundant frames from the selected key-frames using entropy values as a local feature. An evaluation of the approach on several video clips is presented. Results show that the algorithm is successful in helping annotators automatically identify video key-frames.
Tasks
Published 2016-05-28
URL http://arxiv.org/abs/1605.08857v1
PDF http://arxiv.org/pdf/1605.08857v1.pdf
PWC https://paperswithcode.com/paper/video-key-frame-extraction-using-entropy
Repo
Framework
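
As a rough illustration of the approach described above (not the authors' code), the sketch below computes grayscale-histogram entropy as a global feature, bins frames by entropy, keeps one representative per bin, and drops representatives with near-identical entropy as a stand-in for the local redundancy check; the binning scheme, threshold, and function names are all assumptions.

```python
import numpy as np

def frame_entropy(gray_frame, bins=256):
    """Shannon entropy of a grayscale frame's intensity histogram."""
    hist, _ = np.histogram(gray_frame, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def extract_key_frames(frames, n_classes=10, redundancy_eps=0.05):
    """Bin frames by global entropy, keep one representative per bin,
    then drop representatives whose entropies are nearly identical."""
    entropies = np.array([frame_entropy(f) for f in frames])
    edges = np.linspace(entropies.min(), entropies.max(), n_classes + 1)
    reps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        members = np.where((entropies >= lo) & (entropies <= hi))[0]
        if members.size:
            # representative: the member closest to the bin's mean entropy
            centre = entropies[members].mean()
            reps.append(int(members[np.argmin(np.abs(entropies[members] - centre))]))
    kept = []
    for i in sorted(set(reps)):
        # redundancy elimination: skip frames whose entropy barely differs
        if not kept or abs(entropies[i] - entropies[kept[-1]]) > redundancy_eps:
            kept.append(i)
    return kept
```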

Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems

Title Stochastic Runtime Analysis of a Cross Entropy Algorithm for Traveling Salesman Problems
Authors Zijun Wu, Rolf Moehring, Jianhui Lai
Abstract This article analyzes the stochastic runtime of a Cross-Entropy Algorithm on two classes of traveling salesman problems. The algorithm shares main features of the famous Max-Min Ant System with iteration-best reinforcement. For simple instances that have a $\{1,n\}$-valued distance function and a unique optimal solution, we prove a stochastic runtime of $O(n^{6+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3+\epsilon}\ln n)$ with the edge-based random solution generation for an arbitrary $\epsilon\in (0,1)$. These runtimes are very close to the known expected runtime for variants of Max-Min Ant System with best-so-far reinforcement. They are obtained for the stronger notion of stochastic runtime, which means that an optimal solution is obtained in that time with an overwhelming probability, i.e., a probability tending exponentially fast to one with growing problem size. We also inspect more complex instances with $n$ vertices positioned on an $m\times m$ grid. When the $n$ vertices span a convex polygon, we obtain a stochastic runtime of $O(n^{3}m^{5+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{2}m^{5+\epsilon})$ for the edge-based random solution generation. When there are $k = O(1)$ many vertices inside a convex polygon spanned by the other $n-k$ vertices, we obtain a stochastic runtime of $O(n^{4}m^{5+\epsilon}+n^{6k-1}m^{\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3}m^{5+\epsilon}+n^{3k}m^{\epsilon})$ with the edge-based random solution generation. These runtimes are better than the expected runtime for the so-called $(\mu+\lambda)$ EA reported in a recent article, and again obtained for the stronger notion of stochastic runtime.
Tasks
Published 2016-12-21
URL http://arxiv.org/abs/1612.06962v2
PDF http://arxiv.org/pdf/1612.06962v2.pdf
PWC https://paperswithcode.com/paper/stochastic-runtime-analysis-of-a-cross
Repo
Framework
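
For intuition only, here is a toy sketch of an MMAS-style sampling loop with edge-based random solution generation and iteration-best reinforcement, which is one reading of the algorithm family analysed above; the smoothing rate, the weight bounds, and the population size are assumptions, and the sketch makes no claim about the proven runtimes.

```python
import numpy as np

def tour_length(tour, dist):
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def ce_tsp(dist, iters=200, ants=20, rho=0.1, seed=0):
    """Cross-entropy / MMAS-style TSP heuristic with iteration-best updates."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    tau = np.full((n, n), 1.0 / n)                    # edge sampling weights
    tau_min, tau_max = 1.0 / n**2, 1.0 - 1.0 / n      # assumed bounds
    best_tour, best_len = None, np.inf
    for _ in range(iters):
        samples = []
        for _ in range(ants):                         # edge-based solution generation
            tour, visited = [0], {0}
            for _ in range(n - 1):
                cur = tour[-1]
                w = np.array([tau[cur, j] if j not in visited else 0.0 for j in range(n)])
                nxt = int(rng.choice(n, p=w / w.sum()))
                tour.append(nxt)
                visited.add(nxt)
            samples.append((tour_length(tour, dist), tour))
        it_len, it_tour = min(samples, key=lambda s: s[0])   # iteration-best tour
        if it_len < best_len:
            best_len, best_tour = it_len, it_tour
        reinforce = np.zeros_like(tau)
        for i in range(n):
            a, b = it_tour[i], it_tour[(i + 1) % n]
            reinforce[a, b] = reinforce[b, a] = 1.0
        tau = np.clip((1 - rho) * tau + rho * reinforce, tau_min, tau_max)
    return best_tour, best_len
```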

Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

Title Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Authors Ekaterina Vylomova, Trevor Cohn, Xuanli He, Gholamreza Haffari
Abstract Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural translation systems, which discard the identity of rare words, in this paper we propose several architectures for learning word representations from character- and morpheme-level word decompositions. We incorporate these representations into a novel machine translation model which jointly learns word alignments and translations via a hard attention mechanism. Evaluating on translation from several morphologically rich languages into English, we show consistent improvements over strong baseline methods, of between 1 and 1.5 BLEU points.
Tasks Machine Translation
Published 2016-06-14
URL http://arxiv.org/abs/1606.04217v1
PDF http://arxiv.org/pdf/1606.04217v1.pdf
PWC https://paperswithcode.com/paper/word-representation-models-for
Repo
Framework
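
A minimal sketch, assuming PyTorch, of one plausible building block: a bidirectional LSTM that composes character embeddings into a word vector for the NMT encoder; swapping character IDs for morpheme IDs would give the morpheme-level variant. Dimensions, names, and the pooling choice are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Compose character embeddings into a fixed-size word representation."""
    def __init__(self, n_chars, char_dim=32, word_dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.lstm = nn.LSTM(char_dim, word_dim // 2, batch_first=True,
                            bidirectional=True)

    def forward(self, char_ids):
        # char_ids: (batch, max_word_len) character indices, 0 = padding
        # (for simplicity this sketch assumes words are not right-padded)
        h, _ = self.lstm(self.char_emb(char_ids))
        half = h.size(-1) // 2
        # concatenate last forward state and first backward state
        return torch.cat([h[:, -1, :half], h[:, 0, half:]], dim=-1)

# usage: word vectors for a batch of 4 words of up to 12 characters
encoder = CharWordEncoder(n_chars=100)
word_vectors = encoder(torch.randint(1, 100, (4, 12)))   # shape (4, 128)
```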

Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks

Title Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks
Authors Peter Ondruska, Ingmar Posner
Abstract This paper presents, to the best of our knowledge, the first end-to-end object tracking approach which directly maps from raw sensor input to object tracks in sensor space without requiring any feature engineering or system identification in the form of plant or sensor models. Specifically, our system accepts a stream of raw sensor data at one end and, in real time, produces an estimate of the entire environment state at the output, including even occluded objects. We achieve this by framing the problem as a deep learning task and exploit sequence models in the form of recurrent neural networks to learn a mapping from sensor measurements to object tracks. In particular, we propose a learning method based on a form of input dropout which allows learning in an unsupervised manner, based only on raw, occluded sensor data without access to ground-truth annotations. We demonstrate our approach using a synthetic dataset designed to mimic the task of tracking objects in 2D laser data – as commonly encountered in robotics applications – and show that it learns to track many dynamic objects despite occlusions and the presence of sensor noise.
Tasks Feature Engineering, Object Tracking
Published 2016-02-02
URL http://arxiv.org/abs/1602.00991v2
PDF http://arxiv.org/pdf/1602.00991v2.pdf
PWC https://paperswithcode.com/paper/deep-tracking-seeing-beyond-seeing-using
Repo
Framework
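
The sketch below, assuming PyTorch, shows one way to read the training idea: a GRU over flattened occupancy grids predicts per-cell occupancy, whole input frames are randomly blanked, and the loss is weighted by a visibility mask so that only cells the sensor actually observed are penalized, which avoids any ground-truth tracks. The shapes, the dropout scheme, and the loss weighting are assumptions.

```python
import torch
import torch.nn as nn

class DeepTracker(nn.Module):
    """Map a sequence of partially observed occupancy grids to full-grid estimates."""
    def __init__(self, grid_cells=32 * 32, hidden=256):
        super().__init__()
        self.gru = nn.GRU(grid_cells, hidden, batch_first=True)
        self.out = nn.Linear(hidden, grid_cells)

    def forward(self, obs):                      # obs: (batch, time, cells) in [0, 1]
        h, _ = self.gru(obs)
        return torch.sigmoid(self.out(h))        # occupancy probability per cell

def unsupervised_step(model, obs, visible_mask, drop_p=0.5):
    """Blank random input frames; penalize errors only where the sensor saw the scene."""
    keep = (torch.rand(obs.shape[:2], device=obs.device) > drop_p).float().unsqueeze(-1)
    pred = model(obs * keep)
    return nn.functional.binary_cross_entropy(pred, obs, weight=visible_mask)
```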

Rich Image Captioning in the Wild

Title Rich Image Captioning in the Wild
Authors Kenneth Tran, Xiaodong He, Lei Zhang, Jian Sun, Cornelia Carapcea, Chris Thrasher, Chris Buehler, Chris Sienkiewicz
Abstract We present an image caption system that addresses new challenges of automatically describing images in the wild. The challenges include producing captions of high quality with respect to human judgments, handling out-of-domain data, and meeting the low latency required in many applications. Built on top of a state-of-the-art framework, we developed a deep vision model that detects a broad range of visual concepts, an entity recognition model that identifies celebrities and landmarks, and a confidence model for the caption output. Experimental results show that our caption engine outperforms previous state-of-the-art systems significantly on both the in-domain dataset (i.e., MS COCO) and out-of-domain datasets.
Tasks Image Captioning
Published 2016-03-30
URL http://arxiv.org/abs/1603.09016v2
PDF http://arxiv.org/pdf/1603.09016v2.pdf
PWC https://paperswithcode.com/paper/rich-image-captioning-in-the-wild
Repo
Framework
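
As a hedged illustration of how the confidence model might be used downstream (the paper does not spell out this logic), the snippet below keeps the highest-confidence candidate caption and backs off to a template over detected concepts when confidence is low; the threshold and the template are invented.

```python
def pick_caption(candidates, detected_tags, threshold=0.5):
    """candidates: list of (caption, confidence) pairs from the caption engine."""
    caption, conf = max(candidates, key=lambda c: c[1])
    if conf >= threshold:
        return caption
    # low confidence: fall back to a safer template over high-precision detections
    return "An image of " + ", ".join(detected_tags[:3]) if detected_tags else "An image."

print(pick_caption([("a dog playing in the park", 0.8)], ["dog", "grass"]))
```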

Towards Anthropo-inspired Computational Systems: the $P^3$ Model

Title Towards Anthropo-inspired Computational Systems: the $P^3$ Model
Authors Michael W. Bridges, Salvatore Distefano, Manuel Mazzara, Marat Minlebaev, Max Talanov, Jordi Vallverdú
Abstract This paper proposes a model whose aim is to provide a more coherent framework for agent design. We identify three closely related anthropo-centered domains working on separate functional levels. Abstracting from human physiology, psychology, and philosophy, we create the $P^3$ model to be used as a multi-tier approach to deal with a complex class of problems. The three layers identified in this model have been named PhysioComputing, MindComputing, and MetaComputing. Several instantiations of this model are finally presented, related to different IT areas such as artificial intelligence, distributed computing, and software and service engineering.
Tasks
Published 2016-06-10
URL http://arxiv.org/abs/1606.03229v1
PDF http://arxiv.org/pdf/1606.03229v1.pdf
PWC https://paperswithcode.com/paper/towards-anthropo-inspired-computational
Repo
Framework

Power Data Classification: A Hybrid of a Novel Local Time Warping and LSTM

Title Power Data Classification: A Hybrid of a Novel Local Time Warping and LSTM
Authors Yuanlong Li, Han Hu, Yonggang Wen, Jun Zhang
Abstract In this paper, for the purpose of data centre energy consumption monitoring and analysis, we propose to detect the running programs in a server by classifying the observed power consumption series. The time series classification problem has been extensively studied, with various distance measures developed; recently, deep learning based sequence models have also proved promising. In this paper, we propose a novel distance measure and build a time series classification algorithm hybridizing nearest-neighbour classification and a long short-term memory (LSTM) neural network. More specifically, we first propose a new distance measure termed Local Time Warping (LTW), which utilizes a user-specified set for local warping and is designed to be non-commutative and free of dynamic programming. Second, we hybridize 1NN-LTW and LSTM by combining their prediction probability vectors to determine the label of the test cases. Finally, using power consumption data from a real data center, we show that the proposed LTW improves the classification accuracy of DTW from about 84% to 90%. Our experimental results show that the proposed LTW is competitive on our data set compared with existing DTW variants and that its non-commutative feature is indeed beneficial. We also test a linear version of LTW, which significantly outperforms existing linear-runtime lower-bound methods such as LB_Keogh. Furthermore, with the hybrid algorithm, we achieve an accuracy of up to about 93% on the power series classification task. Our research can inspire more studies on time series distance measures and on hybrids of deep learning models with traditional models.
Tasks Time Series, Time Series Classification
Published 2016-08-15
URL http://arxiv.org/abs/1608.04171v4
PDF http://arxiv.org/pdf/1608.04171v4.pdf
PWC https://paperswithcode.com/paper/power-data-classification-a-hybrid-of-a-novel
Repo
Framework
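
The following sketch is only one reading of the abstract, not the authors' definition of LTW: for each point of one series it searches a small, user-specified set of offsets in the other series and keeps the closest match, so no dynamic programming is needed and the measure is not symmetric in its arguments; the hybrid step averages a one-hot 1NN vote with an LSTM's class probabilities. All names and parameters are assumptions.

```python
import numpy as np

def ltw_distance(a, b, offsets=(-2, -1, 0, 1, 2)):
    """Asymmetric, DP-free warping distance over a user-specified offset set."""
    m, total = len(b), 0.0
    for i, x in enumerate(a):
        js = [i + o for o in offsets if 0 <= i + o < m]
        if not js:                                # a longer than b: clamp to last point
            js = [m - 1]
        total += min((x - b[j]) ** 2 for j in js)
    return float(np.sqrt(total))

def hybrid_label(test_series, train_set, train_labels, lstm_probs, alpha=0.5):
    """Average a one-hot 1NN-LTW vote with an LSTM's class-probability vector."""
    d = [ltw_distance(test_series, t) for t in train_set]
    onehot = np.zeros_like(lstm_probs)
    onehot[train_labels[int(np.argmin(d))]] = 1.0
    return int(np.argmax(alpha * onehot + (1 - alpha) * lstm_probs))
```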

Compressive Spectral Clustering

Title Compressive Spectral Clustering
Authors Nicolas Tremblay, Gilles Puy, Remi Gribonval, Pierre Vandergheynst
Abstract Spectral clustering has become a popular technique due to its high performance in many contexts. It comprises three main steps: create a similarity graph between N objects to cluster, compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object, and run k-means on these features to separate objects into k classes. Each of these three steps becomes computationally intensive for large N and/or k. We propose to speed up the last two steps based on recent results in the emerging field of graph signal processing: graph filtering of random signals, and random sampling of bandlimited graph signals. We prove that our method, with a gain in computation time that can reach several orders of magnitude, is in fact an approximation of spectral clustering, for which we are able to control the error. We test the performance of our method on artificial and real-world network data.
Tasks
Published 2016-02-05
URL http://arxiv.org/abs/1602.02018v2
PDF http://arxiv.org/pdf/1602.02018v2.pdf
PWC https://paperswithcode.com/paper/compressive-spectral-clustering
Repo
Framework
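
For context, here is a minimal sketch of the baseline three-step pipeline described above (similarity graph, first k Laplacian eigenvectors, k-means on the rows), assuming NumPy, SciPy, and scikit-learn; the compressive method itself replaces the last two steps with graph filtering of random signals and random sampling, which is not shown here.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def spectral_clustering(X, k, gamma=1.0):
    W = rbf_kernel(X, gamma=gamma)                       # step 1: similarity graph
    np.fill_diagonal(W, 0.0)
    L = laplacian(W, normed=True)                        # normalized graph Laplacian
    _, U = eigh(L, subset_by_index=[0, k - 1])           # step 2: first k eigenvectors
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)  # step 3: k-means on the rows
```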

Temporal-Needle: A view and appearance invariant video descriptor

Title Temporal-Needle: A view and appearance invariant video descriptor
Authors Michal Yarom, Michal Irani
Abstract The ability to detect similar actions across videos can be very useful for real-world applications in many fields. However, this task is still challenging for existing systems, since videos presenting the same action can be taken from significantly different viewing directions, performed by different actors against different backgrounds, and under various video qualities. Video descriptors play a significant role in these systems. In this work we propose the “temporal-needle” descriptor, which captures the dynamic behavior while being invariant to viewpoint and appearance. The descriptor is computed at multiple temporal scales of the video by measuring self-similarity for every patch through time at each scale. The descriptor is computed for every pixel in the video. However, to find similar actions across videos, we consider only a small subset of the descriptors - the statistically significant ones. This allows us to find good correspondences across videos more efficiently. Using the descriptor, we were able to detect the same behavior across videos in a variety of scenarios. We demonstrate the use of the descriptor in tasks such as temporal and spatial alignment and action detection, and even show its potential for unsupervised clustering of videos into categories. In this work we handled only videos taken with stationary cameras, but the descriptor can be extended to handle moving cameras as well.
Tasks Action Detection
Published 2016-12-14
URL http://arxiv.org/abs/1612.04854v1
PDF http://arxiv.org/pdf/1612.04854v1.pdf
PWC https://paperswithcode.com/paper/temporal-needle-a-view-and-appearance
Repo
Framework
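
The sketch below is an assumption about the descriptor's general shape rather than its exact definition: for a single pixel it stacks patch self-similarities through time at several temporal scales; the real descriptor is computed densely for every pixel, and only the statistically significant descriptors are kept for matching.

```python
import numpy as np

def temporal_needle(video, y, x, t, patch=3, scales=(1, 2, 4), depth=4):
    """video: (T, H, W) grayscale array; (y, x) an interior pixel, t a frame index.
    Returns a vector of patch self-similarities through time at several scales."""
    r = patch // 2
    ref = video[t, y - r:y + r + 1, x - r:x + r + 1].astype(float)
    desc = []
    for s in scales:                              # temporal scale = frame step
        for k in range(1, depth + 1):
            tt = min(t + k * s, video.shape[0] - 1)
            cur = video[tt, y - r:y + r + 1, x - r:x + r + 1].astype(float)
            desc.append(-np.sum((ref - cur) ** 2))   # self-similarity as negative SSD
    return np.array(desc)
```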

Polyp Detection and Segmentation from Video Capsule Endoscopy: A Review

Title Polyp Detection and Segmentation from Video Capsule Endoscopy: A Review
Authors V. B. Surya Prasath
Abstract Video capsule endoscopy (VCE) is widely used nowadays for visualizing the gastrointestinal (GI) tract. Capsule endoscopy exams are usually prescribed as an additional monitoring mechanism and can help in identifying polyps, bleeding, etc. To analyze the large-scale video data produced by VCE exams, automatic image processing, computer vision, and learning algorithms are required. Recently, automatic polyp detection algorithms have been proposed with various degrees of success. Though polyp detection in colonoscopy and other traditional endoscopy procedures is becoming a mature field, detecting polyps automatically in VCE is a hard problem due to its unique imaging characteristics. We review different polyp detection approaches for VCE imagery and provide a systematic analysis of the challenges faced by standard image processing and computer vision methods.
Tasks
Published 2016-09-07
URL http://arxiv.org/abs/1609.01915v1
PDF http://arxiv.org/pdf/1609.01915v1.pdf
PWC https://paperswithcode.com/paper/polyp-detection-and-segmentation-from-video
Repo
Framework

Using CMA-ES for tuning coupled PID controllers within models of combustion engines

Title Using CMA-ES for tuning coupled PID controllers within models of combustion engines
Authors Katerina Henclova
Abstract Proportional-integral-derivative (PID) controllers are important and widely used tools in system control. Tuning the controller gains is a laborious task, especially for complex systems such as combustion engines. To minimize the time an engineer spends tuning the gains in simulation software, we propose to formulate part of the problem as a black-box optimization task. In this paper, we summarize the properties and practical limitations of tuning the gains in this particular application. We investigate the latest methods of black-box optimization and conclude that the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) with bi-population restart strategy, elitist parent selection, and active covariance matrix adaptation is best suited for this task. Details of the algorithm’s experiment-based calibration are explained, as well as the derivation of a suitable objective function. The method’s performance is compared with that of PSO and SHADE. Finally, its usability is verified on six models of real engines.
Tasks Calibration
Published 2016-09-21
URL http://arxiv.org/abs/1609.06741v4
PDF http://arxiv.org/pdf/1609.06741v4.pdf
PWC https://paperswithcode.com/paper/using-cma-es-for-tuning-coupled-pid
Repo
Framework
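
A minimal sketch of the black-box formulation, assuming the pycma package and a toy first-order plant rather than the coupled engine models: the objective simulates one PID loop and returns the integrated absolute tracking error, and CMA-ES with BIPOP-style restarts minimizes it. The plant dynamics, initial gains, and settings here are invented for illustration.

```python
import numpy as np
import cma   # pycma package, assumed installed

def closed_loop_cost(gains, dt=0.01, horizon=10.0, setpoint=1.0):
    """Integrated absolute tracking error of a PID loop on the toy plant dy/dt = -y + u."""
    kp, ki, kd = np.abs(gains)                # keep gains non-negative
    y, integ, prev_err, cost = 0.0, 0.0, setpoint, 0.0
    for _ in range(int(horizon / dt)):
        err = setpoint - y
        integ += err * dt
        u = kp * err + ki * integ + kd * (err - prev_err) / dt
        prev_err = err
        y += dt * (-y + u)                    # forward-Euler step of the plant
        if not np.isfinite(y) or abs(y) > 1e6:
            return 1e6                        # penalize unstable gain sets
        cost += abs(err) * dt
    return cost

# black-box optimization of (kp, ki, kd) with BIPOP-style restarts
best_gains, es = cma.fmin2(closed_loop_cost, x0=[1.0, 0.1, 0.01], sigma0=0.5,
                           options={'verbose': -9}, restarts=2, bipop=True)
```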

MuFuRU: The Multi-Function Recurrent Unit

Title MuFuRU: The Multi-Function Recurrent Unit
Authors Dirk Weissenborn, Tim Rocktäschel
Abstract Recurrent neural networks such as the GRU and LSTM found wide adoption in natural language processing and achieve state-of-the-art results for many tasks. These models are characterized by a memory state that can be written to and read from by applying gated composition operations to the current input and the previous state. However, they only cover a small subset of potentially useful compositions. We propose Multi-Function Recurrent Units (MuFuRUs) that allow for arbitrary differentiable functions as composition operations. Furthermore, MuFuRUs allow for an input- and state-dependent choice of these composition operations that is learned. Our experiments demonstrate that the additional functionality helps in different sequence modeling tasks, including the evaluation of propositional logic formulae, language modeling and sentiment analysis.
Tasks Language Modelling, Sentiment Analysis
Published 2016-06-09
URL http://arxiv.org/abs/1606.03002v1
PDF http://arxiv.org/pdf/1606.03002v1.pdf
PWC https://paperswithcode.com/paper/mufuru-the-multi-function-recurrent-unit
Repo
Framework
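
A reduced sketch of the idea, assuming PyTorch and not reproducing the exact cell from the paper: the unit computes several composition operations of the previous state and a candidate (keep, replace, max, min, mean), then mixes them with input- and state-dependent softmax weights learned per hidden unit.

```python
import torch
import torch.nn as nn

class MuFuRUCell(nn.Module):
    OPS = ("keep", "replace", "max", "min", "mean")   # composition operations used below

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cand = nn.Linear(input_size + hidden_size, hidden_size)
        self.op_weights = nn.Linear(input_size + hidden_size,
                                    hidden_size * len(self.OPS))

    def forward(self, x, h):
        xh = torch.cat([x, h], dim=-1)
        v = torch.tanh(self.cand(xh))                             # candidate state
        ops = torch.stack([h, v, torch.max(h, v), torch.min(h, v), (h + v) / 2], dim=-1)
        w = torch.softmax(self.op_weights(xh).view(*h.shape, len(self.OPS)), dim=-1)
        return (w * ops).sum(dim=-1)                              # new hidden state
```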

Undecidability of the Lambek calculus with a relevant modality

Title Undecidability of the Lambek calculus with a relevant modality
Authors Max Kanovich, Stepan Kuznetsov, Andre Scedrov
Abstract Morrill and Valentin, in the paper “Computational coverage of TLG: Nonlinearity”, considered an extension of the Lambek calculus enriched by a so-called “exponential” modality. This modality behaves in the “relevant” style, that is, it allows contraction and permutation, but not weakening. Morrill and Valentin stated as an open problem whether this system is decidable. Here we show its undecidability. Our result remains valid if we consider the fragment where all division operations have one direction. We also show that the derivability problem in a restricted case, where the modality can be applied only to variables (primitive types), is decidable and belongs to the class NP.
Tasks
Published 2016-01-23
URL http://arxiv.org/abs/1601.06303v4
PDF http://arxiv.org/pdf/1601.06303v4.pdf
PWC https://paperswithcode.com/paper/undecidability-of-the-lambek-calculus-with-a
Repo
Framework

NPCs as People, Too: The Extreme AI Personality Engine

Title NPCs as People, Too: The Extreme AI Personality Engine
Authors Jeffrey Georgeson, Christopher Child
Abstract P. K. Dick once asked “Do Androids Dream of Electric Sheep?” In video games, a similar question could be asked of non-player characters: Do NPCs have dreams? Can they live and change as humans do? Can NPCs have personalities, and can these develop through interactions with players, other NPCs, and the world around them? Despite advances in personality AI for games, most NPCs are still undeveloped and undeveloping, reacting with flat affect and predictable routines that make them far less than human; in fact, they become little more than bits of the scenery that give out parcels of information. This need not be the case. Extreme AI, a psychology-based personality engine, creates adaptive NPC personalities. Originally developed as part of the thesis “NPCs as People: Using Databases and Behaviour Trees to Give Non-Player Characters Personality,” Extreme AI is now a fully functioning personality engine that uses all thirty facets of the Five Factor model of personality and an AI system that is live throughout gameplay. This paper discusses the research leading to Extreme AI; develops the ideas found in that thesis; discusses the development of other personality engines; and provides examples of Extreme AI’s use in two game demos.
Tasks
Published 2016-09-15
URL http://arxiv.org/abs/1609.04879v1
PDF http://arxiv.org/pdf/1609.04879v1.pdf
PWC https://paperswithcode.com/paper/npcs-as-people-too-the-extreme-ai-personality
Repo
Framework

Skipping Word: A Character-Sequential Representation based Framework for Question Answering

Title Skipping Word: A Character-Sequential Representation based Framework for Question Answering
Authors Lingxun Meng, Yan Li, Mengyi Liu, Peng Shu
Abstract Recent works using artificial neural networks based on distributed word representations have greatly boosted the performance of various natural language learning tasks, especially question answering. However, they also bring some attendant problems, such as corpus selection for embedding learning, dictionary transformation for different learning tasks, etc. In this paper, we propose to model sentences directly as character sequences, and then utilize convolutional neural networks to integrate character embedding learning with point-wise answer selection training. Compared with deep models pre-trained with the word embedding (WE) strategy, our character-sequential representation (CSR) based method uses a much simpler procedure and shows more stable performance across different benchmarks. Extensive experiments on two benchmark answer selection datasets show competitive performance compared with state-of-the-art methods.
Tasks Answer Selection, Question Answering
Published 2016-09-02
URL http://arxiv.org/abs/1609.00565v1
PDF http://arxiv.org/pdf/1609.00565v1.pdf
PWC https://paperswithcode.com/paper/skipping-word-a-character-sequential
Repo
Framework
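
A minimal sketch of the character-sequential idea, assuming PyTorch; the layer sizes, the single convolution, and the point-wise scoring head are assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CharCNNEncoder(nn.Module):
    """Encode a sentence as a character sequence: embedding -> 1D conv -> max-pool."""
    def __init__(self, n_chars=128, emb=16, channels=128, kernel=5):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb, padding_idx=0)
        self.conv = nn.Conv1d(emb, channels, kernel, padding=kernel // 2)

    def forward(self, char_ids):                      # (batch, seq_len) char indices
        h = self.conv(self.emb(char_ids).transpose(1, 2))
        return torch.relu(h).max(dim=-1).values       # (batch, channels)

class PointwiseScorer(nn.Module):
    """Score a question/answer pair from the two character-level sentence vectors."""
    def __init__(self, channels=128):
        super().__init__()
        self.encoder = CharCNNEncoder(channels=channels)
        self.score = nn.Linear(2 * channels, 1)

    def forward(self, q_chars, a_chars):
        q, a = self.encoder(q_chars), self.encoder(a_chars)
        return torch.sigmoid(self.score(torch.cat([q, a], dim=-1)))  # P(correct answer)
```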