May 5, 2019

2960 words 14 mins read

Paper Group ANR 528

Binary Subspace Coding for Query-by-Image Video Retrieval. The ground truth about metadata and community detection in networks. Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs. Gibberish Semantics: How Good is Russian Twitter in Word Semantic Similarity Task?. Towards Self-explanatory Ontology Visualizat …

Binary Subspace Coding for Query-by-Image Video Retrieval

Title Binary Subspace Coding for Query-by-Image Video Retrieval
Authors Ruicong Xu, Yang Yang, Yadan Luo, Fumin Shen, Zi Huang, Heng Tao Shen
Abstract The query-by-image video retrieval (QBIVR) task has been attracting considerable research attention recently. However, most existing methods represent a video by either aggregating or projecting all its frames into a single datum point, which may easily cause severe information loss. In this paper, we propose an efficient QBIVR framework to enable an effective and efficient video search with an image query. We first define a similarity-preserving distance metric between an image and its orthogonal projection in the subspace of the video, which can be equivalently transformed to a Maximum Inner Product Search (MIPS) problem. Besides, to boost the efficiency of solving the MIPS problem, we propose two asymmetric hashing schemes, which bridge the domain gap of images and videos. The first approach, termed Inner-product Binary Coding (IBC), preserves the inner relationships of images and videos in a common Hamming space. To further improve the retrieval efficiency, we devise a Bilinear Binary Coding (BBC) approach, which employs compact bilinear projections instead of a single large projection matrix. Extensive experiments have been conducted on four real-world video datasets to verify the effectiveness of our proposed approaches compared to state-of-the-art methods.
Tasks Video Retrieval
Published 2016-12-06
URL http://arxiv.org/abs/1612.01657v1
PDF http://arxiv.org/pdf/1612.01657v1.pdf
PWC https://paperswithcode.com/paper/binary-subspace-coding-for-query-by-image
Repo
Framework
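The MIPS reduction at the heart of this pipeline can be pictured with a toy asymmetric-hashing sketch: images and videos are binarized through different projections into a shared Hamming space, and retrieval becomes an inner-product search over codes. This is a minimal illustration of the general idea, not the paper's learned IBC or BBC codes; the random projections, dimensions, and variable names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize(x, proj):
    """Project real-valued features and binarize to {-1, +1} codes."""
    return np.sign(x @ proj)

# Asymmetric hashing: images and videos use different projections
# into a shared Hamming space, where the code inner product stands
# in for the original image-to-video-subspace similarity.
d, bits = 64, 16
proj_img = rng.standard_normal((d, bits))
proj_vid = rng.standard_normal((d, bits))

query = rng.standard_normal(d)           # image query feature
videos = rng.standard_normal((100, d))   # one subspace proxy per video

q_code = binarize(query, proj_img)
v_codes = binarize(videos, proj_vid)

# Maximum Inner Product Search over the binary codes
scores = v_codes @ q_code
best = int(np.argmax(scores))
```

In the actual IBC/BBC approaches the projections are learned so that binary inner products preserve the image-to-subspace distance metric; random projections here only show the mechanics of the asymmetric search.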

The ground truth about metadata and community detection in networks

Title The ground truth about metadata and community detection in networks
Authors Leto Peel, Daniel B. Larremore, Aaron Clauset
Abstract Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex system’s components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called “ground truth” communities. This works well in synthetic networks with planted communities because such networks’ links are formed explicitly based on those known communities. However, there are no planted communities in real-world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. Here, we show that metadata are not the same as ground truth, and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value, so a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structure.
Tasks Community Detection
Published 2016-08-20
URL http://arxiv.org/abs/1608.05878v2
PDF http://arxiv.org/pdf/1608.05878v2.pdf
PWC https://paperswithcode.com/paper/the-ground-truth-about-metadata-and-community
Repo
Framework
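A common way to quantify how well node metadata line up with detected communities, much simpler than (but related in spirit to) the paper's two statistical techniques, is normalized mutual information between the two partitions. A rough self-contained sketch; the helper name and toy labels are assumptions:

```python
from collections import Counter
from math import log

def nmi(labels_a, labels_b):
    """Normalized mutual information between two node partitions."""
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    cab = Counter(zip(labels_a, labels_b))
    mi = sum(c / n * log(c * n / (ca[a] * cb[b]))
             for (a, b), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    return mi / ((ha * hb) ** 0.5) if ha and hb else 0.0

meta = [0, 0, 1, 1, 2, 2]    # metadata labels
found = [1, 1, 0, 0, 2, 2]   # detected communities (same grouping, relabeled)
agreement = nmi(meta, found)  # identical groupings give NMI ~ 1.0
```

As the paper stresses, a low score here need not mean the algorithm failed: the metadata may simply be unrelated to the network's structural communities.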

Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs

Title Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs
Authors Baichuan Zhang, Sutanay Choudhury, Mohammad Al Hasan, Xia Ning, Khushbu Agarwal, Sumit Purohit, Paola Pesántez-Cabrera
Abstract Link prediction, or predicting the likelihood of a link in a knowledge graph based on its existing state, is a key research task. It differs from a traditional link prediction task in that the links in a knowledge graph are categorized into different predicates and the link prediction performance of different predicates in a knowledge graph generally varies widely. In this work, we propose a latent feature embedding based link prediction model which considers the prediction task for each predicate disjointly. To learn the model parameters, it utilizes a Bayesian personalized ranking based optimization technique. Experimental results on large-scale knowledge bases such as YAGO2 show that our link prediction approach achieves substantially higher performance than several state-of-the-art approaches. We also show that for a given predicate the topological properties of the knowledge graph induced by the given predicate edges are key indicators of the link prediction performance of that predicate in the knowledge graph.
Tasks Knowledge Graphs, Link Prediction
Published 2016-01-14
URL http://arxiv.org/abs/1601.03778v2
PDF http://arxiv.org/pdf/1601.03778v2.pdf
PWC https://paperswithcode.com/paper/trust-from-the-past-bayesian-personalized
Repo
Framework
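The Bayesian personalized ranking optimization mentioned in the abstract can be sketched as a pairwise objective: for a source node, an observed link should score above a sampled non-link. The following is a minimal SGD sketch under assumed embeddings and hyperparameters, not the paper's full per-predicate model:

```python
import math
import random

def bpr_step(emb, u, i, j, lr=0.01, reg=0.01):
    """One BPR update: raise score(u, i) above score(u, j) for an
    observed link (u, i) and a sampled non-link (u, j)."""
    xu, xi, xj = emb[u], emb[i], emb[j]
    x_uij = sum(a * (b - c) for a, b, c in zip(xu, xi, xj))
    sig = 1.0 / (1.0 + math.exp(x_uij))  # sigmoid(-x_uij)
    for k in range(len(xu)):
        gu = sig * (xi[k] - xj[k]) - reg * xu[k]
        gi = sig * xu[k] - reg * xi[k]
        gj = -sig * xu[k] - reg * xj[k]
        xu[k] += lr * gu
        xi[k] += lr * gi
        xj[k] += lr * gj

def score(emb, a, b):
    """Dot-product link score between two node embeddings."""
    return sum(x * y for x, y in zip(emb[a], emb[b]))

random.seed(0)
emb = {n: [random.gauss(0.0, 0.1) for _ in range(8)] for n in range(10)}
for _ in range(200):
    bpr_step(emb, 0, 1, 2)  # node 0: rank neighbor 1 above non-neighbor 2
```

In the paper's setting a model of this kind would be trained separately per predicate, with non-links sampled from the predicate-induced subgraph.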

Gibberish Semantics: How Good is Russian Twitter in Word Semantic Similarity Task?

Title Gibberish Semantics: How Good is Russian Twitter in Word Semantic Similarity Task?
Authors Nikolay N. Vasiliev
Abstract The most studied and most successful language models were developed and evaluated mainly for English and other close European languages, such as French, German, etc. It is important to study the applicability of these models to other languages. The use of vector space models for Russian was recently studied for multiple corpora, such as Wikipedia, RuWac, lib.ru. These models were evaluated against the word semantic similarity task. To our knowledge, Twitter has not been considered as a corpus for this task; with this work we fill the gap. Results for vectors trained on the Twitter corpus are comparable in accuracy with other single-corpus trained models, although the best performance is currently achieved by a combination of multiple corpora.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2016-02-28
URL http://arxiv.org/abs/1602.08741v1
PDF http://arxiv.org/pdf/1602.08741v1.pdf
PWC https://paperswithcode.com/paper/gibberish-semantics-how-good-is-russian
Repo
Framework
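A word semantic similarity evaluation of the kind described here typically scores word pairs by embedding cosine and correlates those scores with human judgments via Spearman's rank correlation. A self-contained sketch with hypothetical embeddings and made-up judgment values (no tie handling, for illustration only):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def spearman(xs, ys):
    """Spearman rank correlation (assumes no ties)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=vals.__getitem__)
        r = [0] * len(vals)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical embeddings and human similarity judgments
emb = {"cat": [1.0, 0.2], "dog": [0.9, 0.3], "car": [0.1, 1.0], "truck": [0.2, 0.9]}
pairs = [("cat", "dog"), ("car", "truck"), ("cat", "car"), ("dog", "truck")]
human = [9.0, 8.5, 1.0, 1.5]
model = [cosine(emb[a], emb[b]) for a, b in pairs]
rho = spearman(model, human)
```

Benchmarks such as the Russian similarity datasets the paper evaluates against report exactly this kind of correlation, usually with proper tie handling.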

Towards Self-explanatory Ontology Visualization with Contextual Verbalization

Title Towards Self-explanatory Ontology Visualization with Contextual Verbalization
Authors Renārs Liepiņš, Uldis Bojārs, Normunds Grūzītis, Kārlis Čerāns, Edgars Celms
Abstract Ontologies are one of the core foundations of the Semantic Web. To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved. Visual notations can provide an overview of the ontology and help users to understand the connections among entities. However, the users first need to learn the visual notation before they can interpret it correctly. Controlled natural language representation would be readable right away and might be preferred in case of complex axioms, however, the structure of the ontology would remain less apparent. We propose to combine ontology visualizations with contextual ontology verbalizations of selected ontology (diagram) elements, displaying controlled natural language (CNL) explanations of OWL axioms corresponding to the selected visual notation elements. Thus, the domain experts will benefit from both the high-level overview provided by the graphical notation and the detailed textual explanations of particular elements in the diagram.
Tasks
Published 2016-07-06
URL http://arxiv.org/abs/1607.01490v1
PDF http://arxiv.org/pdf/1607.01490v1.pdf
PWC https://paperswithcode.com/paper/towards-self-explanatory-ontology
Repo
Framework

The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives

Title The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives
Authors Mohit Iyyer, Varun Manjunatha, Anupam Guha, Yogarshi Vyas, Jordan Boyd-Graber, Hal Daumé III, Larry Davis
Abstract Visual narrative is often a combination of explicit information and judicious omissions, relying on the viewer to supply missing details. In comics, most movements in time and space are hidden in the “gutters” between panels. To follow the story, readers logically connect panels together by inferring unseen actions through a process called “closure”. While computers can now describe what is explicitly depicted in natural images, in this paper we examine whether they can understand the closure-driven narratives conveyed by stylized artwork and dialogue in comic book panels. We construct a dataset, COMICS, that consists of over 1.2 million panels (120 GB) paired with automatic textbox transcriptions. An in-depth analysis of COMICS demonstrates that neither text nor image alone can tell a comic book story, so a computer must understand both modalities to keep up with the plot. We introduce three cloze-style tasks that ask models to predict narrative and character-centric aspects of a panel given n preceding panels as context. Various deep neural architectures underperform human baselines on these tasks, suggesting that COMICS contains fundamental challenges for both vision and language.
Tasks
Published 2016-11-16
URL http://arxiv.org/abs/1611.05118v2
PDF http://arxiv.org/pdf/1611.05118v2.pdf
PWC https://paperswithcode.com/paper/the-amazing-mysteries-of-the-gutter-drawing
Repo
Framework

Neural Machine Translation from Simplified Translations

Title Neural Machine Translation from Simplified Translations
Authors Josep Crego, Jean Senellart
Abstract Text simplification aims at reducing the lexical, grammatical and structural complexity of a text while keeping the same meaning. In the context of machine translation, we introduce the idea of simplified translations in order to boost the learning ability of deep neural translation models. We conduct preliminary experiments showing that translation complexity is actually reduced in a translation of a source bi-text compared to the target reference of the bi-text while using a neural machine translation (NMT) system learned on the exact same bi-text. Based on the knowledge distillation idea, we then train an NMT system using the simplified bi-text, and show that it outperforms the initial system that was built over the reference data set. Performance is further boosted when both reference and automatic translations are used to learn the network. We perform an elementary analysis of the translated corpus and report accuracy results of the proposed approach on English-to-French and English-to-German translation tasks.
Tasks Machine Translation, Text Simplification
Published 2016-12-19
URL http://arxiv.org/abs/1612.06139v1
PDF http://arxiv.org/pdf/1612.06139v1.pdf
PWC https://paperswithcode.com/paper/neural-machine-translation-from-simplified
Repo
Framework

Discriminating between similar languages in Twitter using label propagation

Title Discriminating between similar languages in Twitter using label propagation
Authors Will Radford, Matthias Gallé
Abstract Identifying the language of social media messages is an important first step in linguistic processing. Existing models for Twitter focus on content analysis, which is successful for dissimilar language pairs. We propose a label propagation approach that takes the social graph of tweet authors into account as well as content to better tease apart similar languages. This results in state-of-the-art shared task performance of 76.63%, which is 1.4% higher than the top system.
Tasks
Published 2016-07-19
URL http://arxiv.org/abs/1607.05408v1
PDF http://arxiv.org/pdf/1607.05408v1.pdf
PWC https://paperswithcode.com/paper/discriminating-between-similar-languages-in
Repo
Framework

Modelling and computation using NCoRM mixtures for density regression

Title Modelling and computation using NCoRM mixtures for density regression
Authors Jim Griffin, Fabrizio Leisen
Abstract Normalized compound random measures are flexible nonparametric priors for related distributions. We consider building general nonparametric regression models using normalized compound random measure mixture models. Posterior inference is made using a novel pseudo-marginal Metropolis-Hastings sampler for normalized compound random measure mixture models. The algorithm makes use of a new general approach to the unbiased estimation of Laplace functionals of compound random measures (which includes completely random measures as a special case). The approach is illustrated on problems of density regression.
Tasks
Published 2016-08-02
URL http://arxiv.org/abs/1608.00874v3
PDF http://arxiv.org/pdf/1608.00874v3.pdf
PWC https://paperswithcode.com/paper/modelling-and-computation-using-ncorm
Repo
Framework
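The pseudo-marginal Metropolis-Hastings idea used for posterior inference can be shown on a toy target: an unbiased but noisy likelihood estimate stands in for the intractable exact likelihood, and, crucially, the estimate for the current state is reused until a proposal is accepted. A sketch under assumed toy distributions, not the paper's NCoRM estimator:

```python
import math
import random

def pseudo_marginal_mh(log_lik_estimator, log_prior, theta0, n_iter=2000, step=0.5):
    """Pseudo-marginal MH: a noisy, unbiased likelihood estimate replaces
    the exact likelihood, yet the chain still targets the right posterior."""
    theta = theta0
    log_l = log_lik_estimator(theta)
    chain = []
    for _ in range(n_iter):
        prop = theta + random.gauss(0.0, step)
        log_l_prop = log_lik_estimator(prop)
        log_alpha = (log_l_prop + log_prior(prop)) - (log_l + log_prior(theta))
        if math.log(random.random()) < log_alpha:
            theta, log_l = prop, log_l_prop  # accept; keep the noisy estimate
        chain.append(theta)
    return chain

random.seed(0)

# Toy target: N(3, 1) likelihood, flat prior, log-likelihood estimated
# with additive Gaussian noise (a constant multiplicative bias in the
# likelihood, which cancels in the acceptance ratio on average).
def noisy_loglik(theta):
    return -0.5 * (theta - 3.0) ** 2 + random.gauss(0.0, 0.1)

chain = pseudo_marginal_mh(noisy_loglik, lambda t: 0.0, 0.0)
posterior_mean = sum(chain[500:]) / len(chain[500:])
```

The real algorithm replaces `noisy_loglik` with an unbiased estimator of the Laplace-functional-based marginal likelihood of the NCoRM mixture; the acceptance mechanics are the same.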

ECMdd: Evidential c-medoids clustering with multiple prototypes

Title ECMdd: Evidential c-medoids clustering with multiple prototypes
Authors Kuang Zhou, Arnaud Martin, Quan Pan, Zhun-Ga Liu
Abstract In this work, a new prototype-based clustering method named Evidential C-Medoids (ECMdd), which belongs to the family of medoid-based clustering for proximity data, is proposed as an extension of Fuzzy C-Medoids (FCMdd) on the theoretical framework of belief functions. In the application of FCMdd and the original ECMdd, a single medoid (prototype), which is supposed to belong to the object set, is utilized to represent one class. For the sake of clarity, this kind of ECMdd using a single medoid is denoted by sECMdd. In real clustering applications, using only one pattern to capture or interpret a class may not adequately model different types of group structure and hence limits the clustering performance. In order to address this problem, a variation of ECMdd using multiple weighted medoids, denoted by wECMdd, is presented. Unlike sECMdd, in wECMdd objects in each cluster carry various weights describing their degree of representativeness for that class. This mechanism enables each class to be represented by more than one object. Experimental results on synthetic and real data sets clearly demonstrate the superiority of sECMdd and wECMdd. Moreover, the clustering results by wECMdd can provide richer information for the inner structure of the detected classes with the help of prototype weights.
Tasks
Published 2016-06-03
URL http://arxiv.org/abs/1606.01113v1
PDF http://arxiv.org/pdf/1606.01113v1.pdf
PWC https://paperswithcode.com/paper/ecmdd-evidential-c-medoids-clustering-with
Repo
Framework
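The multiple-weighted-medoids idea can be illustrated with the assignment step alone: each cluster is represented by several medoids with representativeness weights, and an object joins the cluster with the smallest weighted distance to that cluster's medoids. A toy sketch on a proximity matrix; the medoid sets and weights are hand-picked assumptions, not the output of the full sECMdd/wECMdd updates, and no belief-function machinery is modeled:

```python
def assign(dist, medoid_sets, weights):
    """Assign each object to the cluster whose weighted medoids are
    closest. Single-medoid k-medoids is the special case of one
    medoid per set with weight 1."""
    labels = []
    for i in range(len(dist)):
        costs = [sum(wk * dist[i][m] for m, wk in zip(meds, w))
                 for meds, w in zip(medoid_sets, weights)]
        labels.append(costs.index(min(costs)))
    return labels

# Toy proximity (distance) matrix for 6 objects in two obvious groups
dist = [
    [0, 1, 1, 9, 9, 9],
    [1, 0, 1, 9, 9, 9],
    [1, 1, 0, 9, 9, 9],
    [9, 9, 9, 0, 1, 1],
    [9, 9, 9, 1, 0, 1],
    [9, 9, 9, 1, 1, 0],
]
# Two clusters, each represented by two weighted medoids
medoid_sets = [[0, 1], [3, 4]]
weights = [[0.6, 0.4], [0.5, 0.5]]
labels = assign(dist, medoid_sets, weights)
```

The full algorithm alternates such assignments with updates of the medoid sets and weights, and in the evidential setting it additionally allocates mass to meta-clusters (sets of classes) rather than hard labels.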

A Nonlinear Adaptive Filter Based on the Model of Simple Multilinear Functionals

Title A Nonlinear Adaptive Filter Based on the Model of Simple Multilinear Functionals
Authors Felipe C. Pinheiro, Cássio G. Lopes
Abstract Nonlinear adaptive filtering allows for modeling of some additional aspects of a general system and usually relies on highly complex algorithms, such as those based on the Volterra series. Through the use of the Kronecker product and some basic facts of tensor algebra, we propose a simple model of nonlinearity, one that can be interpreted as a product of the outputs of K FIR linear filters, and compute its cost function together with its gradient, which allows for some analysis of the optimization problem. We use these results in a stochastic gradient framework, from which we derive an LMS-like algorithm and investigate the problems of multi-modality in the mean-square error surface and the choice of adequate initial conditions. Its computational complexity is calculated. The new algorithm is tested in a system identification setup and is compared with other polynomial algorithms from the literature, presenting favorable convergence and/or computational complexity.
Tasks
Published 2016-03-01
URL http://arxiv.org/abs/1603.00427v1
PDF http://arxiv.org/pdf/1603.00427v1.pdf
PWC https://paperswithcode.com/paper/a-nonlinear-adaptive-filter-based-on-the
Repo
Framework
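The model of a product of K FIR filter outputs, and the resulting LMS-like update in which each factor's gradient scales with the product of the other factors' outputs, can be sketched as follows. This is a scalar-target toy with step size and initialization chosen by hand, not the paper's exact algorithm or its complexity analysis:

```python
def multilinear_output(ws, x):
    """Model output: product of the K FIR filter outputs w_k^T x."""
    out = 1.0
    for w in ws:
        out *= sum(wi * xi for wi, xi in zip(w, x))
    return out

def lms_step(ws, x, d, mu=0.05):
    """LMS-like update: the gradient w.r.t. each factor w_k scales
    with the product of the other factors' outputs."""
    outs = [sum(wi * xi for wi, xi in zip(w, x)) for w in ws]
    y = 1.0
    for o in outs:
        y *= o
    e = d - y
    for k, w in enumerate(ws):
        others = 1.0
        for j, o in enumerate(outs):
            if j != k:
                others *= o
        for i in range(len(w)):
            w[i] += mu * e * others * x[i]
    return e

# Toy fit: drive the product of two 2-tap filters toward a target
x, d = [1.0, 0.5], 2.0
ws = [[0.5, 0.5], [0.5, 0.5]]
for _ in range(200):
    lms_step(ws, x, d)
```

The coupling of factors through `others` is what makes the error surface multi-modal (e.g. negating two factors leaves the output unchanged), which is why the paper studies initialization carefully.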

Captioning Images with Diverse Objects

Title Captioning Images with Diverse Objects
Authors Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, Trevor Darrell, Kate Saenko
Abstract Recent captioning models are limited in their ability to scale and describe concepts unseen in paired image-text corpora. We propose the Novel Object Captioner (NOC), a deep visual semantic captioning model that can describe a large number of object categories not present in existing image-caption datasets. Our model takes advantage of external sources – labeled images from object recognition datasets, and semantic knowledge extracted from unannotated text. We propose minimizing a joint objective which can learn from these diverse data sources and leverage distributional semantic embeddings, enabling the model to generalize and describe novel objects outside of image-caption datasets. We demonstrate that our model exploits semantic information to generate captions for hundreds of object categories in the ImageNet object recognition dataset that are not observed in MSCOCO image-caption training data, as well as many categories that are observed very rarely. Both automatic evaluations and human judgements show that our model considerably outperforms prior work in being able to describe many more categories of objects.
Tasks Object Recognition
Published 2016-06-24
URL http://arxiv.org/abs/1606.07770v3
PDF http://arxiv.org/pdf/1606.07770v3.pdf
PWC https://paperswithcode.com/paper/captioning-images-with-diverse-objects
Repo
Framework

Boosting Neural Machine Translation

Title Boosting Neural Machine Translation
Authors Dakun Zhang, Jungi Kim, Josep Crego, Jean Senellart
Abstract Training efficiency is one of the main problems for Neural Machine Translation (NMT). Deep networks need very large amounts of data as well as many training iterations to achieve state-of-the-art performance. This results in very high computation cost, slowing down research and industrialisation. In this paper, we propose to alleviate this problem with several training methods based on data boosting and bootstrap with no modifications to the neural network. It imitates the learning process of humans, who typically spend more time when learning “difficult” concepts than easier ones. We experiment on an English-French translation task showing accuracy improvements of up to 1.63 BLEU while saving 20% of training time.
Tasks Machine Translation
Published 2016-12-19
URL http://arxiv.org/abs/1612.06138v2
PDF http://arxiv.org/pdf/1612.06138v2.pdf
PWC https://paperswithcode.com/paper/boosting-neural-machine-translation
Repo
Framework

Image Super-Resolution Based on Sparsity Prior via Smoothed $l_0$ Norm

Title Image Super-Resolution Based on Sparsity Prior via Smoothed $l_0$ Norm
Authors Mohammad Rostami, Zhou Wang
Abstract In this paper we aim to tackle the problem of reconstructing a high-resolution image from a single low-resolution input image, known as single image super-resolution. In the literature, sparse representation has been used to address this problem, where it is assumed that both low-resolution and high-resolution images share the same sparse representation over a pair of coupled jointly trained dictionaries. This assumption enables us to use the compressed sensing theory to find the jointly sparse representation via the low-resolution image and then use it to recover the high-resolution image. However, sparse representation of a signal over a known dictionary is an ill-posed, combinatorial optimization problem. Here we propose an algorithm that adopts the smoothed $l_0$-norm (SL0) approach to find the jointly sparse representation. Improved quality of the reconstructed image is obtained for most images in terms of both peak signal-to-noise-ratio (PSNR) and structural similarity (SSIM) measures.
Tasks Combinatorial Optimization, Image Super-Resolution, Super-Resolution
Published 2016-03-22
URL http://arxiv.org/abs/1603.06680v1
PDF http://arxiv.org/pdf/1603.06680v1.pdf
PWC https://paperswithcode.com/paper/image-super-resolution-based-on-sparsity
Repo
Framework
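The smoothed-$l_0$ approach replaces the $l_0$ count with a Gaussian-smoothed surrogate whose width $\sigma$ is annealed, alternating small gradient steps with projection back onto the constraint $As = y$. A generic SL0 sketch on synthetic compressed-sensing data; the joint-dictionary super-resolution setup is not reproduced, and the parameters are typical choices rather than the paper's:

```python
import numpy as np

def sl0(A, y, sigma_min=0.01, sigma_decrease=0.7, mu=2.0, inner=3):
    """Smoothed-l0 sparse recovery: minimize a smooth approximation of
    ||s||_0 subject to A s = y, annealing the smoothing width sigma."""
    A_pinv = np.linalg.pinv(A)
    s = A_pinv @ y                        # minimum-l2 feasible start
    sigma = 2.0 * np.max(np.abs(s))
    while sigma > sigma_min:
        for _ in range(inner):
            delta = s * np.exp(-s**2 / (2 * sigma**2))
            s = s - mu * delta            # gradient step on the surrogate
            s = s - A_pinv @ (A @ s - y)  # project back onto A s = y
        sigma *= sigma_decrease
    return s

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))         # underdetermined system
s_true = np.zeros(50)
s_true[[3, 17, 40]] = [1.5, -2.0, 0.8]    # 3-sparse ground truth
y = A @ s_true
s_hat = sl0(A, y)
```

In the super-resolution application, `A` would be built from the low-resolution dictionary and `s_hat` would then be applied to the coupled high-resolution dictionary to reconstruct the image.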

Flight Dynamics-based Recovery of a UAV Trajectory using Ground Cameras

Title Flight Dynamics-based Recovery of a UAV Trajectory using Ground Cameras
Authors Artem Rozantsev, Sudipta N. Sinha, Debadeepta Dey, Pascal Fua
Abstract We propose a new method to estimate the 6-dof trajectory of a flying object such as a quadrotor UAV within a 3D airspace monitored using multiple fixed ground cameras. It is based on a new structure from motion formulation for the 3D reconstruction of a single moving point with known motion dynamics. Our main contribution is a new bundle adjustment procedure which in addition to optimizing the camera poses, regularizes the point trajectory using a prior based on motion dynamics (or specifically flight dynamics). Furthermore, we can infer the underlying control input sent to the UAV’s autopilot that determined its flight trajectory. Our method requires neither perfect single-view tracking nor appearance matching across views. For robustness, we allow the tracker to generate multiple detections per frame in each video. The true detections and the data association across videos is estimated using robust multi-view triangulation and subsequently refined during our bundle adjustment procedure. Quantitative evaluation on simulated data and experiments on real videos from indoor and outdoor scenes demonstrates the effectiveness of our method.
Tasks 3D Reconstruction
Published 2016-12-01
URL http://arxiv.org/abs/1612.00192v2
PDF http://arxiv.org/pdf/1612.00192v2.pdf
PWC https://paperswithcode.com/paper/flight-dynamics-based-recovery-of-a-uav
Repo
Framework
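The flavor of regularizing a point trajectory with a motion-dynamics prior can be shown in one dimension: fit noisy per-frame detections while penalizing second differences, a constant-velocity prior. This toy closed-form smoother stands in for the full bundle adjustment over camera poses and flight dynamics, which it does not attempt; all names and values are assumptions:

```python
import numpy as np

def smooth_track(z, lam):
    """argmin_x ||x - z||^2 + lam * ||D2 x||^2, where D2 is the
    second-difference operator (a constant-velocity dynamics prior)."""
    n = len(z)
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Normal equations of the regularized least-squares problem
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, z)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
truth = 2.0 * t + 1.0                     # constant-velocity trajectory
z = truth + rng.normal(0.0, 0.2, size=50) # noisy per-frame detections
x = smooth_track(z, lam=100.0)
```

In the paper, the analogous prior enters the bundle adjustment cost jointly with the reprojection terms, and the inferred accelerations relate back to the control inputs sent to the UAV's autopilot.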