July 27, 2019

3214 words 16 mins read

Paper Group ANR 703

Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network. Learning User Preferences to Incentivize Exploration in the Sharing Economy. Speedup from a different parametrization within the Neural Network algorithm. Sistema de Navegação Autônomo Baseado em Visão Computacional. Learning Multi-item Auctions with (or without) Samples. …

Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network

Title Single Image Super-Resolution Using Multi-Scale Convolutional Neural Network
Authors Xiaoyi Jia, Xiangmin Xu, Bolun Cai, Kailing Guo
Abstract Methods based on convolutional neural networks (CNNs) have demonstrated tremendous improvements on single image super-resolution. However, previous methods mainly restore images from a single area of the low-resolution (LR) input, which limits the flexibility of models to infer details at various scales for the high-resolution (HR) output. Moreover, most of them train a separate model for each up-scale factor. In this paper, we propose a multi-scale super-resolution (MSSR) network. Our network consists of multi-scale paths for HR inference, which learn to synthesize features from different scales; this property helps reconstruct varied kinds of regions in HR images. In addition, only a single model is needed for multiple up-scale factors, which is more efficient with no loss of restoration quality. Experiments on four public datasets demonstrate that the proposed method achieves state-of-the-art performance at high speed.
Tasks Image Super-Resolution, Super-Resolution
Published 2017-05-15
URL http://arxiv.org/abs/1705.05084v1
PDF http://arxiv.org/pdf/1705.05084v1.pdf
PWC https://paperswithcode.com/paper/single-image-super-resolution-using-multi
Repo
Framework
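
The multi-scale idea — parallel paths with different receptive fields whose outputs are fused — can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' architecture: the kernels here are fixed averaging filters, whereas MSSR learns its filters end-to-end.

```python
import numpy as np

def conv2d(img, kernel):
    """Naive 'same' 2D convolution with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def multi_scale_features(img):
    """Run parallel paths with different receptive fields and fuse them:
    each path sees structure at a different scale."""
    k3 = np.full((3, 3), 1 / 9.0)   # small receptive field
    k5 = np.full((5, 5), 1 / 25.0)  # larger receptive field
    path_small = conv2d(img, k3)
    path_large = conv2d(img, k5)
    return 0.5 * (path_small + path_large)  # simple fusion by averaging

lr = np.arange(64, dtype=float).reshape(8, 8)  # stand-in LR patch
fused = multi_scale_features(lr)
print(fused.shape)  # (8, 8)
```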

Learning User Preferences to Incentivize Exploration in the Sharing Economy

Title Learning User Preferences to Incentivize Exploration in the Sharing Economy
Authors Christoph Hirnschall, Adish Singla, Sebastian Tschiatschek, Andreas Krause
Abstract We study platforms in the sharing economy and discuss the need for incentivizing users to explore options that otherwise would not be chosen. For instance, rental platforms such as Airbnb typically rely on customer reviews to provide users with relevant information about different options. Yet, often a large fraction of options does not have any reviews available. Such options are frequently neglected as viable choices, and in turn are unlikely to be evaluated, creating a vicious cycle. Platforms can engage users to deviate from their preferred choice by offering monetary incentives for choosing a different option instead. To efficiently learn the optimal incentives to offer, we consider structural information in user preferences and introduce a novel algorithm - Coordinated Online Learning (CoOL) - for learning with structural information modeled as convex constraints. We provide formal guarantees on the performance of our algorithm and test the viability of our approach in a user study using data on Airbnb apartments. Our findings suggest that our approach is well-suited to learn appropriate incentives and increase exploration on the investigated platform.
Tasks
Published 2017-11-17
URL http://arxiv.org/abs/1711.08331v2
PDF http://arxiv.org/pdf/1711.08331v2.pdf
PWC https://paperswithcode.com/paper/learning-user-preferences-to-incentivize
Repo
Framework
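
At its core, online learning under convex constraints follows a simple loop: take a gradient step on the current loss, then project back onto the feasible set. A minimal sketch (a box constraint stands in for the paper's general convex constraint sets, and the loss is a toy squared error, not the CoOL objective):

```python
import numpy as np

def project_box(w, lo=0.0, hi=1.0):
    """Euclidean projection onto a box — a stand-in for projection onto
    the convex set encoding structural information about preferences."""
    return np.clip(w, lo, hi)

# toy target: per-option incentive levels we are trying to learn online
target = np.array([0.2, 0.5, 0.9])
w = np.zeros(3)
for t in range(200):
    grad = 2 * (w - target)          # gradient of squared loss at round t
    w = project_box(w - 0.1 * grad)  # gradient step, then project
print(w)  # converges toward the target inside the box
```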

Speedup from a different parametrization within the Neural Network algorithm

Title Speedup from a different parametrization within the Neural Network algorithm
Authors Michael F. Zimmer
Abstract A different parametrization of the hyperplanes is used in the neural network algorithm. As demonstrated on several autoencoder examples, it significantly outperforms the usual parametrization, reaching lower training-error values in only a fraction of the number of epochs. We argue that it also makes the parameters easier to understand and initialize.
Tasks
Published 2017-05-20
URL http://arxiv.org/abs/1705.07250v3
PDF http://arxiv.org/pdf/1705.07250v3.pdf
PWC https://paperswithcode.com/paper/speedup-from-a-different-parametrization
Repo
Framework
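
The abstract does not spell out the parametrization, but the general idea of reparametrizing a hyperplane can be illustrated by factoring the weight vector into a scale and a unit direction. This specific factorization is an assumption for illustration, not necessarily the paper's choice; it shows that two parametrizations can define the same decision function while exposing different parameters to the optimizer.

```python
import numpy as np

def affine(x, w, b):
    # standard parametrization: w . x + b
    return x @ w + b

def normalized(x, u, r, c):
    # alternative: unit direction u, separate scale r, and offset c
    # (hypothetical form chosen for illustration)
    return r * (x @ u) + c

w = np.array([3.0, 4.0])
b = 1.0
r = np.linalg.norm(w)   # 5.0 — the scale factored out of w
u = w / r               # unit normal of the hyperplane
x = np.array([[1.0, 2.0], [-0.5, 0.25]])
print(np.allclose(affine(x, w, b), normalized(x, u, r, b)))  # True
```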

Sistema de Navegação Autônomo Baseado em Visão Computacional (Autonomous Navigation System Based on Computer Vision)

Title Sistema de Navegação Autônomo Baseado em Visão Computacional (Autonomous Navigation System Based on Computer Vision)
Authors Michel Conrado Cardoso Meneses
Abstract Autonomous robots are used as tools to solve many kinds of problems, such as environmental mapping and monitoring. Whether because of conditions adverse to human presence or the need to reduce costs, many efforts have been made to develop robots with an increasingly high level of autonomy. They must be capable of moving through dynamic environments without the help of human operators or assistance systems. How the environment is perceived and modeled therefore becomes highly relevant to navigation. Among the main sensing methods are those based on vision. With vision, it is possible to create highly detailed models of the environment, since many characteristics can be measured, such as texture, color, and illumination. However, the most accurate vision-based navigation techniques are computationally expensive to run on low-cost mobile platforms. Therefore, the goal of this work was to develop a low-cost robot, controlled by a Raspberry Pi, whose navigation system is based on vision. For this purpose, the strategy consisted in identifying obstacles via optical-flow pattern recognition. From this signal, it is possible to infer the relative displacement between the robot and other elements in the environment. It was estimated using the Lucas-Kanade algorithm, which the Raspberry Pi can execute without harming performance. Finally, an SVM-based classifier was used to identify patterns of this signal associated with obstacle movement. The developed system was evaluated on an optical-flow pattern dataset extracted from a real navigation environment. In the end, the system's processing rate proved higher than that of the compared works; furthermore, its accuracy was higher, and its acquisition cost lower, than those of most cited works.
Tasks Optical Flow Estimation
Published 2017-10-17
URL http://arxiv.org/abs/1710.06518v1
PDF http://arxiv.org/pdf/1710.06518v1.pdf
PWC https://paperswithcode.com/paper/sistema-de-navegacao-autonomo-baseado-em
Repo
Framework
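
The Lucas-Kanade step at the heart of this pipeline solves a small least-squares system built from image gradients. A minimal single-patch sketch (a real robot would use pyramidal LK on camera frames; this toy uses a synthetic one-pixel horizontal shift):

```python
import numpy as np

def lucas_kanade_patch(I0, I1):
    """Estimate one (u, v) flow vector for a whole patch via Lucas-Kanade:
    least-squares on the brightness-constancy equation Ix*u + Iy*v = -It."""
    Ix = np.gradient(I0, axis=1)      # horizontal image gradient
    Iy = np.gradient(I0, axis=0)      # vertical image gradient
    It = I1 - I0                      # temporal difference
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow                       # (u, v)

# synthetic patch: a sinusoidal texture shifted one pixel to the right
x = np.arange(16, dtype=float)
I0 = np.tile(np.sin(0.5 * x), (16, 1))
I1 = np.tile(np.sin(0.5 * (x - 1.0)), (16, 1))
u, v = lucas_kanade_patch(I0, I1)
print(u, v)  # u should be close to 1.0 (the horizontal shift), v near 0
```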

Learning Multi-item Auctions with (or without) Samples

Title Learning Multi-item Auctions with (or without) Samples
Authors Yang Cai, Constantinos Daskalakis
Abstract We provide algorithms that learn simple auctions whose revenue is approximately optimal in multi-item multi-bidder settings, for a wide range of valuations including unit-demand, additive, constrained additive, XOS, and subadditive. We obtain our learning results in two settings. The first is the commonly studied setting where sample access to the bidders’ distributions over valuations is given, for both regular distributions and arbitrary distributions with bounded support. Our algorithms require polynomially many samples in the number of items and bidders. The second is a more general max-min learning setting that we introduce, where we are given “approximate distributions,” and we seek to compute an auction whose revenue is approximately optimal simultaneously for all “true distributions” that are close to the given ones. These results are more general in that they imply the sample-based results, and are also applicable in settings where we have no sample access to the underlying distributions but have estimated them indirectly via market research or by observation of previously run, potentially non-truthful auctions. Our results hold for valuation distributions satisfying the standard (and necessary) independence-across-items property. They also generalize and improve upon recent works, which have provided algorithms that learn approximately optimal auctions in more restricted settings with additive, subadditive and unit-demand valuations using sample access to distributions. We generalize these results to the complete unit-demand, additive, and XOS setting, to i.i.d. subadditive bidders, and to the max-min setting. Our results are enabled by new uniform convergence bounds for hypotheses classes under product measures. Our bounds result in exponential savings in sample complexity compared to bounds derived by bounding the VC dimension, and are of independent interest.
Tasks
Published 2017-09-01
URL http://arxiv.org/abs/1709.00228v1
PDF http://arxiv.org/pdf/1709.00228v1.pdf
PWC https://paperswithcode.com/paper/learning-multi-item-auctions-with-or-without
Repo
Framework
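
For intuition about revenue learning from samples, consider the simplest special case: one item, one bidder, and a posted price chosen to maximize empirical revenue. This is far simpler than the multi-item, multi-bidder auctions the paper handles, but it shows the sample-based flavor — candidate prices can be restricted to the observed values themselves.

```python
import numpy as np

def best_empirical_price(samples):
    """Pick the posted price maximizing empirical revenue
    price * P(value >= price), searching over the samples as candidates."""
    samples = np.sort(samples)
    n = len(samples)
    # for price = samples[i], the fraction of values >= price is (n - i) / n
    revenues = samples * (n - np.arange(n)) / n
    return samples[np.argmax(revenues)]

rng = np.random.default_rng(1)
vals = rng.uniform(0, 1, 100_000)   # U[0,1] values: optimal price is 0.5
p = best_empirical_price(vals)
print(p)  # close to 0.5, the optimal monopoly price for U[0,1]
```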

Stein Variational Message Passing for Continuous Graphical Models

Title Stein Variational Message Passing for Continuous Graphical Models
Authors Dilin Wang, Zhe Zeng, Qiang Liu
Abstract We propose a novel distributed inference algorithm for continuous graphical models, by extending Stein variational gradient descent (SVGD) to leverage the Markov dependency structure of the distribution of interest. Our approach combines SVGD with a set of structured local kernel functions defined on the Markov blanket of each node, which alleviates the curse of high dimensionality and simultaneously yields a distributed algorithm for decentralized inference tasks. We justify our method with theoretical analysis and show that the use of local kernels can be viewed as a new type of localized approximation that matches the target distribution on the conditional distributions of each node over its Markov blanket. Our empirical results show that our method outperforms a variety of baselines including standard MCMC and particle message passing methods.
Tasks
Published 2017-11-20
URL http://arxiv.org/abs/1711.07168v3
PDF http://arxiv.org/pdf/1711.07168v3.pdf
PWC https://paperswithcode.com/paper/stein-variational-message-passing-for
Repo
Framework
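
Vanilla SVGD (without the paper's Markov-blanket-local kernels) is compact enough to sketch: each particle moves along a kernel-smoothed gradient of the log-density plus a repulsive term that keeps particles spread out. Assumptions here: an RBF kernel with fixed bandwidth and a standard-normal target, both chosen for illustration.

```python
import numpy as np

def svgd_step(X, grad_logp, h=1.0, eps=0.1):
    """One SVGD update with an RBF kernel (plain SVGD, not the
    message-passing variant)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]          # diff[j, i] = x_j - x_i
    sq = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq / h)                           # K[j, i] = k(x_j, x_i)
    grad = grad_logp(X)                           # (n, d)
    # driving term: kernel-weighted average of grad log p
    drive = K.T @ grad / n
    # repulsive term: average of grad_{x_j} k(x_j, x_i)
    repulse = np.einsum('ji,jid->id', K, -2.0 / h * diff) / n
    return X + eps * (drive + repulse)

rng = np.random.default_rng(0)
X = rng.normal(3.0, 1.0, size=(100, 1))  # particles start far from target
for _ in range(500):
    X = svgd_step(X, lambda x: -x)       # target: standard normal N(0, 1)
print(X.mean(), X.std())  # mean drifts toward 0, spread stays near 1
```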

Two-stream Collaborative Learning with Spatial-Temporal Attention for Video Classification

Title Two-stream Collaborative Learning with Spatial-Temporal Attention for Video Classification
Authors Yuxin Peng, Yunzhen Zhao, Junchao Zhang
Abstract Video classification is highly important with wide applications, such as video search and intelligent surveillance. Video naturally consists of static and motion information, which can be represented by frame and optical flow. Recently, researchers generally adopt the deep networks to capture the static and motion information separately, which mainly has two limitations: (1) Ignoring the coexistence relationship between spatial and temporal attention, while they should be jointly modelled as the spatial and temporal evolutions of video, thus discriminative video features can be extracted. (2) Ignoring the strong complementarity between static and motion information coexisted in video, while they should be collaboratively learned to boost each other. For addressing the above two limitations, this paper proposes the approach of two-stream collaborative learning with spatial-temporal attention (TCLSTA), which consists of two models: (1) Spatial-temporal attention model: The spatial-level attention emphasizes the salient regions in frame, and the temporal-level attention exploits the discriminative frames in video. They are jointly learned and mutually boosted to learn the discriminative static and motion features for better classification performance. (2) Static-motion collaborative model: It not only achieves mutual guidance on static and motion information to boost the feature learning, but also adaptively learns the fusion weights of static and motion streams, so as to exploit the strong complementarity between static and motion information to promote video classification. Experiments on 4 widely-used datasets show that our TCLSTA approach achieves the best performance compared with more than 10 state-of-the-art methods.
Tasks Optical Flow Estimation, Video Classification
Published 2017-11-09
URL http://arxiv.org/abs/1711.03273v1
PDF http://arxiv.org/pdf/1711.03273v1.pdf
PWC https://paperswithcode.com/paper/two-stream-collaborative-learning-with
Repo
Framework
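
Temporal-level attention reduces to scoring each frame, softmax-normalizing the scores, and pooling frame features by those weights; spatial attention applies the same mechanism over regions within a frame. A minimal sketch with random features and an arbitrary scoring vector (the real model learns both jointly, along with the static-motion fusion weights):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def temporal_attention_pool(frame_feats, w):
    """Score each frame with vector w, softmax the scores, and return the
    attention-weighted average feature plus the weights."""
    scores = frame_feats @ w        # (T,) one score per frame
    alpha = softmax(scores)         # attention distribution over frames
    return alpha @ frame_feats, alpha

T, D = 6, 4
rng = np.random.default_rng(0)
feats = rng.normal(size=(T, D))     # stand-in per-frame CNN features
w = rng.normal(size=D)              # stand-in learned scoring vector
video_feat, alpha = temporal_attention_pool(feats, w)
print(video_feat.shape, alpha.sum())  # pooled feature; weights sum to 1
```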

On the effectiveness of feature set augmentation using clusters of word embeddings

Title On the effectiveness of feature set augmentation using clusters of word embeddings
Authors Georgios Balikas, Ioannis Partalas
Abstract Word clusters have been empirically shown to offer important performance improvements on various tasks. Despite their importance, their incorporation in the standard feature-engineering pipeline relies on a trial-and-error procedure in which one evaluates several hyper-parameters, such as the number of clusters to use. To better understand the role of such features, we systematically evaluate their effect on four tasks: named entity segmentation and classification, as well as five-point sentiment classification and quantification. Our results strongly suggest that cluster-membership features improve performance.
Tasks Feature Engineering, Sentiment Analysis, Word Embeddings
Published 2017-05-03
URL http://arxiv.org/abs/1705.01265v2
PDF http://arxiv.org/pdf/1705.01265v2.pdf
PWC https://paperswithcode.com/paper/on-the-effectiveness-of-feature-set
Repo
Framework
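
In this line of work, word clusters typically come from k-means over pretrained embeddings, and cluster membership is appended as extra features. The sketch below uses a tiny hand-rolled k-means and one-hot membership features; the embeddings are random placeholders, not real word vectors.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means: random initial centers from the data, then
    alternate assignment and center updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def cluster_features(word_vecs, k=2):
    """Append one-hot cluster-membership features to word embeddings —
    the kind of feature-set augmentation the paper evaluates."""
    _, labels = kmeans(word_vecs, k)
    onehot = np.eye(k)[labels]
    return np.hstack([word_vecs, onehot])

rng = np.random.default_rng(1)
emb = np.vstack([rng.normal(0, 0.1, (5, 3)),   # two well-separated
                 rng.normal(5, 0.1, (5, 3))])  # blobs of "words"
aug = cluster_features(emb, k=2)
print(aug.shape)  # (10, 5): 3 embedding dims + 2 membership dims
```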

Convolutional Drift Networks for Video Classification

Title Convolutional Drift Networks for Video Classification
Authors Dillon Graham, Seyed Hamed Fatemi Langroudi, Christopher Kanan, Dhireesha Kudithipudi
Abstract Analyzing spatio-temporal data like video is a challenging task that requires processing visual and temporal information effectively. Convolutional Neural Networks have shown promise as baseline fixed feature extractors through transfer learning, a technique that helps minimize the training cost on visual information. Temporal information is often handled using hand-crafted features or Recurrent Neural Networks, but this can be overly specific or prohibitively complex. Building a fully trainable system that can efficiently analyze spatio-temporal data without hand-crafted features or complex training is an open challenge. We present a new neural network architecture to address this challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines the visual feature extraction power of deep Convolutional Neural Networks with the intrinsically efficient temporal processing provided by Reservoir Computing. In this introductory paper on the CDN, we provide a very simple baseline implementation tested on two egocentric (first-person) video activity datasets. We achieve video-level activity classification results on par with state-of-the-art methods. Notably, performance on this complex spatio-temporal task was produced by only training a single feed-forward layer in the CDN.
Tasks Transfer Learning, Video Classification
Published 2017-11-03
URL http://arxiv.org/abs/1711.01201v1
PDF http://arxiv.org/pdf/1711.01201v1.pdf
PWC https://paperswithcode.com/paper/convolutional-drift-networks-for-video
Repo
Framework
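
The reservoir-computing component can be sketched as an echo state network: a fixed random recurrent network whose final state summarizes a sequence, with only a linear readout trained. In the toy below, frame features are random stand-ins for CNN features, and the spectral-radius value is a conventional reservoir-computing choice, not necessarily the paper's.

```python
import numpy as np

def run_reservoir(inputs, n_res=50, seed=0, rho=0.9):
    """Drive a fixed random reservoir with a feature sequence and return
    the final state; the recurrent weights are never trained."""
    rng = np.random.default_rng(seed)
    W_in = 0.5 * rng.normal(size=(n_res, inputs.shape[1]))
    W = rng.normal(size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius
    x = np.zeros(n_res)
    for u in inputs:
        x = np.tanh(W @ x + W_in @ u)
    return x

rng = np.random.default_rng(1)
seq_a = rng.normal(0, 1, (20, 8))   # toy "video": 20 frames, 8-dim features
seq_b = rng.normal(0, 1, (20, 8))
states = np.stack([run_reservoir(seq_a), run_reservoir(seq_b)])

# only this linear readout is trained (least squares), echoing the CDN's
# single trained feed-forward layer
y = np.array([0.0, 1.0])
readout, *_ = np.linalg.lstsq(states, y, rcond=None)
print(np.round(states @ readout, 6))  # recovers the labels [0. 1.]
```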

Large-scale image analysis using docker sandboxing

Title Large-scale image analysis using docker sandboxing
Authors B Sengupta, E Vazquez, M Sasdelli, Y Qian, M Peniak, L Netherton, G Delfino
Abstract With the advent of specialized hardware such as Graphics Processing Units (GPUs), large scale image localization, classification and retrieval have seen increased prevalence. Designing scalable software architecture that co-evolves with such specialized hardware is a challenge in the commercial setting. In this paper, we describe one such architecture (Cortexica) that leverages scalability of GPUs and sandboxing offered by docker containers. This allows for the flexibility of mixing different computer architectures as well as computational algorithms with the security of a trusted environment. We illustrate the utility of this framework in a commercial setting i.e., searching for multiple products in an image by combining image localisation and retrieval.
Tasks
Published 2017-03-07
URL http://arxiv.org/abs/1703.02898v1
PDF http://arxiv.org/pdf/1703.02898v1.pdf
PWC https://paperswithcode.com/paper/large-scale-image-analysis-using-docker
Repo
Framework

Demography-based Facial Retouching Detection using Subclass Supervised Sparse Autoencoder

Title Demography-based Facial Retouching Detection using Subclass Supervised Sparse Autoencoder
Authors Aparna Bharati, Mayank Vatsa, Richa Singh, Kevin W. Bowyer, Xin Tong
Abstract Digital retouching of face images is becoming more widespread due to the introduction of software packages that automate the task. Several researchers have introduced algorithms to detect whether a face image is original or retouched. However, previous work on this topic has not considered whether or how accuracy of retouching detection varies with the demography of face images. In this paper, we introduce a new Multi-Demographic Retouched Faces (MDRF) dataset, which contains images belonging to two genders, male and female, and three ethnicities, Indian, Chinese, and Caucasian. Further, retouched images are created using two different retouching software packages. The second major contribution of this research is a novel semi-supervised autoencoder incorporating “subclass” information to improve classification. The proposed approach outperforms existing state-of-the-art detection algorithms for the task of generalized retouching detection. Experiments conducted with multiple combinations of ethnicities show that accuracy of retouching detection can vary greatly based on the demographics of the training and testing images.
Tasks
Published 2017-09-22
URL http://arxiv.org/abs/1709.07598v1
PDF http://arxiv.org/pdf/1709.07598v1.pdf
PWC https://paperswithcode.com/paper/demography-based-facial-retouching-detection
Repo
Framework
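
The abstract does not specify the architecture, but sparse autoencoders conventionally use a KL-divergence penalty that pushes each hidden unit's mean activation toward a small target sparsity. A sketch of that backbone is below; the paper's subclass-supervision term is omitted, and all shapes and values are illustrative.

```python
import numpy as np

def kl_sparsity(rho_hat, rho=0.05):
    """Standard sparse-autoencoder penalty: KL(rho || rho_hat) summed over
    hidden units, driving mean activations rho_hat toward target rho."""
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def encode(X, W, b):
    # sigmoid hidden layer of the autoencoder
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 10))        # a batch of 32 face-image features
W = 0.1 * rng.normal(size=(10, 6))   # encoder weights (untrained)
b = np.zeros(6)
H = encode(X, W, b)
penalty = kl_sparsity(H.mean(axis=0))  # added to reconstruction loss
print(H.shape, penalty >= 0)
```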

Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

Title Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor
Authors Aaron Chadha, Alhabib Abbas, Yiannis Andreopoulos
Abstract We investigate video classification via a two-stream convolutional neural network (CNN) design that directly ingests information extracted from compressed video bitstreams. Our approach begins with the observation that all modern video codecs divide the input frames into macroblocks (MBs). We demonstrate that selective access to MB motion vector (MV) information within compressed video bitstreams can also provide for selective, motion-adaptive, MB pixel decoding (a.k.a., MB texture decoding). This in turn allows for the derivation of spatio-temporal video activity regions at extremely high speed in comparison to conventional full-frame decoding followed by optical flow estimation. In order to evaluate the accuracy of a video classification framework based on such activity data, we independently train two CNN architectures on MB texture and MV correspondences and then fuse their scores to derive the final classification of each test video. Evaluation on two standard datasets shows that the proposed approach is competitive to the best two-stream video classification approaches found in the literature. At the same time: (i) a CPU-based realization of our MV extraction is over 977 times faster than GPU-based optical flow methods; (ii) selective decoding is up to 12 times faster than full-frame decoding; (iii) our proposed spatial and temporal CNNs perform inference at 5 to 49 times lower cloud computing cost than the fastest methods from the literature.
Tasks Optical Flow Estimation, Video Classification
Published 2017-10-14
URL http://arxiv.org/abs/1710.05112v2
PDF http://arxiv.org/pdf/1710.05112v2.pdf
PWC https://paperswithcode.com/paper/video-classification-with-cnns-using-the
Repo
Framework

On $w$-mixtures: Finite convex combinations of prescribed component distributions

Title On $w$-mixtures: Finite convex combinations of prescribed component distributions
Authors Frank Nielsen, Richard Nock
Abstract We consider the space of $w$-mixtures, i.e., the set of finite statistical mixtures sharing the same prescribed component distributions. The geometry induced by the Kullback-Leibler (KL) divergence on this family of $w$-mixtures is a dually flat space in information geometry called the mixture family manifold. It follows that the KL divergence between two $w$-mixtures is equivalent to a Bregman Divergence (BD) defined for the negative Shannon entropy generator. Thus the KL divergence between two Gaussian Mixture Models (GMMs) sharing the same components is (theoretically) a Bregman divergence. This KL-BD equivalence implies that we can perform optimal KL-averaging aggregation of $w$-mixtures without information loss. More generally, we prove that the skew Jensen-Shannon divergence between $w$-mixtures is equivalent to a skew Jensen divergence on their parameters. Finally, we state several divergence identities and inequalities relating $w$-mixtures.
Tasks
Published 2017-08-02
URL http://arxiv.org/abs/1708.00568v1
PDF http://arxiv.org/pdf/1708.00568v1.pdf
PWC https://paperswithcode.com/paper/on-w-mixtures-finite-convex-combinations-of
Repo
Framework
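
The central identity — that the KL divergence between two $w$-mixtures equals the Bregman divergence whose generator is the negative Shannon entropy of the mixture — can be checked numerically for discrete component distributions. The sketch below approximates the gradient by finite differences for brevity:

```python
import numpy as np

def mixture(w, comps):
    # comps: (k, m) array whose rows are the fixed component pmfs
    return w @ comps

def kl(p, q):
    return np.sum(p * np.log(p / q))

def neg_entropy_of_mixture(w, comps):
    # generator F(w) = -H(m_w), the negative Shannon entropy of the mixture
    m = mixture(w, comps)
    return np.sum(m * np.log(m))

def bregman(w1, w2, comps, eps=1e-6):
    """Bregman divergence B_F(w1, w2) = F(w1) - F(w2) - <grad F(w2), w1 - w2>
    with the gradient computed by central finite differences."""
    F = lambda w: neg_entropy_of_mixture(w, comps)
    grad = np.array([(F(w2 + eps * e) - F(w2 - eps * e)) / (2 * eps)
                     for e in np.eye(len(w2))])
    return F(w1) - F(w2) - grad @ (w1 - w2)

comps = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])           # two fixed component pmfs
w1, w2 = np.array([0.4, 0.6]), np.array([0.8, 0.2])
lhs = kl(mixture(w1, comps), mixture(w2, comps))
rhs = bregman(w1, w2, comps)
print(lhs, rhs)  # the two values agree
```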

Unsupervised Sentence Representations as Word Information Series: Revisiting TF–IDF

Title Unsupervised Sentence Representations as Word Information Series: Revisiting TF–IDF
Authors Ignacio Arroyo-Fernández, Carlos-Francisco Méndez-Cruz, Gerardo Sierra, Juan-Manuel Torres-Moreno, Grigori Sidorov
Abstract Sentence representation at the semantic level is a challenging task for Natural Language Processing and Artificial Intelligence. Despite the advances in word embeddings (i.e. word vector representations), capturing sentence meaning is an open question due to complexities of semantic interactions among words. In this paper, we present an embedding method, which is aimed at learning unsupervised sentence representations from unlabeled text. We propose an unsupervised method that models a sentence as a weighted series of word embeddings. The weights of the word embeddings are fitted by using Shannon’s word entropies provided by the Term Frequency–Inverse Document Frequency (TF–IDF) transform. The hyperparameters of the model can be selected according to the properties of data (e.g. sentence length and textual genre). Hyperparameter selection involves word embedding methods and dimensionalities, as well as weighting schemata. Our method offers advantages over existing methods: identifiable modules, short training times, online inference of (unseen) sentence representations, as well as independence from domain, external knowledge and language resources. Results showed that our model outperformed the state of the art in well-known Semantic Textual Similarity (STS) benchmarks. Moreover, our model reached state-of-the-art performance when compared to supervised and knowledge-based STS systems.
Tasks Semantic Textual Similarity, Word Embeddings
Published 2017-10-17
URL http://arxiv.org/abs/1710.06524v2
PDF http://arxiv.org/pdf/1710.06524v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-sentence-representations-as-word
Repo
Framework
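
A simplified version of the core idea is easy to sketch: scale each word vector by an information weight derived from TF–IDF and pool. The toy below uses a plain IDF-weighted average with random placeholder embeddings; the paper's full scheme uses entropy-based weights and a weighted series rather than a simple mean.

```python
import numpy as np

def idf_weights(docs):
    """IDF per word: log(N / document frequency)."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    return {w: np.log(n / sum(w in d for d in docs)) for w in vocab}

def sentence_vector(sentence, embeddings, idf):
    """Pool IDF-scaled word vectors into one sentence vector."""
    vecs = [idf.get(w, 0.0) * embeddings[w] for w in sentence if w in embeddings]
    return np.mean(vecs, axis=0)

docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=4)
              for w in {"the", "cat", "sat", "dog", "ran"}}
idf = idf_weights(docs)
v = sentence_vector(["the", "cat", "sat"], embeddings, idf)
print(v.shape, idf["the"])  # "the" appears in every doc, so its IDF is 0
```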

Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT

Title Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT
Authors Yining Wang, Long Zhou, Jiajun Zhang, Chengqing Zong
Abstract Neural machine translation (NMT), a new approach to machine translation, has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs. Translation is an open-vocabulary problem, but most existing NMT systems operate with a fixed vocabulary, which leaves them unable to translate rare words. This problem can be alleviated by using different translation granularities, such as character, subword and hybrid word-character. Translation involving Chinese is one of the most difficult tasks in machine translation; however, to the best of our knowledge, no prior work has explored which translation granularity is most suitable for Chinese in NMT. In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study. Furthermore, we discuss the advantages and disadvantages of various translation granularities in detail. Our experiments show that the subword model performs best for Chinese-to-English translation when the vocabulary is relatively small, while the hybrid word-character model is most suitable for English-to-Chinese translation. Moreover, experiments across granularities show that the Hybrid_BPE method achieves the best results on the Chinese-to-English translation task.
Tasks Machine Translation
Published 2017-11-13
URL http://arxiv.org/abs/1711.04457v1
PDF http://arxiv.org/pdf/1711.04457v1.pdf
PWC https://paperswithcode.com/paper/word-subword-or-character-an-empirical-study
Repo
Framework
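
Of the granularities compared, subword segmentation via byte-pair encoding (BPE) is the easiest to sketch: repeatedly merge the most frequent adjacent symbol pair. A toy learner over a small word-frequency table (the word list is borrowed from the classic BPE illustration, not from this paper):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merges from a word-frequency dict: repeatedly merge the
    most frequent adjacent symbol pair across the corpus."""
    vocab = {tuple(w): f for w, f in words.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])  # apply the merge
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

words = {"lower": 5, "low": 7, "newest": 6, "widest": 3}
merges = bpe_merges(words, 3)
print(merges)  # first merges build the frequent subword "low"
```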