May 6, 2019

2702 words 13 mins read

Paper Group ANR 185

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task. The Asymptotic Performance of Linear Echo State Neural Networks. Fast DPP Sampling for Nyström with Application to Kernel Methods. Practical Riemannian Neural Networks. Probabilistic Saliency Estimation. Product Offerings in Malicious Hacker Markets. Qua …

Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task

Title Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task
Authors Ashkan Mokarian, Mateusz Malinowski, Mario Fritz
Abstract We present Mean Box Pooling, a novel visual representation that pools over CNN representations of a large number of highly overlapping object proposals. We show that such a representation, together with nCCA, a successful multimodal embedding technique, achieves state-of-the-art performance on the Visual Madlibs task. Moreover, inspired by nCCA’s objective function, we extend the classical CNN+LSTM approach to train the network by directly maximizing the similarity between the internal representation of the deep learning architecture and candidate answers. Again, such an approach achieves a significant improvement over prior work that also uses a CNN+LSTM approach on Visual Madlibs.
Tasks
Published 2016-08-09
URL http://arxiv.org/abs/1608.02717v1
PDF http://arxiv.org/pdf/1608.02717v1.pdf
PWC https://paperswithcode.com/paper/mean-box-pooling-a-rich-image-representation
Repo
Framework
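
A minimal sketch of the pooling step described above, assuming proposal-level CNN features have already been extracted by some proposal generator; the helper name and array shapes are illustrative, and the nCCA embedding the paper pairs this with is not shown.

```python
import numpy as np

def mean_box_pooling(proposal_features):
    """Average the CNN features of many (highly overlapping) object proposals
    into a single image-level vector (hypothetical helper; nCCA not shown)."""
    feats = np.asarray(proposal_features, dtype=np.float64)  # (num_proposals, feature_dim)
    return feats.mean(axis=0)

# Usage: 100 proposals with 4096-dim fc7-style features (made-up numbers).
pooled = mean_box_pooling(np.random.randn(100, 4096))
print(pooled.shape)  # (4096,)
```

The pooled vector would then feed into nCCA, or any other multimodal embedding, on the image side.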

The Asymptotic Performance of Linear Echo State Neural Networks

Title The Asymptotic Performance of Linear Echo State Neural Networks
Authors Romain Couillet, Gilles Wainrib, Harry Sevi, Hafiz Tiomoko Ali
Abstract In this article, a study of the mean-square error (MSE) performance of linear echo-state neural networks is performed, both for training and testing tasks. Considering the realistic setting of noise present at the network nodes, we derive deterministic equivalents for the aforementioned MSE in the limit where the number of input data $T$ and network size $n$ both grow large. Specializing then the network connectivity matrix to specific random settings, we further obtain simple formulas that provide new insights on the performance of such networks.
Tasks
Published 2016-03-25
URL http://arxiv.org/abs/1603.07866v1
PDF http://arxiv.org/pdf/1603.07866v1.pdf
PWC https://paperswithcode.com/paper/the-asymptotic-performance-of-linear-echo
Repo
Framework
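
For concreteness, a small numpy sketch of the setting the abstract studies: a linear reservoir with noise at the nodes, driven by an input sequence, with a ridge-regression readout whose training MSE can be measured. The connectivity scaling, noise level, and toy memory task are assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 200, 1000                        # network size and number of samples (arbitrary)
W = rng.normal(size=(n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # rescale to spectral radius 0.9 (a common choice)
m = rng.normal(size=n)                  # input weights
sigma = 0.01                            # node-noise level (assumed)

u = rng.normal(size=T)                  # scalar input sequence
y = np.roll(u, 1)                       # toy target: recall the previous input

# Linear echo state update with node noise: x_t = W x_{t-1} + m u_t + noise
X = np.zeros((T, n))
x = np.zeros(n)
for t in range(T):
    x = W @ x + m * u[t] + sigma * rng.normal(size=n)
    X[t] = x

# Ridge-regression readout and its training MSE
lam = 1e-3
w_out = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
print("train MSE:", np.mean((X @ w_out - y) ** 2))
```

The deterministic equivalents derived in the paper predict such an MSE in the regime where both $n$ and $T$ grow large.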

Fast DPP Sampling for Nyström with Application to Kernel Methods

Title Fast DPP Sampling for Nyström with Application to Kernel Methods
Authors Chengtao Li, Stefanie Jegelka, Suvrit Sra
Abstract The Nyström method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nyström using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to the cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.
Tasks Point Processes
Published 2016-03-19
URL http://arxiv.org/abs/1603.06052v2
PDF http://arxiv.org/pdf/1603.06052v2.pdf
PWC https://paperswithcode.com/paper/fast-dpp-sampling-for-nystrom-with
Repo
Framework
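
A rough illustration of the two ingredients in the abstract: a swap-based Markov chain k-DPP sampler over a kernel matrix, and the Nyström approximation built from the sampled landmarks. The chain length, kernel, and acceptance rule below form a generic Metropolis sketch, not the paper's accelerated sampler.

```python
import numpy as np

def rbf_kernel(X, Y=None, gamma=0.5):
    Y = X if Y is None else Y
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kdpp_mcmc_landmarks(L, k, n_steps=2000, rng=None):
    """Swap-based Markov chain targeting a k-DPP with kernel L (illustrative sketch)."""
    rng = rng or np.random.default_rng(0)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))
    logdet = np.linalg.slogdet(L[np.ix_(S, S)])[1]
    for _ in range(n_steps):
        i = rng.integers(k)                      # position in S to swap out
        j = rng.integers(n)                      # candidate item to swap in
        if j in S:
            continue
        S_new = S.copy(); S_new[i] = j
        logdet_new = np.linalg.slogdet(L[np.ix_(S_new, S_new)])[1]
        if np.log(rng.random()) < logdet_new - logdet:   # accept w.p. determinant ratio
            S, logdet = S_new, logdet_new
    return np.array(S)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
K = rbf_kernel(X)
S = kdpp_mcmc_landmarks(K, k=20, rng=rng)
K_nys = K[:, S] @ np.linalg.pinv(K[np.ix_(S, S)]) @ K[S, :]   # Nystrom approximation
print("relative error:", np.linalg.norm(K - K_nys) / np.linalg.norm(K))
```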

Practical Riemannian Neural Networks

Title Practical Riemannian Neural Networks
Authors Gaétan Marceau-Caron, Yann Ollivier
Abstract We provide the first experimental results on non-synthetic datasets for the quasi-diagonal Riemannian gradient descents for neural networks introduced in [Ollivier, 2015]. These include the MNIST, SVHN, and FACE datasets as well as a previously unpublished electroencephalogram dataset. The quasi-diagonal Riemannian algorithms consistently beat simple stochastic gradient descent by a varying margin. The computational overhead with respect to simple backpropagation is around a factor $2$. Perhaps more interestingly, these methods also reach their final performance quickly, thus requiring fewer training epochs and a smaller total computation time. We also present an implementation guide to these Riemannian gradient descents for neural networks, showing how the quasi-diagonal versions can be implemented with minimal effort on top of existing routines which compute gradients.
Tasks
Published 2016-02-25
URL http://arxiv.org/abs/1602.08007v1
PDF http://arxiv.org/pdf/1602.08007v1.pdf
PWC https://paperswithcode.com/paper/practical-riemannian-neural-networks
Repo
Framework
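
The quasi-diagonal updates themselves are not reproduced here; as a simplified stand-in, the sketch below preconditions SGD with a diagonal outer-product metric (a running average of squared gradients), which conveys the flavour of a Riemannian-style step at minimal implementation cost. This is not the quasi-diagonal algorithm of Ollivier [2015], and all constants are assumptions.

```python
import numpy as np

def diagonal_op_sgd_step(params, grads, metric, lr=0.01, decay=0.99, eps=1e-8):
    """One SGD step preconditioned by a diagonal outer-product (OP) metric.
    A simplified stand-in for the quasi-diagonal Riemannian updates;
    `metric` holds a running average of squared gradients per parameter."""
    new_params, new_metric = [], []
    for p, g, m in zip(params, grads, metric):
        m = decay * m + (1.0 - decay) * g ** 2      # update diagonal metric estimate
        p = p - lr * g / np.sqrt(m + eps)           # natural-gradient-like step
        new_params.append(p); new_metric.append(m)
    return new_params, new_metric

# Usage on a toy quadratic loss 0.5 * ||w - w*||^2 (made-up example).
w_star = np.array([1.0, -2.0, 3.0])
params = [np.zeros(3)]
metric = [np.zeros(3)]
for _ in range(500):
    grads = [params[0] - w_star]
    params, metric = diagonal_op_sgd_step(params, grads, metric)
print(params[0])  # approaches w_star
```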

Probabilistic Saliency Estimation

Title Probabilistic Saliency Estimation
Authors Caglar Aytekin, Alexandros Iosifidis, Moncef Gabbouj
Abstract In this paper, we model the salient object detection problem under a probabilistic framework encoding the boundary connectivity saliency cue and smoothness constraints in an optimization problem. We show that this problem has a closed-form global optimum which estimates the salient object. We further show that, along with the probabilistic framework, the proposed method also enjoys a wide range of interpretations, i.e., graph cut, diffusion maps, and one-class classification. With an analysis according to these interpretations, we also find that our proposed method provides approximations to the global optimum of another criterion that integrates local/global contrast and large-area saliency cues. The proposed approach achieves mostly leading performance compared to the state-of-the-art algorithms over a large set of salient object detection datasets including around 17k images, for several evaluation metrics. Furthermore, the computational complexity of the proposed method is favorable/comparable to many state-of-the-art techniques.
Tasks Object Detection, Saliency Prediction, Salient Object Detection
Published 2016-09-13
URL http://arxiv.org/abs/1609.03868v2
PDF http://arxiv.org/pdf/1609.03868v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-saliency-estimation
Repo
Framework
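
To make the "closed-form global optimum" concrete, here is a toy quadratic formulation in the same spirit: a graph-smoothness term, a penalty pushing boundary regions toward zero saliency, and a foreground prior pulling other regions toward one. The energy and cues are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def closed_form_saliency(W, boundary, fg_prior, lam=1.0):
    """Minimize  x^T L x + lam * sum_{i in boundary} x_i^2 + sum_i p_i (x_i - 1)^2
    over saliency values x, where L is the Laplacian of the affinity graph W.
    The gradient is linear in x, so the global optimum solves a linear system
    (an illustrative formulation, not the paper's exact energy)."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # graph Laplacian: smoothness cue
    B = lam * np.diag(boundary.astype(float))      # boundary-connectivity penalty
    P = np.diag(fg_prior)                          # foreground/contrast prior
    return np.linalg.solve(L + B + P, P @ np.ones(n))

# Toy example: 5 superpixels on a chain; superpixel 0 touches the image boundary.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
boundary = np.array([1, 0, 0, 0, 0])
fg_prior = np.array([0.1, 0.2, 0.9, 0.9, 0.2])
print(closed_form_saliency(W, boundary, fg_prior))
```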

Product Offerings in Malicious Hacker Markets

Title Product Offerings in Malicious Hacker Markets
Authors Ericsson Marin, Ahmad Diab, Paulo Shakarian
Abstract Marketplaces specializing in malicious hacking products - including malware and exploits - have recently become more prominent on the darkweb and deepweb. We scrape 17 such sites and collect information about such products in a unified database schema. Using a combination of manual labeling and unsupervised clustering, we examine a corpus of products in order to understand their various categories and how they become specialized with respect to vendor and marketplace. This initial study presents how we effectively employed unsupervised techniques to this data as well as the types of insights we gained on various categories of malicious hacking products.
Tasks
Published 2016-07-26
URL http://arxiv.org/abs/1607.07903v1
PDF http://arxiv.org/pdf/1607.07903v1.pdf
PWC https://paperswithcode.com/paper/product-offerings-in-malicious-hacker-markets
Repo
Framework
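
A minimal sketch of the unsupervised part of such a pipeline: vectorize product titles with TF-IDF and group them with k-means. The product strings and cluster count below are made up, and the manual labeling and database schema from the paper are not reproduced.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical product listings (illustrative strings, not real scraped data).
products = [
    "banking trojan source code",
    "zero day exploit for browser",
    "botnet rental 24h",
    "credit card dumps pack",
    "ransomware builder kit",
    "phishing page template bank",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(products)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, text in zip(km.labels_, products):
    print(label, text)
```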

Quantum-enhanced machine learning

Title Quantum-enhanced machine learning
Authors Vedran Dunjko, Jacob M. Taylor, Hans J. Briegel
Abstract The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention. Within our approach, we tackle the problem of quantum enhancements in reinforcement learning as well, and propose a systematic scheme for providing improvements. As an example, we show that quadratic improvements in learning efficiency, and exponential improvements in performance over limited time periods, can be obtained for a broad class of learning problems.
Tasks Quantum Machine Learning
Published 2016-10-26
URL http://arxiv.org/abs/1610.08251v1
PDF http://arxiv.org/pdf/1610.08251v1.pdf
PWC https://paperswithcode.com/paper/quantum-enhanced-machine-learning
Repo
Framework

AutoScaler: Scale-Attention Networks for Visual Correspondence

Title AutoScaler: Scale-Attention Networks for Visual Correspondence
Authors Shenlong Wang, Linjie Luo, Ning Zhang, Jia Li
Abstract Finding visual correspondence between local features is key to many computer vision problems. While defining features with larger contextual scales usually implies greater discriminativeness, it could also lead to less spatial accuracy of the features. We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks. Our network consists of a weight-sharing feature network to compute multi-scale feature maps and an attention network to combine them optimally in the scale space. This allows our network to have adaptive receptive field sizes over different scales of the input. The entire network is trained end-to-end in a Siamese framework for visual correspondence tasks. Our method achieves favorable results compared to state-of-the-art methods on challenging optical flow and semantic matching benchmarks, including Sintel, KITTI and CUB-2011. We also show that our method can generalize to improve hand-crafted descriptors (e.g., Daisy) on general visual correspondence tasks. Finally, our attention network can generate visually interpretable scale attention maps.
Tasks Optical Flow Estimation
Published 2016-11-17
URL http://arxiv.org/abs/1611.05837v1
PDF http://arxiv.org/pdf/1611.05837v1.pdf
PWC https://paperswithcode.com/paper/autoscaler-scale-attention-networks-for
Repo
Framework
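
A rough PyTorch sketch of the scale-attention idea: run a weight-shared feature extractor on several rescaled copies of the input, score each scale per pixel with a small attention head, and combine the feature maps as a softmax-weighted sum over scales. The channel counts, scales, and attention head are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAttention(nn.Module):
    """Weight-shared features over multiple scales, combined by per-pixel attention."""
    def __init__(self, in_ch=3, feat_ch=32, scales=(1.0, 0.75, 0.5)):
        super().__init__()
        self.scales = scales
        self.features = nn.Sequential(              # shared across all scales
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.attention = nn.Conv2d(feat_ch, 1, 1)   # one score per scale per pixel

    def forward(self, x):
        h, w = x.shape[-2:]
        feats, scores = [], []
        for s in self.scales:
            xs = F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
            f = self.features(xs)
            f = F.interpolate(f, size=(h, w), mode="bilinear", align_corners=False)
            feats.append(f)
            scores.append(self.attention(f))
        A = torch.softmax(torch.stack(scores, dim=0), dim=0)   # attention over scales
        return (torch.stack(feats, dim=0) * A).sum(dim=0)      # weighted sum of scales

out = ScaleAttention()(torch.randn(2, 3, 64, 64))
print(out.shape)  # torch.Size([2, 32, 64, 64])
```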

Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series

Title Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series
Authors Hamid Dadkhahi, Marco F. Duarte, Benjamin Marlin
Abstract This paper proposes an out-of-sample extension framework for a global manifold learning algorithm (Isomap) that uses temporal information in out-of-sample points in order to make the embedding more robust to noise and artifacts. Given a set of noise-free training data and its embedding, the proposed framework extends the embedding for a noisy time series. This is achieved by adding a spatio-temporal compactness term to the optimization objective of the embedding. To the best of our knowledge, this is the first method for out-of-sample extension of manifold embeddings that leverages timing information available for the extension set. Experimental results demonstrate that our out-of-sample extension algorithm renders a more robust and accurate embedding of sequentially ordered image data in the presence of various noise and artifacts when compared to other timing-aware embeddings. Additionally, we show that an out-of-sample extension framework based on the proposed algorithm outperforms the state of the art in eye-gaze estimation.
Tasks Dimensionality Reduction, Gaze Estimation, Time Series
Published 2016-06-27
URL http://arxiv.org/abs/1606.08282v3
PDF http://arxiv.org/pdf/1606.08282v3.pdf
PWC https://paperswithcode.com/paper/out-of-sample-extension-for-dimensionality
Repo
Framework
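
A numpy sketch of the general recipe, under simplifying assumptions: place each new noisy point with a standard kernel-weighted out-of-sample rule over the training embedding, then enforce spatio-temporal compactness with a quadratic temporal penalty that has a closed-form solution. This is a stand-in for the paper's Isomap-specific objective, not a reimplementation.

```python
import numpy as np

def out_of_sample_embed(X_train, Y_train, X_new, sigma=1.0, lam=5.0):
    """Embed a noisy time series X_new given clean training data X_train with
    embedding Y_train.  Step 1: kernel-weighted barycentre of training embeddings
    (a simple out-of-sample rule).  Step 2: temporal smoothing by minimizing
    ||Y - Y0||^2 + lam * sum_t ||y_t - y_{t-1}||^2, whose minimizer solves
    (I + lam * L) Y = Y0 with L the chain-graph Laplacian."""
    # Step 1: spatial placement from nearby training points.
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    W /= W.sum(axis=1, keepdims=True)
    Y0 = W @ Y_train

    # Step 2: temporal compactness (closed form via a linear solve).
    T = len(X_new)
    L = np.zeros((T, T))
    for t in range(T - 1):
        L[t, t] += 1; L[t + 1, t + 1] += 1
        L[t, t + 1] -= 1; L[t + 1, t] -= 1
    return np.linalg.solve(np.eye(T) + lam * L, Y0)

# Toy usage: a 1-D circle manifold observed through noisy 3-D points (made up).
rng = np.random.default_rng(0)
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
X_train = np.c_[np.cos(theta), np.sin(theta), np.zeros_like(theta)]
Y_train = theta[:, None]                      # "true" 1-D embedding
X_new = X_train[:50] + 0.05 * rng.normal(size=(50, 3))
print(out_of_sample_embed(X_train, Y_train, X_new)[:5].ravel())
```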

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning

Title Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning
Authors Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann
Abstract Learning video concept detectors automatically from big but noisy web data with no additional manual annotations is a novel but challenging area in the multimedia and machine learning communities. A considerable amount of videos on the web are associated with rich but noisy contextual information, such as the title, which provides weak annotations or labels about the video content. To leverage the big noisy web labels, this paper proposes a novel method called WEbly-Labeled Learning (WELL), which is established on the state-of-the-art machine learning algorithm inspired by the learning process of humans. WELL introduces a number of novel multi-modal approaches to incorporate meaningful prior knowledge, called curriculum, from the noisy web videos. To investigate this problem, we empirically study the curriculum constructed from the multi-modal features of the videos collected from YouTube and Flickr. The efficacy and the scalability of WELL have been extensively demonstrated on two public benchmarks, including the largest multimedia dataset and the largest manually-labeled video set. The comprehensive experimental results demonstrate that WELL outperforms state-of-the-art studies by a statistically significant margin on learning concepts from noisy web video data. In addition, the results also verify that WELL is robust to the level of noisiness in the video data. Notably, WELL trained on sufficient noisy web labels is able to achieve a comparable accuracy to supervised learning methods trained on the clean manually-labeled data.
Tasks
Published 2016-07-16
URL http://arxiv.org/abs/1607.04780v1
PDF http://arxiv.org/pdf/1607.04780v1.pdf
PWC https://paperswithcode.com/paper/exploiting-multi-modal-curriculum-in-noisy
Repo
Framework
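
A toy sketch of the self-paced/curriculum flavour described above: train on the samples the current model is most confident about, then gradually admit harder, noisier samples. The classifier, schedule, and synthetic noisy labels are assumptions; the multi-modal curriculum built from web metadata is not reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy noisy-web-label data: 2-D features, 20% of the labels flipped (made up).
X = rng.normal(size=(1000, 2))
y_clean = (X[:, 0] + X[:, 1] > 0).astype(int)
y = y_clean.copy()
flip = rng.random(1000) < 0.2
y[flip] = 1 - y[flip]

clf = LogisticRegression()
selected = rng.choice(1000, size=300, replace=False)    # initial random "easy" pool
for frac in [0.3, 0.5, 0.7, 1.0]:                       # curriculum schedule (assumed)
    clf.fit(X[selected], y[selected])
    # Re-score all samples by the model's confidence in their (possibly noisy) label,
    # then admit the most confident fraction into the next curriculum stage.
    proba = clf.predict_proba(X)[np.arange(1000), y]
    selected = np.argsort(-proba)[: int(frac * 1000)]
clf.fit(X[selected], y[selected])                       # final fit on the full noisy set
print("accuracy vs clean labels:", (clf.predict(X) == y_clean).mean())
```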

Low-rank Bandits with Latent Mixtures

Title Low-rank Bandits with Latent Mixtures
Authors Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki
Abstract We study the task of maximizing rewards from recommending items (actions) to users sequentially interacting with a recommender system. Users are modeled as latent mixtures of $C$ many representative user classes, where each class specifies a mean reward profile across actions. Both the user features (mixture distribution over classes) and the item features (mean reward vector per class) are unknown a priori. The user identity is the only contextual information available to the learner while interacting. This induces a low-rank structure on the matrix of expected rewards $r_{a,b}$ from recommending item $a$ to user $b$. The problem reduces to the well-known linear bandit when either user or item-side features are perfectly known. In the setting where each user, with its stochastically sampled taste profile, interacts only for a small number of sessions, we develop a bandit algorithm for the two-sided uncertainty. It combines the Robust Tensor Power Method of Anandkumar et al. (2014b) with the OFUL linear bandit algorithm of Abbasi-Yadkori et al. (2011). We provide the first rigorous regret analysis of this combination, showing that its regret after $T$ user interactions is $\tilde O(C\sqrt{BT})$, with $B$ the number of users. An ingredient towards this result is a novel robustness property of OFUL, of independent interest.
Tasks Recommendation Systems
Published 2016-09-06
URL http://arxiv.org/abs/1609.01508v1
PDF http://arxiv.org/pdf/1609.01508v1.pdf
PWC https://paperswithcode.com/paper/low-rank-bandits-with-latent-mixtures
Repo
Framework
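
A compact sketch of the OFUL-style component mentioned in the abstract: a ridge estimate of the unknown parameter plus an optimism bonus proportional to the $V_t^{-1}$-norm of each arm. The tensor-based estimation of user mixtures is not shown, and the confidence radius is held at a fixed constant instead of the theoretical $\beta_t$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, T = 5, 20, 2000
theta_star = rng.normal(size=d)
theta_star /= np.linalg.norm(theta_star)
arms = rng.normal(size=(n_arms, d))

lam, beta = 1.0, 1.0                      # ridge parameter and confidence radius (assumed constant)
V = lam * np.eye(d)                       # design matrix  V_t = lam * I + sum x x^T
b = np.zeros(d)
regret = 0.0
best = (arms @ theta_star).max()
for t in range(T):
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ b                 # ridge estimate
    # Optimistic index: <x, theta_hat> + beta * ||x||_{V^-1}
    ucb = arms @ theta_hat + beta * np.sqrt(np.einsum("ad,dc,ac->a", arms, V_inv, arms))
    a = int(np.argmax(ucb))
    reward = arms[a] @ theta_star + 0.1 * rng.normal()
    V += np.outer(arms[a], arms[a])
    b += reward * arms[a]
    regret += best - arms[a] @ theta_star

print("cumulative regret:", round(regret, 2))
```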

Real-Time Human Motion Capture with Multiple Depth Cameras

Title Real-Time Human Motion Capture with Multiple Depth Cameras
Authors Alireza Shafaei, James J. Little
Abstract Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike the previous work on 3d pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real-time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley MHAD dataset.
Tasks 3D Pose Estimation, Markerless Motion Capture, Motion Capture, Pose Estimation, Semantic Segmentation
Published 2016-05-25
URL http://arxiv.org/abs/1605.08068v1
PDF http://arxiv.org/pdf/1605.08068v1.pdf
PWC https://paperswithcode.com/paper/real-time-human-motion-capture-with-multiple
Repo
Framework
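
As a small illustration of the final step ("combining evidence from multiple views"), the sketch below fuses per-camera 3-D joint estimates, already expressed in a common world frame, by confidence-weighted averaging. This fusion rule and the toy data are assumptions, not the paper's procedure.

```python
import numpy as np

def fuse_joint_estimates(estimates, confidences):
    """Combine per-camera 3-D joint estimates (already in a shared world frame)
    into a single skeleton by confidence-weighted averaging.
    estimates:   (n_views, n_joints, 3) array
    confidences: (n_views, n_joints)    array
    Illustrative fusion rule only, not the paper's exact procedure."""
    w = confidences[..., None]                       # (views, joints, 1)
    return (w * estimates).sum(axis=0) / np.clip(w.sum(axis=0), 1e-8, None)

# Toy example: 3 depth cameras, 15 joints, noisy views of the same pose (made up).
rng = np.random.default_rng(0)
true_pose = rng.normal(size=(15, 3))
views = true_pose[None] + 0.02 * rng.normal(size=(3, 15, 3))
conf = rng.uniform(0.5, 1.0, size=(3, 15))
fused = fuse_joint_estimates(views, conf)
print(np.abs(fused - true_pose).mean())
```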

Diverse Sampling for Self-Supervised Learning of Semantic Segmentation

Title Diverse Sampling for Self-Supervised Learning of Semantic Segmentation
Authors Mohammadreza Mostajabi, Nicholas Kolkin, Gregory Shakhnarovich
Abstract We propose an approach for learning category-level semantic segmentation purely from image-level classification tags indicating the presence of categories. It exploits localization cues that emerge from training classification-tasked convolutional networks to drive a “self-supervision” process that automatically labels a sparse, diverse training set of points likely to belong to classes of interest. Our approach has almost no hyperparameters, is modular, and allows for very fast training of segmentation in less than 3 minutes. It obtains competitive results on the VOC 2012 segmentation benchmark. More significantly, the modularity and fast training of our framework allow new classes to be efficiently added for inference.
Tasks Semantic Segmentation
Published 2016-12-06
URL http://arxiv.org/abs/1612.01991v1
PDF http://arxiv.org/pdf/1612.01991v1.pdf
PWC https://paperswithcode.com/paper/diverse-sampling-for-self-supervised-learning
Repo
Framework
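
A rough sketch of the "sparse, diverse training set of points" idea: given a per-class localization score map from a classification network (a CAM-like heatmap is assumed here), greedily pick a few high-scoring pixels that are mutually far apart to serve as point supervision. The threshold and minimum distance are made-up numbers, and the surrounding self-supervision pipeline is not shown.

```python
import numpy as np

def diverse_points(score_map, n_points=5, min_dist=10, threshold=0.5):
    """Greedily pick up to n_points high-scoring pixels that are at least
    min_dist apart (an illustrative diversity heuristic, not the paper's exact rule)."""
    h, w = score_map.shape
    ys, xs = np.unravel_index(np.argsort(score_map, axis=None)[::-1], (h, w))
    chosen = []
    for y, x in zip(ys, xs):
        if score_map[y, x] < threshold or len(chosen) == n_points:
            break
        if all((y - cy) ** 2 + (x - cx) ** 2 >= min_dist ** 2 for cy, cx in chosen):
            chosen.append((y, x))
    return chosen

# Toy "localization heatmap" with two blobs (made up).
yy, xx = np.mgrid[0:64, 0:64]
heat = np.exp(-((yy - 20) ** 2 + (xx - 20) ** 2) / 50.0) + \
       np.exp(-((yy - 45) ** 2 + (xx - 50) ** 2) / 50.0)
print(diverse_points(heat))
```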

Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction

Title Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
Authors Allen Schmaltz, Yoon Kim, Alexander M. Rush, Stuart M. Shieber
Abstract We demonstrate that an attention-based encoder-decoder model can be used for sentence-level grammatical error identification for the Automated Evaluation of Scientific Writing (AESW) Shared Task 2016. The attention-based encoder-decoder models can be used for the generation of corrections, in addition to error identification, which is of interest for certain end-user applications. We show that a character-based encoder-decoder model is particularly effective, outperforming other results on the AESW Shared Task on its own, and showing gains over a word-based counterpart. Our final model (a combination of three character-based encoder-decoder models, one word-based encoder-decoder model, and a sentence-level CNN) is the highest-performing system on the AESW 2016 binary prediction Shared Task.
Tasks
Published 2016-04-16
URL http://arxiv.org/abs/1604.04677v1
PDF http://arxiv.org/pdf/1604.04677v1.pdf
PWC https://paperswithcode.com/paper/sentence-level-grammatical-error
Repo
Framework
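
A minimal PyTorch sketch of one of the ensemble components, a character-level CNN that produces a sentence-level binary error score. The vocabulary handling, dimensions, and absence of a training loop are placeholder simplifications; the attention-based encoder-decoder models are not reproduced here.

```python
import torch
import torch.nn as nn

class CharCNNClassifier(nn.Module):
    """Character-level CNN for sentence-level binary prediction (has-error / no-error)."""
    def __init__(self, n_chars=128, emb=16, channels=64, kernel=5):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.conv = nn.Conv1d(emb, channels, kernel, padding=kernel // 2)
        self.out = nn.Linear(channels, 1)

    def forward(self, char_ids):                        # char_ids: (batch, seq_len)
        x = self.embed(char_ids).transpose(1, 2)        # (batch, emb, seq_len)
        x = torch.relu(self.conv(x)).max(dim=2).values  # max-pool over time
        return self.out(x).squeeze(-1)                  # one logit per sentence

def encode(sentence, max_len=120):
    ids = [min(ord(c), 127) for c in sentence[:max_len]]
    return torch.tensor(ids + [0] * (max_len - len(ids)))

model = CharCNNClassifier()
batch = torch.stack([encode("This sentence have an error ."), encode("This sentence is fine .")])
print(torch.sigmoid(model(batch)))   # untrained scores, illustration only
```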

Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions

Title Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions
Authors Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon
Abstract In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the nodes of a graph defining the communication topology of the network. This general problem finds applications in ranking, distance metric learning and graph inference, among others. We propose new gossip algorithms based on dual averaging which aim at solving such problems both in synchronous and asynchronous settings. The proposed framework is flexible enough to deal with constrained and regularized variants of the optimization problem. Our theoretical analysis reveals that the proposed algorithms preserve the convergence rate of centralized dual averaging up to an additive bias term. We present numerical simulations on Area Under the ROC Curve (AUC) maximization and metric learning problems which illustrate the practical interest of our approach.
Tasks Metric Learning
Published 2016-06-08
URL http://arxiv.org/abs/1606.02421v1
PDF http://arxiv.org/pdf/1606.02421v1.pdf
PWC https://paperswithcode.com/paper/gossip-dual-averaging-for-decentralized
Repo
Framework
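
A toy synchronous sketch of gossip dual averaging on a ring of nodes, each holding a local least-squares dataset: every node accumulates its local gradient in a dual variable, averages that dual variable with a random neighbour, and maps back to the primal with a $1/\sqrt{t}$ scaling. The pairwise-objective and data-propagation machinery of the paper are not reproduced; the graph, loss, and step sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d, T = 10, 3, 500
w_star = rng.normal(size=d)
# Each node holds a small local least-squares dataset (made up).
A = [rng.normal(size=(20, d)) for _ in range(n_nodes)]
b = [A_i @ w_star + 0.1 * rng.normal(size=20) for A_i in A]

Z = np.zeros((n_nodes, d))                 # dual variables (accumulated gradients)
X = np.zeros((n_nodes, d))                 # primal iterates
neighbors = [((i - 1) % n_nodes, (i + 1) % n_nodes) for i in range(n_nodes)]  # ring graph

for t in range(1, T + 1):
    # Local gradient step on the dual variable.
    for i in range(n_nodes):
        grad = A[i].T @ (A[i] @ X[i] - b[i]) / len(b[i])
        Z[i] += grad
    # Gossip: each node averages its dual variable with a random neighbour.
    for i in range(n_nodes):
        j = neighbors[i][rng.integers(2)]
        avg = 0.5 * (Z[i] + Z[j])
        Z[i] = Z[j] = avg
    # Dual-averaging primal map with a 1/sqrt(t) scaling.
    X = -(1.0 / np.sqrt(t)) * Z

print("node 0 error:", np.linalg.norm(X[0] - w_star))
```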