Paper Group ANR 185
Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task. The Asymptotic Performance of Linear Echo State Neural Networks. Fast DPP Sampling for Nyström with Application to Kernel Methods. Practical Riemannian Neural Networks. Probabilistic Saliency Estimation. Product Offerings in Malicious Hacker Markets. Qua …
Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task
Title | Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task |
Authors | Ashkan Mokarian, Mateusz Malinowski, Mario Fritz |
Abstract | We present Mean Box Pooling, a novel visual representation that pools over CNN representations of a large number of highly overlapping object proposals. We show that such a representation, together with nCCA, a successful multimodal embedding technique, achieves state-of-the-art performance on the Visual Madlibs task. Moreover, inspired by nCCA's objective function, we extend the classical CNN+LSTM approach to train the network by directly maximizing the similarity between the internal representation of the deep learning architecture and candidate answers. Again, this approach achieves a significant improvement over prior work that also uses a CNN+LSTM approach on Visual Madlibs. |
Tasks | |
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02717v1 |
http://arxiv.org/pdf/1608.02717v1.pdf | |
PWC | https://paperswithcode.com/paper/mean-box-pooling-a-rich-image-representation |
Repo | |
Framework | |
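A minimal sketch of the pooling step described above, assuming per-proposal CNN features have already been extracted (the proposal generator, the CNN, and the nCCA embedding are out of scope here); the proposal count and feature dimension are illustrative.

```python
import numpy as np

def mean_box_pooling(proposal_features):
    """Average CNN features over a set of (possibly overlapping) object proposals.

    proposal_features: (num_proposals, feature_dim) array, one CNN feature
    vector per proposal box. Returns one image-level descriptor.
    """
    return proposal_features.mean(axis=0)

# Toy usage: 500 hypothetical proposals with 4096-d fc7-style features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 4096)).astype(np.float32)
print(mean_box_pooling(feats).shape)   # (4096,)
```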
The Asymptotic Performance of Linear Echo State Neural Networks
Title | The Asymptotic Performance of Linear Echo State Neural Networks |
Authors | Romain Couillet, Gilles Wainrib, Harry Sevi, Hafiz Tiomoko Ali |
Abstract | In this article, a study of the mean-square error (MSE) performance of linear echo-state neural networks is performed, both for training and testing tasks. Considering the realistic setting of noise present at the network nodes, we derive deterministic equivalents for the aforementioned MSE in the limit where the number of input data $T$ and network size $n$ both grow large. Specializing then the network connectivity matrix to specific random settings, we further obtain simple formulas that provide new insights on the performance of such networks. |
Tasks | |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07866v1 |
http://arxiv.org/pdf/1603.07866v1.pdf | |
PWC | https://paperswithcode.com/paper/the-asymptotic-performance-of-linear-echo |
Repo | |
Framework | |
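To make the setting concrete, here is a hedged numpy sketch of a linear echo state network with noise at the nodes and a ridge-regression readout; the reservoir size, spectral radius, noise level, and input signal are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 200, 1000           # reservoir size, number of time steps
rho = 0.9                  # target spectral radius (< 1 for stability)

# Random reservoir matrix rescaled to spectral radius rho, random input weights.
W = rng.standard_normal((n, n)) / np.sqrt(n)
W *= rho / max(abs(np.linalg.eigvals(W)))
m = rng.standard_normal(n)

u = np.sin(0.1 * np.arange(T + 1))           # scalar input sequence
y = u[1:]                                    # one-step-ahead prediction target
sigma = 0.01                                 # std of the noise at the network nodes

# Linear state update with in-network noise: x_{t+1} = W x_t + m u_{t+1} + noise.
X = np.zeros((T, n))
x = np.zeros(n)
for t in range(T):
    x = W @ x + m * u[t + 1] + sigma * rng.standard_normal(n)
    X[t] = x

# Ridge-regression readout; training MSE as a crude performance measure.
lam = 1e-3
w_out = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
print("train MSE:", np.mean((X @ w_out - y) ** 2))
```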
Fast DPP Sampling for Nyström with Application to Kernel Methods
Title | Fast DPP Sampling for Nyström with Application to Kernel Methods |
Authors | Chengtao Li, Stefanie Jegelka, Suvrit Sra |
Abstract | The Nyström method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nyström using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to the cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches. |
Tasks | Point Processes |
Published | 2016-03-19 |
URL | http://arxiv.org/abs/1603.06052v2 |
http://arxiv.org/pdf/1603.06052v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-dpp-sampling-for-nystrom-with |
Repo | |
Framework | |
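A simplified sketch of the pipeline the abstract describes: a Metropolis-style Markov chain k-DPP sampler over landmark sets (single-element swaps accepted by determinant ratios) followed by a standard Nyström reconstruction. This is not the accelerated sampler analyzed in the paper; the kernel, chain length, and landmark count are illustrative.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mcmc_kdpp(L, k, n_steps=2000, rng=None):
    """Metropolis k-DPP sampler with single-element swap moves.

    L: (n, n) positive-definite kernel; returns k landmark indices.
    A simplified chain, not the accelerated sampler from the paper.
    """
    if rng is None:
        rng = np.random.default_rng()
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))
    _, logdet_S = np.linalg.slogdet(L[np.ix_(S, S)])
    for _ in range(n_steps):
        i, j = rng.integers(k), rng.integers(n)    # propose swapping S[i] out, j in
        if j in S:
            continue
        S_new = S.copy()
        S_new[i] = j
        sign, logdet_new = np.linalg.slogdet(L[np.ix_(S_new, S_new)])
        if sign > 0 and np.log(rng.random()) < logdet_new - logdet_S:
            S, logdet_S = S_new, logdet_new        # accept with determinant ratio
    return np.array(S)

# Nyström approximation of the kernel matrix from the DPP-selected landmarks.
rng = np.random.default_rng(2)
X = rng.standard_normal((300, 5))
K = rbf_kernel(X, gamma=0.5)
idx = mcmc_kdpp(K, k=20, rng=rng)
C, W = K[:, idx], K[np.ix_(idx, idx)]
K_hat = C @ np.linalg.pinv(W) @ C.T
print("relative Frobenius error:", np.linalg.norm(K - K_hat) / np.linalg.norm(K))
```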
Practical Riemannian Neural Networks
Title | Practical Riemannian Neural Networks |
Authors | Gaétan Marceau-Caron, Yann Ollivier |
Abstract | We provide the first experimental results on non-synthetic datasets for the quasi-diagonal Riemannian gradient descents for neural networks introduced in [Ollivier, 2015]. These include the MNIST, SVHN, and FACE datasets as well as a previously unpublished electroencephalogram dataset. The quasi-diagonal Riemannian algorithms consistently beat simple stochastic gradient descent by a varying margin. The computational overhead with respect to simple backpropagation is around a factor of $2$. Perhaps more interestingly, these methods also reach their final performance quickly, thus requiring fewer training epochs and a smaller total computation time. We also present an implementation guide to these Riemannian gradient descents for neural networks, showing how the quasi-diagonal versions can be implemented with minimal effort on top of existing routines that compute gradients. |
Tasks | |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.08007v1 |
http://arxiv.org/pdf/1602.08007v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-riemannian-neural-networks |
Repo | |
Framework | |
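As a rough illustration of "minimal effort on top of existing gradient routines", the sketch below rescales each parameter's gradient by a running estimate of its squared per-sample gradient, i.e. a diagonal outer-product metric. This is a simplification of the quasi-diagonal update in the paper (which also keeps the bias row of the metric); the model, data, and constants are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((1000, 20))
w_true = rng.standard_normal(20)
y = (X @ w_true + 0.1 * rng.standard_normal(1000) > 0).astype(float)

w = np.zeros(20)
metric = np.ones(20)              # running diagonal metric (E[grad^2] per parameter)
lr, decay, eps = 0.1, 0.99, 1e-8

for step in range(300):
    idx = rng.choice(1000, size=32, replace=False)
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))                 # logistic predictions
    per_sample = X[idx] * (p - y[idx])[:, None]           # per-sample gradients
    grad = per_sample.mean(axis=0)
    metric = decay * metric + (1 - decay) * (per_sample ** 2).mean(axis=0)
    w -= lr * grad / (metric + eps)                       # metric-rescaled step

print("train accuracy:", np.mean(((X @ w) > 0) == (y > 0.5)))
```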
Probabilistic Saliency Estimation
Title | Probabilistic Saliency Estimation |
Authors | Caglar Aytekin, Alexandros Iosifidis, Moncef Gabbouj |
Abstract | In this paper, we model the salient object detection problem under a probabilistic framework encoding the boundary-connectivity saliency cue and smoothness constraints in an optimization problem. We show that this problem has a closed-form global optimum which estimates the salient object. We further show that, along with the probabilistic framework, the proposed method also enjoys a wide range of interpretations, i.e., graph cut, diffusion maps, and one-class classification. With an analysis according to these interpretations, we also find that our proposed method provides approximations to the global optimum of another criterion that integrates local/global contrast and large-area saliency cues. Compared with state-of-the-art algorithms, the proposed approach achieves leading performance on most of a large set of salient object detection datasets (around 17k images) across several evaluation metrics. Furthermore, the computational complexity of the proposed method is favorable or comparable to many state-of-the-art techniques. |
Tasks | Object Detection, Saliency Prediction, Salient Object Detection |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03868v2 |
http://arxiv.org/pdf/1609.03868v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-saliency-estimation |
Repo | |
Framework | |
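A generic stand-in for the closed-form optimum mentioned in the abstract: a convex quadratic over region saliencies with a graph-Laplacian smoothness term and a unary prior (playing the role of the boundary-connectivity cue), minimized by solving one linear system. The affinity graph and prior below are toy values, and the objective is not the paper's exact formulation.

```python
import numpy as np

def closed_form_saliency(W, prior, lam=1.0):
    """Global minimizer of s^T (D - W) s + lam * ||s - prior||^2.

    W: (n, n) symmetric non-negative affinity between regions (superpixels).
    prior: (n,) unary cue, e.g. one minus a boundary-connectivity score.
    """
    L = np.diag(W.sum(axis=1)) - W                 # graph Laplacian (smoothness)
    return np.linalg.solve(L + lam * np.eye(len(prior)), lam * prior)

# Toy example: a chain of 6 regions, the last two touching the image boundary.
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0
prior = np.array([1.0, 1.0, 0.9, 0.5, 0.1, 0.0])   # low prior near the boundary
print(np.round(closed_form_saliency(W, prior, lam=2.0), 3))
```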
Product Offerings in Malicious Hacker Markets
Title | Product Offerings in Malicious Hacker Markets |
Authors | Ericsson Marin, Ahmad Diab, Paulo Shakarian |
Abstract | Marketplaces specializing in malicious hacking products - including malware and exploits - have recently become more prominent on the darkweb and deepweb. We scrape 17 such sites and collect information about their products in a unified database schema. Using a combination of manual labeling and unsupervised clustering, we examine a corpus of products in order to understand their various categories and how they become specialized with respect to vendor and marketplace. This initial study presents how we effectively applied unsupervised techniques to this data, as well as the types of insights we gained on various categories of malicious hacking products. |
Tasks | |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07903v1 |
http://arxiv.org/pdf/1607.07903v1.pdf | |
PWC | https://paperswithcode.com/paper/product-offerings-in-malicious-hacker-markets |
Repo | |
Framework | |
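A hedged sketch of the unsupervised step, assuming scikit-learn is available: TF-IDF descriptors of product titles clustered with k-means. The listings below are invented stand-ins for the scraped corpus, and the paper's actual feature set and clustering method may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy stand-in for scraped product listings (the real corpus is not reproduced here).
products = [
    "remote access trojan windows builder",
    "android banking trojan source code",
    "0day exploit kit for browsers",
    "office exploit builder doc pdf",
    "bulletproof hosting monthly plan",
    "offshore vps hosting anonymous",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(products)                 # sparse TF-IDF product descriptors

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, text in zip(km.labels_, products):
    print(label, text)
```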
Quantum-enhanced machine learning
Title | Quantum-enhanced machine learning |
Authors | Vedran Dunjko, Jacob M. Taylor, Hans J. Briegel |
Abstract | The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention. Within our approach, we tackle the problem of quantum enhancements in reinforcement learning as well, and propose a systematic scheme for providing improvements. As an example, we show that quadratic improvements in learning efficiency, and exponential improvements in performance over limited time periods, can be obtained for a broad class of learning problems. |
Tasks | Quantum Machine Learning |
Published | 2016-10-26 |
URL | http://arxiv.org/abs/1610.08251v1 |
http://arxiv.org/pdf/1610.08251v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-enhanced-machine-learning |
Repo | |
Framework | |
AutoScaler: Scale-Attention Networks for Visual Correspondence
Title | AutoScaler: Scale-Attention Networks for Visual Correspondence |
Authors | Shenlong Wang, Linjie Luo, Ning Zhang, Jia Li |
Abstract | Finding visual correspondence between local features is key to many computer vision problems. While defining features with larger contextual scales usually implies greater discriminativeness, it could also lead to less spatial accuracy of the features. We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks. Our network consists of a weight-sharing feature network to compute multi-scale feature maps and an attention network to combine them optimally in the scale space. This allows our network to have adaptive receptive field sizes over different scales of the input. The entire network is trained end-to-end in a siamese framework for visual correspondence tasks. Our method achieves favorable results compared to state-of-the-art methods on challenging optical flow and semantic matching benchmarks, including Sintel, KITTI and CUB-2011. We also show that our method can generalize to improve hand-crafted descriptors (e.g., Daisy) on general visual correspondence tasks. Finally, our attention network can generate visually interpretable scale attention maps. |
Tasks | Optical Flow Estimation |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05837v1 |
http://arxiv.org/pdf/1611.05837v1.pdf | |
PWC | https://paperswithcode.com/paper/autoscaler-scale-attention-networks-for |
Repo | |
Framework | |
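A PyTorch sketch of the overall idea: a weight-shared backbone computes feature maps at several input scales, and a small attention head predicts per-pixel weights that fuse them in scale space. Layer sizes, scales, and the attention head are illustrative, not the architecture from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAttentionFeatures(nn.Module):
    """Weight-shared multi-scale features fused by a per-pixel scale attention."""

    def __init__(self, scales=(1.0, 0.5, 0.25), dim=32):
        super().__init__()
        self.scales = scales
        self.backbone = nn.Sequential(              # shared weights across scales
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.attention = nn.Conv2d(dim * len(scales), len(scales), 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = []
        for s in self.scales:
            xs = x if s == 1.0 else F.interpolate(
                x, scale_factor=s, mode="bilinear", align_corners=False)
            f = self.backbone(xs)                   # same weights at every scale
            feats.append(F.interpolate(f, size=(h, w), mode="bilinear",
                                       align_corners=False))
        stacked = torch.stack(feats, dim=1)                 # (B, S, C, H, W)
        att = self.attention(torch.cat(feats, dim=1))       # (B, S, H, W)
        att = F.softmax(att, dim=1).unsqueeze(2)            # weights over scales
        return (stacked * att).sum(dim=1)                   # fused (B, C, H, W)

x = torch.randn(1, 3, 64, 64)
print(ScaleAttentionFeatures()(x).shape)   # torch.Size([1, 32, 64, 64])
```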
Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series
Title | Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series |
Authors | Hamid Dadkhahi, Marco F. Duarte, Benjamin Marlin |
Abstract | This paper proposes an out-of-sample extension framework for a global manifold learning algorithm (Isomap) that uses temporal information in out-of-sample points in order to make the embedding more robust to noise and artifacts. Given a set of noise-free training data and its embedding, the proposed framework extends the embedding for a noisy time series. This is achieved by adding a spatio-temporal compactness term to the optimization objective of the embedding. To the best of our knowledge, this is the first method for out-of-sample extension of manifold embeddings that leverages timing information available for the extension set. Experimental results demonstrate that our out-of-sample extension algorithm renders a more robust and accurate embedding of sequentially ordered image data in the presence of various noise and artifacts when compared to other timing-aware embeddings. Additionally, we show that an out-of-sample extension framework based on the proposed algorithm outperforms the state of the art in eye-gaze estimation. |
Tasks | Dimensionality Reduction, Gaze Estimation, Time Series |
Published | 2016-06-27 |
URL | http://arxiv.org/abs/1606.08282v3 |
http://arxiv.org/pdf/1606.08282v3.pdf | |
PWC | https://paperswithcode.com/paper/out-of-sample-extension-for-dimensionality |
Repo | |
Framework | |
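A hedged sketch of a timing-aware out-of-sample extension in the spirit of the abstract: each noisy frame gets a naive per-frame embedding (the mean of its nearest training embeddings), and the sequence is then smoothed by a temporal-compactness penalty whose optimum solves one linear system. This is a simplification, not the paper's Isomap-specific objective; sizes and weights are illustrative.

```python
import numpy as np

def temporal_oos_extension(train_X, train_Y, test_X, mu=5.0, k=5):
    """Embed a noisy time series given a clean training set and its embedding.

    Minimizes ||Y - Y_naive||^2 + mu * sum_t ||y_t - y_{t-1}||^2, where Y_naive
    is a naive per-frame extension (mean embedding of the k nearest training points).
    """
    T = len(test_X)
    # Naive per-frame extension via k nearest neighbours in the input space.
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(-1)
    nn_idx = np.argsort(d2, axis=1)[:, :k]
    Y_naive = train_Y[nn_idx].mean(axis=1)
    # Path-graph Laplacian encodes the temporal compactness term.
    L = np.diag(np.r_[1.0, 2.0 * np.ones(T - 2), 1.0]) - np.eye(T, k=1) - np.eye(T, k=-1)
    return np.linalg.solve(np.eye(T) + mu * L, Y_naive)

rng = np.random.default_rng(4)
train_X = rng.standard_normal((200, 10))
train_Y = train_X[:, :2]                      # pretend 2-d embedding of the clean data
test_X = train_X[:50] + 0.3 * rng.standard_normal((50, 10))
print(temporal_oos_extension(train_X, train_Y, test_X).shape)   # (50, 2)
```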
Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning
Title | Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning |
Authors | Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann |
Abstract | Learning video concept detectors automatically from big but noisy web data, with no additional manual annotations, is a novel and challenging area for the multimedia and machine learning communities. A considerable amount of videos on the web are associated with rich but noisy contextual information, such as the title, which provides weak annotations or labels about the video content. To leverage the big noisy web labels, this paper proposes a novel method called WEbly-Labeled Learning (WELL), which builds on state-of-the-art machine learning algorithms inspired by the human learning process. WELL introduces a number of novel multi-modal approaches to incorporate meaningful prior knowledge, called curriculum, from the noisy web videos. To investigate this problem, we empirically study the curriculum constructed from the multi-modal features of videos collected from YouTube and Flickr. The efficacy and scalability of WELL have been extensively demonstrated on two public benchmarks, including the largest multimedia dataset and the largest manually-labeled video set. The comprehensive experimental results demonstrate that WELL outperforms state-of-the-art studies by a statistically significant margin when learning concepts from noisy web video data. In addition, the results also verify that WELL is robust to the level of noisiness in the video data. Notably, WELL trained on sufficient noisy web labels is able to achieve accuracy comparable to supervised learning methods trained on clean, manually-labeled data. |
Tasks | |
Published | 2016-07-16 |
URL | http://arxiv.org/abs/1607.04780v1 |
http://arxiv.org/pdf/1607.04780v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-multi-modal-curriculum-in-noisy |
Repo | |
Framework | |
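A toy self-paced curriculum loop in the spirit of WELL: train on noisily labeled data, but in each round fit only the samples whose current loss is small ("easy" examples) and gradually admit harder ones. The multi-modal curriculum construction of the paper is omitted; the data, model, and pacing schedule are made up.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((2000, 30))
w_true = rng.standard_normal(30)
y = (X @ w_true > 0).astype(float)
flip = rng.random(2000) < 0.25               # 25% noisy web-style labels
y_noisy = np.where(flip, 1 - y, y)

w = np.zeros(30)

for epoch in range(30):
    frac = min(1.0, 0.3 + 0.025 * epoch)     # curriculum: admit more data over time
    p = 1.0 / (1.0 + np.exp(-X @ w))
    loss = -(y_noisy * np.log(p + 1e-9) + (1 - y_noisy) * np.log(1 - p + 1e-9))
    keep = loss <= np.quantile(loss, frac)   # currently "easy" samples only
    grad = X[keep].T @ (p[keep] - y_noisy[keep]) / keep.sum()
    w -= 0.5 * grad

print("accuracy vs. clean labels:", np.mean(((X @ w) > 0) == (y > 0.5)))
```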
Low-rank Bandits with Latent Mixtures
Title | Low-rank Bandits with Latent Mixtures |
Authors | Aditya Gopalan, Odalric-Ambrym Maillard, Mohammadi Zaki |
Abstract | We study the task of maximizing rewards from recommending items (actions) to users sequentially interacting with a recommender system. Users are modeled as latent mixtures over $C$ representative user classes, where each class specifies a mean reward profile across actions. Both the user features (mixture distribution over classes) and the item features (mean reward vector per class) are unknown a priori. The user identity is the only contextual information available to the learner while interacting. This induces a low-rank structure on the matrix of expected rewards $r_{a,b}$ from recommending item $a$ to user $b$. The problem reduces to the well-known linear bandit when either user- or item-side features are perfectly known. In the setting where each user, with its stochastically sampled taste profile, interacts only for a small number of sessions, we develop a bandit algorithm for the two-sided uncertainty. It combines the Robust Tensor Power Method of Anandkumar et al. (2014b) with the OFUL linear bandit algorithm of Abbasi-Yadkori et al. (2011). We provide the first rigorous regret analysis of this combination, showing that its regret after $T$ user interactions is $\tilde O(C\sqrt{BT})$, with $B$ the number of users. An ingredient towards this result is a novel robustness property of OFUL, of independent interest. |
Tasks | Recommendation Systems |
Published | 2016-09-06 |
URL | http://arxiv.org/abs/1609.01508v1 |
http://arxiv.org/pdf/1609.01508v1.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-bandits-with-latent-mixtures |
Repo | |
Framework | |
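A sketch of the OFUL / LinUCB-style component the abstract refers to: a ridge estimate of the unknown reward parameter plus an optimistic confidence bonus. The tensor-method estimation of the latent user classes is omitted, and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n_arms, T = 5, 20, 3000
theta_star = rng.standard_normal(d) / np.sqrt(d)     # unknown reward parameter
arms = rng.standard_normal((n_arms, d)) / np.sqrt(d) # fixed item feature vectors

lam, beta = 1.0, 1.0
A = lam * np.eye(d)                # regularized design matrix
b = np.zeros(d)
regret = 0.0
best = (arms @ theta_star).max()

for t in range(T):
    theta_hat = np.linalg.solve(A, b)                 # ridge estimate
    A_inv = np.linalg.inv(A)
    # Optimism: estimated reward plus a confidence-ellipsoid bonus per arm.
    ucb = arms @ theta_hat + beta * np.sqrt(np.einsum("ad,de,ae->a", arms, A_inv, arms))
    a = int(np.argmax(ucb))
    reward = arms[a] @ theta_star + 0.1 * rng.standard_normal()
    A += np.outer(arms[a], arms[a])
    b += reward * arms[a]
    regret += best - arms[a] @ theta_star

print("cumulative regret:", round(regret, 2))
```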
Real-Time Human Motion Capture with Multiple Depth Cameras
Title | Real-Time Human Motion Capture with Multiple Depth Cameras |
Authors | Alireza Shafaei, James J. Little |
Abstract | Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike the previous work on 3d pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real-time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley MHAD dataset. |
Tasks | 3D Pose Estimation, Markerless Motion Capture, Motion Capture, Pose Estimation, Semantic Segmentation |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.08068v1 |
http://arxiv.org/pdf/1605.08068v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-human-motion-capture-with-multiple |
Repo | |
Framework | |
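A minimal sketch of the multi-view fusion step: each depth camera contributes a 3D joint proposal in its own frame plus a confidence, proposals are mapped to a common world frame with known extrinsics, and the result is a confidence-weighted average. The calibration values and confidences are made up; the part-segmentation network itself is out of scope.

```python
import numpy as np

def fuse_joint(estimates):
    """estimates: list of (R, t, joint_xyz_in_camera_frame, confidence) per camera."""
    pts, w = [], []
    for R, t, p, c in estimates:
        pts.append(R @ p + t)          # camera frame -> world frame
        w.append(c)
    pts, w = np.array(pts), np.array(w)
    return (w[:, None] * pts).sum(0) / w.sum()   # confidence-weighted average

I = np.eye(3)
views = [
    (I, np.zeros(3), np.array([0.10, 1.20, 2.00]), 0.9),
    (I, np.array([0.0, 0.0, -0.05]), np.array([0.12, 1.18, 2.06]), 0.6),
]
print(fuse_joint(views))
```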
Diverse Sampling for Self-Supervised Learning of Semantic Segmentation
Title | Diverse Sampling for Self-Supervised Learning of Semantic Segmentation |
Authors | Mohammadreza Mostajabi, Nicholas Kolkin, Gregory Shakhnarovich |
Abstract | We propose an approach for learning category-level semantic segmentation purely from image-level classification tags indicating the presence of categories. It exploits localization cues that emerge from training classification-tasked convolutional networks to drive a “self-supervision” process that automatically labels a sparse, diverse training set of points likely to belong to classes of interest. Our approach has almost no hyperparameters, is modular, and allows for very fast training of segmentation in less than 3 minutes. It obtains competitive results on the VOC 2012 segmentation benchmark. More significantly, the modularity and fast training of our framework allow new classes to be efficiently added for inference. |
Tasks | Semantic Segmentation |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01991v1 |
http://arxiv.org/pdf/1612.01991v1.pdf | |
PWC | https://paperswithcode.com/paper/diverse-sampling-for-self-supervised-learning |
Repo | |
Framework | |
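A greedy sketch of the "sparse, diverse training set of points" idea: repeatedly take the highest-scoring location of a localization map (e.g. a class activation map) and suppress its neighbourhood so subsequent picks are spread out. The suppression radius, point count, and random map below are illustrative; the paper's actual selection criterion may differ.

```python
import numpy as np

def sample_diverse_points(score_map, n_points=10, min_dist=8):
    """Greedily pick high-scoring, mutually distant points from a score map."""
    scores = score_map.astype(float).copy()
    h, w = scores.shape
    ys, xs = np.mgrid[0:h, 0:w]
    points = []
    for _ in range(n_points):
        y, x = np.unravel_index(np.argmax(scores), scores.shape)
        if scores[y, x] <= 0:
            break
        points.append((int(y), int(x)))
        # Suppress a disc around the chosen point to enforce diversity.
        scores[(ys - y) ** 2 + (xs - x) ** 2 < min_dist ** 2] = 0
    return points

rng = np.random.default_rng(7)
cam = rng.random((64, 64))                     # stand-in for a class activation map
print(sample_diverse_points(cam, n_points=5))
```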
Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
Title | Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction |
Authors | Allen Schmaltz, Yoon Kim, Alexander M. Rush, Stuart M. Shieber |
Abstract | We demonstrate that an attention-based encoder-decoder model can be used for sentence-level grammatical error identification for the Automated Evaluation of Scientific Writing (AESW) Shared Task 2016. The attention-based encoder-decoder models can be used for the generation of corrections, in addition to error identification, which is of interest for certain end-user applications. We show that a character-based encoder-decoder model is particularly effective, outperforming other results on the AESW Shared Task on its own, and showing gains over a word-based counterpart. Our final model, a combination of three character-based encoder-decoder models, one word-based encoder-decoder model, and a sentence-level CNN, is the highest performing system on the AESW 2016 binary prediction Shared Task. |
Tasks | |
Published | 2016-04-16 |
URL | http://arxiv.org/abs/1604.04677v1 |
http://arxiv.org/pdf/1604.04677v1.pdf | |
PWC | https://paperswithcode.com/paper/sentence-level-grammatical-error |
Repo | |
Framework | |
Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions
Title | Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions |
Authors | Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon |
Abstract | In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the nodes of a graph defining the communication topology of the network. This general problem finds applications in ranking, distance metric learning and graph inference, among others. We propose new gossip algorithms based on dual averaging which aim at solving such problems in both synchronous and asynchronous settings. The proposed framework is flexible enough to deal with constrained and regularized variants of the optimization problem. Our theoretical analysis reveals that the proposed algorithms preserve the convergence rate of centralized dual averaging up to an additive bias term. We present numerical simulations on Area Under the ROC Curve (AUC) maximization and metric learning problems which illustrate the practical interest of our approach. |
Tasks | Metric Learning |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02421v1 |
http://arxiv.org/pdf/1606.02421v1.pdf | |
PWC | https://paperswithcode.com/paper/gossip-dual-averaging-for-decentralized |
Repo | |
Framework | |
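A hedged sketch of gossip dual averaging: each node accumulates local (sub)gradients in a dual variable, a random pair of nodes averages their duals at every step, and the primal iterate is recovered through the usual dual-averaging map. For simplicity the local losses are ordinary least-squares terms rather than pairwise functions, so the data-propagation step of the paper is omitted; network size and step scaling are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n_nodes, d, T = 10, 5, 2000
gamma = 1.0
w_true = rng.standard_normal(d)
A = rng.standard_normal((n_nodes, 50, d))            # 50 local samples per node
b = A @ w_true + 0.1 * rng.standard_normal((n_nodes, 50))

Z = np.zeros((n_nodes, d))                            # dual variables
X = np.zeros((n_nodes, d))                            # primal iterates

for t in range(1, T + 1):
    for i in range(n_nodes):
        g = A[i].T @ (A[i] @ X[i] - b[i]) / 50        # local (sub)gradient
        Z[i] += g
    i, j = rng.choice(n_nodes, size=2, replace=False) # random gossip pair
    Z[i] = Z[j] = 0.5 * (Z[i] + Z[j])                 # average dual variables
    X = -Z / (gamma * np.sqrt(t))                     # dual-averaging primal map

print("mean distance to w_true:", np.linalg.norm(X - w_true, axis=1).mean())
```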