Paper Group ANR 648
Geometric Consistency for Self-Supervised End-to-End Visual Odometry. Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning. Deterministic Hypothesis Generation for Robust Fitting of Multiple Structures. Cross-Discourse and Multilingual Exploration of Textual Corpora with the DualNeighbors Algorithm. Cross-domain attr …
Geometric Consistency for Self-Supervised End-to-End Visual Odometry
Title | Geometric Consistency for Self-Supervised End-to-End Visual Odometry |
Authors | Ganesh Iyer, J. Krishna Murthy, Gunshi Gupta, K. Madhava Krishna, Liam Paull |
Abstract | With the success of deep learning based approaches in tackling challenging problems in computer vision, a wide range of deep architectures have recently been proposed for the task of visual odometry (VO) estimation. Most of these proposed solutions rely on supervision, which requires the acquisition of precise ground-truth camera pose information, collected using expensive motion capture systems or high-precision IMU/GPS sensor rigs. In this work, we propose an unsupervised paradigm for deep visual odometry learning. We show that using a noisy teacher, which could be a standard VO pipeline, and by designing a loss term that enforces geometric consistency of the trajectory, we can train accurate deep models for VO that do not require ground-truth labels. We leverage geometry as a self-supervisory signal and propose “Composite Transformation Constraints (CTCs)” that automatically generate supervisory signals for training and enforce geometric consistency in the VO estimate. We also present a method of characterizing the uncertainty in VO estimates thus obtained. To evaluate our VO pipeline, we present exhaustive ablation studies that demonstrate the efficacy of end-to-end, self-supervised methodologies to train deep models for monocular VO. We show that leveraging concepts from geometry and incorporating them into the training of a recurrent neural network results in performance competitive to supervised deep VO methods. |
Tasks | Motion Capture, Visual Odometry |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03789v1 |
http://arxiv.org/pdf/1804.03789v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-consistency-for-self-supervised-end |
Repo | |
Framework | |
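The abstract does not give the exact form of the CTC loss, so the following is only a minimal sketch of the underlying idea: relative motions estimated over short intervals, when composed, should agree with the motion estimated directly over the longer interval. It assumes planar SE(2) motions written as `(theta, tx, ty)` tuples rather than the full 3D poses a VO system would use, and the function names `compose`, `wrap_angle`, and `consistency_residual` are hypothetical.

```python
import math


def compose(a, b):
    """Compose two planar rigid motions (theta, tx, ty): a first, then b in a's frame."""
    theta_a, xa, ya = a
    theta_b, xb, yb = b
    c, s = math.cos(theta_a), math.sin(theta_a)
    return (theta_a + theta_b, xa + c * xb - s * yb, ya + s * xb + c * yb)


def wrap_angle(angle):
    """Map an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def consistency_residual(t01, t12, t02):
    """Rotation error plus translation error between t01*t12 and the direct estimate t02."""
    theta_c, xc, yc = compose(t01, t12)
    theta_d, xd, yd = t02
    return abs(wrap_angle(theta_c - theta_d)) + math.hypot(xc - xd, yc - yd)


# Illustrative values only: turn 90 degrees while moving 1 unit, then move 1 unit forward.
t01 = (math.pi / 2, 1.0, 0.0)
t12 = (0.0, 1.0, 0.0)
consistent = consistency_residual(t01, t12, (math.pi / 2, 1.0, 1.0))
inconsistent = consistency_residual(t01, t12, (math.pi / 2, 1.0, 1.5))
```

A residual of this kind can be computed without ground-truth poses, which is why it can act as a self-supervisory signal; how the paper weights or combines such terms is not stated in the abstract.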
Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning
Title | Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning |
Authors | Yanbin Liu, Juho Lee, Minseop Park, Saehoon Kim, Eunho Yang, Sung Ju Hwang, Yi Yang |
Abstract | The goal of few-shot learning is to learn a classifier that generalizes well even when trained with a limited number of training instances per class. The recently introduced meta-learning approaches tackle this problem by learning a generic classifier across a large number of multiclass classification tasks and generalizing the model to a new task. Yet, even with such meta-learning, the low-data problem in the novel classification task still remains. In this paper, we propose Transductive Propagation Network (TPN), a novel meta-learning framework for transductive inference that classifies the entire test set at once to alleviate the low-data problem. Specifically, we propose to learn to propagate labels from labeled instances to unlabeled test instances, by learning a graph construction module that exploits the manifold structure in the data. TPN jointly learns both the parameters of feature embedding and the graph construction in an end-to-end manner. We validate TPN on multiple benchmark datasets, on which it largely outperforms existing few-shot learning approaches and achieves state-of-the-art results. |
Tasks | Few-Shot Learning, graph construction, Meta-Learning |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10002v5 |
http://arxiv.org/pdf/1805.10002v5.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-propagate-labels-transductive |
Repo | |
Framework | |
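To make the notion of propagating labels over a graph concrete, here is a minimal sketch of classic iterative label propagation on a fixed Gaussian similarity graph. It is not the TPN model: TPN learns the feature embedding and the graph construction, whereas this sketch assumes raw coordinates and a hand-set bandwidth. The function name `propagate_labels` and the parameters `sigma`, `alpha`, and `iters` are assumptions chosen for illustration.

```python
import math


def propagate_labels(points, labels, n_classes, sigma=1.0, alpha=0.9, iters=200):
    """Spread known labels (ints) to unlabeled points (None) over a similarity graph."""
    n = len(points)
    # Gaussian affinities with a zero diagonal.
    W = [[0.0 if i == j else
          math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                   / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in W]
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    # One-hot rows for labeled points, all-zero rows for unlabeled points.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        # F <- alpha * S F + (1 - alpha) * Y
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Two well-separated clusters with one labeled point each (illustrative data).
points = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.0), (5.0, 5.0), (5.2, 5.0), (5.4, 5.0)]
labels = [0, None, None, 1, None, None]
predicted = propagate_labels(points, labels, n_classes=2)
```

All query points are scored together, which is the transductive aspect: each unlabeled point influences, and is influenced by, its unlabeled neighbours.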
Deterministic Hypothesis Generation for Robust Fitting of Multiple Structures
Title | Deterministic Hypothesis Generation for Robust Fitting of Multiple Structures |
Authors | Kwang Hee Lee, Chanki Yu, Sang Wook Lee |
Abstract | We present a novel algorithm for generating robust and consistent hypotheses for multiple-structure model fitting. Most of the existing methods utilize random sampling, which produces varying results, especially when the outlier ratio is high. For a structure where a model is fitted, the inliers of other structures are regarded as outliers when multiple structures are present. Global optimization has recently been investigated to provide stable and unique solutions, but the computational cost of the algorithms is prohibitively high for most image data of reasonable size. The algorithm presented in this paper uses a maximum feasible subsystem (MaxFS) algorithm to generate consistent initial hypotheses only from partial datasets in spatially overlapping local image regions. Our assumption is that each genuine structure will exist as a dominant structure in at least one of the local regions. To refine the initial hypotheses estimated from partial datasets and to remove the residual tolerance dependency of the MaxFS algorithm, iterative re-weighted L1 (IRL1) minimization is performed for all the image data. The initial weights of the IRL1 framework are determined from the initial hypotheses generated in local regions. Our approach is significantly more efficient than those that use only global optimization for all the image data. Experimental results demonstrate that the presented method can generate more reliable and consistent hypotheses than random-sampling methods for estimating single and multiple structures from data with a large number of outliers. We clearly expose the influence of algorithm parameter settings on the results in our experiments. |
Tasks | |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09408v1 |
http://arxiv.org/pdf/1807.09408v1.pdf | |
PWC | https://paperswithcode.com/paper/deterministic-hypothesis-generation-for |
Repo | |
Framework | |
Cross-Discourse and Multilingual Exploration of Textual Corpora with the DualNeighbors Algorithm
Title | Cross-Discourse and Multilingual Exploration of Textual Corpora with the DualNeighbors Algorithm |
Authors | Taylor Arnold, Lauren Tilton |
Abstract | Word choice is dependent on the cultural context of writers and their subjects. Different words are used to describe similar actions, objects, and features based on factors such as class, race, gender, geography and political affinity. Exploratory techniques based on locating and counting words may, therefore, lead to conclusions that reinforce culturally inflected boundaries. We offer a new method, the DualNeighbors algorithm, for linking thematically similar documents both within and across discursive and linguistic barriers to reveal cross-cultural connections. Qualitative and quantitative evaluations of this technique are shown as applied to two cultural datasets of interest to researchers across the humanities and social sciences. An open-source implementation of the DualNeighbors algorithm is provided to assist in its application. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.11183v1 |
http://arxiv.org/pdf/1806.11183v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-discourse-and-multilingual-exploration |
Repo | |
Framework | |
Cross-domain attribute representation based on convolutional neural network
Title | Cross-domain attribute representation based on convolutional neural network |
Authors | Guohui Zhang, Gaoyuan Liang, Fang Su, Fanxin Qu, Jing-Yan Wang |
Abstract | In the problem of domain transfer learning, we learn a model for the prediction in a target domain from the data of both some source domains and the target domain, where the target domain lacks labels while the source domain has sufficient labels. Besides the instances of the data, recently the attributes of data shared across domains are also explored and proven to be very helpful to leverage the information of different domains. In this paper, we propose a novel learning framework for domain-transfer learning based on both instances and attributes. We propose to embed the attributes of different domains by a shared convolutional neural network (CNN), learn a domain-independent CNN model to represent the information shared by different domains by matching across domains, and a domain-specific CNN model to represent the information of each domain. The concatenation of the three CNN model outputs is used to predict the class label. An iterative algorithm based on gradient descent method is developed to learn the parameters of the model. The experiments over benchmark datasets show the advantage of the proposed model. |
Tasks | Transfer Learning |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.07295v1 |
http://arxiv.org/pdf/1805.07295v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-attribute-representation-based |
Repo | |
Framework | |
An Element Sensitive Saliency Model with Position Prior Learning for Web Pages
Title | An Element Sensitive Saliency Model with Position Prior Learning for Web Pages |
Authors | Yujun Gu, Jie Chang, Ya Zhang, Yanfeng Wang |
Abstract | Understanding human visual attention is important for multimedia applications. Many studies have attempted to learn from eye-tracking data and build computational saliency prediction models. However, limited efforts have been devoted to saliency prediction for Web pages, which are characterized by more diverse content elements and spatial layouts. In this paper, we propose a novel end-to-end deep generative saliency model for Web pages. To capture position biases introduced by page layouts, a Position Prior Learning sub-network is proposed, which models position biases as a multivariate Gaussian distribution using a variational auto-encoder. To model different elements of a Web page, a Multi Discriminative Region Detection (MDRD) branch and a Text Region Detection (TRD) branch are introduced, which aim to extract discriminative localizations and “prominent” text regions likely to correspond to human attention, respectively. We validate the proposed model with FiWI, a public Web-page dataset, and show that the proposed model outperforms the state-of-the-art models for Web-page saliency prediction. |
Tasks | Eye Tracking, Saliency Prediction |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10361v2 |
http://arxiv.org/pdf/1804.10361v2.pdf | |
PWC | https://paperswithcode.com/paper/an-element-sensitive-saliency-model-with |
Repo | |
Framework | |
Fast greedy algorithms for dictionary selection with generalized sparsity constraints
Title | Fast greedy algorithms for dictionary selection with generalized sparsity constraints |
Authors | Kaito Fujii, Tasuku Soma |
Abstract | In dictionary selection, several atoms are selected from finite candidates that successfully approximate given data points in the sparse representation. We propose a novel efficient greedy algorithm for dictionary selection. Not only does our algorithm work much faster than the known methods, but it can also handle more complex sparsity constraints, such as average sparsity. Using numerical experiments, we show that our algorithm outperforms the known methods for dictionary selection, achieving competitive performances with dictionary learning algorithms in a smaller running time. |
Tasks | Dictionary Learning |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02314v1 |
http://arxiv.org/pdf/1809.02314v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-greedy-algorithms-for-dictionary |
Repo | |
Framework | |
Transfer with Model Features in Reinforcement Learning
Title | Transfer with Model Features in Reinforcement Learning |
Authors | Lucas Lehnert, Michael L. Littman |
Abstract | A key question in Reinforcement Learning is which representation an agent can learn to efficiently reuse knowledge between different tasks. Recently the Successor Representation was shown to have empirical benefits for transferring knowledge between tasks with shared transition dynamics. This paper presents Model Features: a feature representation that clusters behaviourally equivalent states and that is equivalent to a Model-Reduction. Further, we present a Successor Feature model which shows that learning Successor Features is equivalent to learning a Model-Reduction. A novel optimization objective is developed and we provide bounds showing that minimizing this objective results in an increasingly improved approximation of a Model-Reduction. Further, we provide transfer experiments on randomly generated MDPs which vary in their transition and reward functions but approximately preserve behavioural equivalence between states. These results demonstrate that Model Features are suitable for transfer between tasks with varying transition and reward functions. |
Tasks | |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01736v1 |
http://arxiv.org/pdf/1807.01736v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-with-model-features-in-reinforcement |
Repo | |
Framework | |
A Learning-Based Visual Saliency Prediction Model for Stereoscopic 3D Video (LBVS-3D)
Title | A Learning-Based Visual Saliency Prediction Model for Stereoscopic 3D Video (LBVS-3D) |
Authors | Amin Banitalebi-Dehkordi, Mahsa T. Pourazad, Panos Nasiopoulos |
Abstract | Over the past decade, many computational saliency prediction models have been proposed for 2D images and videos. Considering that the human visual system has evolved in a natural 3D environment, it is only natural to want to design visual attention models for 3D content. Existing monocular saliency models are not able to accurately predict the attentive regions when applied to 3D image/video content, as they do not incorporate depth information. This paper explores stereoscopic video saliency prediction by exploiting both low-level attributes such as brightness, color, texture, orientation, motion, and depth, as well as high-level cues such as face, person, vehicle, animal, text, and horizon. Our model starts with a rough segmentation and quantifies several intuitive observations such as the effects of visual discomfort level, depth abruptness, motion acceleration, elements of surprise, size and compactness of the salient regions, and emphasizing only a few salient objects in a scene. A new fovea-based model of spatial distance between the image regions is adopted for considering local and global feature calculations. To efficiently fuse the conspicuity maps generated by our method to one single saliency map that is highly correlated with the eye-fixation data, a random forest based algorithm is utilized. The performance of the proposed saliency model is evaluated against the results of an eye-tracking experiment, which involved 24 subjects and an in-house database of 61 captured stereoscopic videos. Our stereo video database as well as the eye-tracking data are publicly available along with this paper. Experiment results show that the proposed saliency prediction method achieves competitive performance compared to the state-of-the-art approaches. |
Tasks | Eye Tracking, Saliency Prediction |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.04842v1 |
http://arxiv.org/pdf/1803.04842v1.pdf | |
PWC | https://paperswithcode.com/paper/a-learning-based-visual-saliency-prediction |
Repo | |
Framework | |
Defending Malware Classification Networks Against Adversarial Perturbations with Non-Negative Weight Restrictions
Title | Defending Malware Classification Networks Against Adversarial Perturbations with Non-Negative Weight Restrictions |
Authors | Alex Kouzemtchenko |
Abstract | There is a growing body of literature showing that deep neural networks are vulnerable to adversarial input modification. Recently this work has been extended from image classification to malware classification over boolean features. In this paper we present several new methods for training restricted networks in this specific domain that are highly effective at preventing adversarial perturbations. We start with a fully adversarially resistant neural network that has hard non-negative weight restrictions and is equivalent to learning a monotonic boolean function and then attempt to relax the constraints to improve classifier accuracy. |
Tasks | Image Classification, Malware Classification |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.09035v1 |
http://arxiv.org/pdf/1806.09035v1.pdf | |
PWC | https://paperswithcode.com/paper/defending-malware-classification-networks |
Repo | |
Framework | |
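The link between non-negative weights and monotonic boolean functions can be checked on a toy network. The sketch below assumes a two-layer ReLU network over boolean features and an attacker who can only add features (flip a 0 to a 1); that threat model, the network shape, and the names `score` and `is_monotone` are assumptions for illustration, not details taken from the abstract.

```python
from itertools import product


def relu(x):
    return max(0.0, x)


def score(x, W1, b1, w2, b2):
    """Output of a two-layer ReLU network on a boolean feature tuple x."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def is_monotone(W1, b1, w2, b2, n):
    """Exhaustively check that flipping any feature 0 -> 1 never lowers the score."""
    for x in product((0, 1), repeat=n):
        base = score(x, W1, b1, w2, b2)
        for i in range(n):
            if x[i] == 0:
                flipped = x[:i] + (1,) + x[i + 1:]
                if score(flipped, W1, b1, w2, b2) < base:
                    return False
    return True


# Illustrative weights only; biases may be negative, weights may not.
b1, w2, b2 = [0.0, -1.0], [1.0, 0.5], -1.0
W1_nonneg = [[0.5, 1.0, 0.0], [0.25, 0.0, 2.0]]
W1_mixed = [[0.5, -1.0, 0.0], [0.25, 0.0, 2.0]]  # one negative weight

nonneg_is_monotone = is_monotone(W1_nonneg, b1, w2, b2, n=3)
mixed_is_monotone = is_monotone(W1_mixed, b1, w2, b2, n=3)
```

With every weight non-negative, adding a feature can only raise the malware score, so a sample flagged as malicious cannot be pushed below the threshold by additions alone. A single negative weight is enough to break this property, as the second configuration shows.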
MediaEval 2018: Predicting Media Memorability Task
Title | MediaEval 2018: Predicting Media Memorability Task |
Authors | Romain Cohendet, Claire-Hélène Demarty, Ngoc Duong, Mats Sjöberg, Bogdan Ionescu, Thanh-Toan Do |
Abstract | In this paper, we present the Predicting Media Memorability task, which is proposed as part of the MediaEval 2018 Benchmarking Initiative for Multimedia Evaluation. Participants are expected to design systems that automatically predict memorability scores for videos, which reflect the probability of a video being remembered. In contrast to previous work in image memorability prediction, where memorability was measured a few minutes after memorization, the proposed dataset comes with short-term and long-term memorability annotations. All task characteristics are described, namely: the task’s challenges and breakthrough, the released data set and ground truth, the required participant runs and the evaluation metrics. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01052v1 |
http://arxiv.org/pdf/1807.01052v1.pdf | |
PWC | https://paperswithcode.com/paper/mediaeval-2018-predicting-media-memorability |
Repo | |
Framework | |
Active Learning Methods based on Statistical Leverage Scores
Title | Active Learning Methods based on Statistical Leverage Scores |
Authors | Cem Orhan, Oznur Tastan |
Abstract | In many real-world machine learning applications, unlabeled data are abundant whereas class labels are expensive and scarce. An active learner aims to obtain a model of high accuracy with as few labeled instances as possible by effectively selecting useful examples for labeling. We propose a new selection criterion that is based on statistical leverage scores and present two novel active learning methods based on this criterion: ALEVS for querying a single example at each iteration and DBALEVS for querying a batch of examples. To assess the representativeness of the examples in the pool, ALEVS and DBALEVS use the statistical leverage scores of the kernel matrices computed on the examples of each class. Additionally, DBALEVS selects a diverse set of examples that are highly representative but are dissimilar to already labeled examples through maximizing a submodular set function defined with the statistical leverage scores and the kernel matrix computed on the pool of the examples. The submodularity property of the set scoring function lets us efficiently identify batches that are within a constant-factor approximation of the optimal batch. Our experiments on diverse datasets show that querying based on leverage scores is a powerful strategy for active learning. |
Tasks | Active Learning |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02497v1 |
http://arxiv.org/pdf/1812.02497v1.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-methods-based-on-statistical |
Repo | |
Framework | |
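For readers unfamiliar with statistical leverage scores, the sketch below computes them in the simplest setting: the diagonal of the projection ("hat") matrix A (AᵀA)⁻¹ Aᵀ for a tall matrix A with two columns. This is a simplification, not the ALEVS procedure, which according to the abstract works with kernel matrices computed per class. The function name `leverage_scores` and the two-column restriction are assumptions made to keep the example dependency-free.

```python
def leverage_scores(rows):
    """Leverage of each row (a, b) of an n x 2 matrix A: diag(A (A^T A)^-1 A^T)."""
    sxx = sum(a * a for a, _ in rows)
    sxy = sum(a * b for a, b in rows)
    syy = sum(b * b for _, b in rows)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("columns are linearly dependent; A^T A is singular")
    # Closed-form inverse of the 2 x 2 Gram matrix A^T A.
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return [a * (inv[0][0] * a + inv[0][1] * b)
            + b * (inv[1][0] * a + inv[1][1] * b)
            for a, b in rows]


# Intercept column plus one feature; the last point lies far from the others.
rows = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 10.0)]
scores = leverage_scores(rows)
most_influential = max(range(len(rows)), key=lambda i: scores[i])
```

Each score lies between 0 and 1 and the scores sum to the rank of A (here 2). Rows that contribute most to the matrix's column space, such as the isolated point above, receive scores close to 1.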
Multiple Kernel $k$-Means Clustering by Selecting Representative Kernels
Title | Multiple Kernel $k$-Means Clustering by Selecting Representative Kernels |
Authors | Yaqiang Yao, Huanhuan Chen |
Abstract | To cluster data that are not linearly separable in the original feature space, $k$-means clustering was extended to the kernel version. However, the performance of kernel $k$-means clustering largely depends on the choice of kernel function. To mitigate this problem, multiple kernel learning has been introduced into the $k$-means clustering to obtain an optimal kernel combination for clustering. Despite the success of multiple kernel $k$-means clustering in various scenarios, few existing works update the combination coefficients based on the diversity of kernels; as a result, the selected kernels contain high redundancy, which degrades clustering performance and efficiency. In this paper, we propose a simple but efficient strategy that selects a diverse subset from the pre-specified kernels as the representative kernels, and then incorporates the subset selection process into the framework of multiple $k$-means clustering. The representative kernels can be indicated as the significant combination weights. Due to the non-convexity of the obtained objective function, we develop an alternating minimization method to optimize the combination coefficients of the selected kernels and the cluster membership alternately. We evaluate the proposed approach on several benchmark and real-world datasets. The experimental results demonstrate the competitiveness of our approach in comparison with the state-of-the-art methods. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00264v1 |
http://arxiv.org/pdf/1811.00264v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-kernel-k-means-clustering-by |
Repo | |
Framework | |
Early Detection of Social Media Hoaxes at Scale
Title | Early Detection of Social Media Hoaxes at Scale |
Authors | Arkaitz Zubiaga, Aiqi Jiang |
Abstract | The unmoderated nature of social media enables the diffusion of hoaxes, which in turn jeopardises the credibility of information gathered from social media platforms. Existing research on automated detection of hoaxes has the limitation of using relatively small datasets, owing to the difficulty of getting labelled data. This in turn has limited research exploring early detection of hoaxes as well as exploring other factors such as the effect of the size of the training data or the use of sliding windows. To mitigate this problem, we introduce a semi-automated method that leverages the Wikidata knowledge base to build large-scale datasets for veracity classification, focusing on celebrity death reports. This enables us to create a dataset with 4,007 reports including over 13 million tweets, 15% of which are fake. Experiments using class-specific representations of word embeddings show that we can achieve F1 scores nearing 72% within 10 minutes of the first tweet being posted when we expand the size of the training data following our semi-automated means. Our dataset represents a realistic scenario with a real distribution of true, commemorative and false stories, which we release for further use as a benchmark in future research. |
Tasks | Word Embeddings |
Published | 2018-01-22 |
URL | https://arxiv.org/abs/1801.07311v2 |
https://arxiv.org/pdf/1801.07311v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-class-specific-word-representations |
Repo | |
Framework | |
The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure
Title | The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure |
Authors | Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody |
Abstract | Many modern machine learning classifiers are shown to be vulnerable to adversarial perturbations of the instances. Despite a massive amount of work focusing on making classifiers robust, the task seems quite challenging. In this work, through a theoretical study, we investigate the adversarial risk and robustness of classifiers and draw a connection to the well-known phenomenon of concentration of measure in metric measure spaces. We show that if the metric probability space of the test instance is concentrated, any classifier with some initial constant error is inherently vulnerable to adversarial perturbations. One class of concentrated metric probability spaces consists of the so-called Levy families, which include many natural distributions. In this special case, our attacks only need to perturb the test instance by at most $O(\sqrt n)$ to make it misclassified, where $n$ is the data dimension. Using our general result about Levy instance spaces, we first recover as a special case some of the previously proved results about the existence of adversarial examples. However, many more Levy families are known (e.g., product distribution under the Hamming distance) for which we immediately obtain new attacks that find adversarial examples of distance $O(\sqrt n)$. Finally, we show that concentration of measure for product spaces implies the existence of forms of “poisoning” attacks in which the adversary tampers with the training data with the goal of degrading the classifier. In particular, we show that for any learning algorithm that uses $m$ training examples, there is an adversary who can increase the probability of any “bad property” (e.g., failing on a particular test instance) that initially happens with non-negligible probability to $\approx 1$ by substituting only $\tilde{O}(\sqrt m)$ of the examples with other (still correctly labeled) examples. |
Tasks | |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.03063v2 |
http://arxiv.org/pdf/1809.03063v2.pdf | |
PWC | https://paperswithcode.com/paper/the-curse-of-concentration-in-robust-learning |
Repo | |
Framework | |