October 20, 2019

Paper Group AWR 184

Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification. Sequence-Aware Recommender Systems. Single-View Place Recognition under Seasonal Changes. SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks. TallyQA: Answering Complex Counting Questions. 3DFeat-Net: Weakly Supervised Local 3D Features for Po …

Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification

Title Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification
Authors Cheng-Yaw Low, Jaewoo Park, Andrew Beng-Jin Teoh
Abstract Stacking-based deep neural network (S-DNN) is aggregated with pluralities of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. Contrary to the DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is to be trained decisively and independently without BP intervention. In this paper, a ridge regression-based S-DNN, dubbed deep analytic network (DAN), along with its kernelization (K-DAN), are devised for multilayer feature re-learning from the pre-extracted baseline features and the structured features. Our theoretical formulation demonstrates that DAN/K-DAN re-learn by perturbing the intra/inter-class variations, apart from diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern classification on datasets of varying domains - faces, handwritten digits, generic objects, to name a few. Unlike the typical BP-optimized DNNs to be trained from gigantic datasets by GPU, we disclose that DAN/K-DAN are trainable using only CPU even for small-scale training sets. Our experimental results disclose that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including multilayer perceptron, deep belief network, etc., without data augmentation applied.
Tasks Data Augmentation
Published 2018-11-17
URL https://arxiv.org/abs/1811.07184v2
PDF https://arxiv.org/pdf/1811.07184v2.pdf
PWC https://paperswithcode.com/paper/stacking-based-deep-neural-network-deep
Repo https://github.com/chengyawlow/DAN
Framework none
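
The abstract describes layers that are each fitted by ridge regression, one after another, with no backpropagation. A minimal sketch of that idea follows. The closed-form ridge solution is standard; the stacking rule (appending each layer's predictions to its input features) and all function names are assumptions made for illustration, not the authors' exact DAN formulation.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ridge_fit(x, y, lam):
    """Closed-form ridge weights W = (X^T X + lam * I)^-1 X^T Y."""
    xt = transpose(x)
    gram = matmul(xt, x)
    for i in range(len(gram)):
        gram[i][i] += lam
    return solve(gram, matmul(xt, y))

def train_stack(x, y, lam, depth):
    """Fit `depth` ridge layers sequentially; no gradients flow between layers."""
    weights, feats = [], x
    for _ in range(depth):
        w = ridge_fit(feats, y, lam)
        weights.append(w)
        pred = matmul(feats, w)
        # Assumed stacking rule: the next layer sees the inputs plus predictions.
        feats = [f + p for f, p in zip(feats, pred)]
    return weights
```

Because every layer has a closed-form solution, training reduces to a handful of small linear solves, which is consistent with the abstract's point that such models can be trained on a CPU.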

Sequence-Aware Recommender Systems

Title Sequence-Aware Recommender Systems
Authors Massimo Quadrana, Paolo Cremonesi, Dietmar Jannach
Abstract Recommender systems are one of the most successful applications of data mining and machine learning technology in practice. Academic research in the field is historically often based on the matrix completion problem formulation, where for each user-item-pair only one interaction (e.g., a rating) is considered. In many application domains, however, multiple user-item interactions of different types can be recorded over time. And, a number of recent works have shown that this information can be used to build richer individual user models and to discover additional behavioral patterns that can be leveraged in the recommendation process. In this work we review existing works that consider information from such sequentially-ordered user-item interaction logs in the recommendation process. Based on this review, we propose a categorization of the corresponding recommendation tasks and goals, summarize existing algorithmic solutions, discuss methodological approaches when benchmarking what we call sequence-aware recommender systems, and outline open challenges in the area.
Tasks Matrix Completion, Recommendation Systems
Published 2018-02-23
URL https://arxiv.org/abs/1802.08452v1
PDF https://arxiv.org/pdf/1802.08452v1.pdf
PWC https://paperswithcode.com/paper/sequence-aware-recommender-systems
Repo https://github.com/taylorhawks/Recommender
Framework none
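
To make the contrast with matrix completion concrete, here is a minimal sketch of a recommender that uses the order of interactions: a first-order Markov model that counts which item directly follows which. It is only one simple member of the family the survey covers; the function names and the toy sessions are invented for illustration.

```python
from collections import Counter, defaultdict

def fit_transitions(sessions):
    """Count how often item b directly follows item a in the ordered logs."""
    counts = defaultdict(Counter)
    for session in sessions:
        for prev_item, next_item in zip(session, session[1:]):
            counts[prev_item][next_item] += 1
    return counts

def recommend_next(counts, last_item, k=2):
    """Return up to k items most often seen right after last_item."""
    followers = counts.get(last_item, Counter())
    return [item for item, _ in followers.most_common(k)]

# Hypothetical session logs, each already sorted by time.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]
model = fit_transitions(sessions)
print(recommend_next(model, "phone"))  # ['case', 'charger']
```

A rating matrix would treat the three "phone" sessions as the same set of items; the transition counts keep the information that "case" tends to come right after "phone".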

Single-View Place Recognition under Seasonal Changes

Title Single-View Place Recognition under Seasonal Changes
Authors Daniel Olid, José M. Fácil, Javier Civera
Abstract Single-view place recognition, that we can define as finding an image that corresponds to the same place as a given query image, is a key capability for autonomous navigation and mapping. Although there has been a considerable amount of research in the topic, the high degree of image variability (with viewpoint, illumination or occlusions for example) makes it a research challenge. One of the particular challenges, that we address in this work, is weather variation. Seasonal changes can produce drastic appearance changes, that classic low-level features do not model properly. Our contributions in this paper are twofold. First we pre-process and propose a partition for the Nordland dataset, frequently used for place recognition research without consensus on the partitions. And second, we evaluate several neural network architectures such as pre-trained, siamese and triplet for this problem. Our best results outperform the state of the art of the field. A video showing our results can be found at https://youtu.be/VrlxsYZoHDM. The partitioned version of the Nordland dataset is available at http://webdiis.unizar.es/~jmfacil/pr-nordland/.
Tasks Autonomous Navigation
Published 2018-08-20
URL http://arxiv.org/abs/1808.06516v1
PDF http://arxiv.org/pdf/1808.06516v1.pdf
PWC https://paperswithcode.com/paper/single-view-place-recognition-under-seasonal
Repo https://github.com/jmfacil/single-view-place-recognition
Framework caffe2
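
The abstract mentions triplet architectures and frames recognition as finding the stored image that matches a query. The sketch below shows one common form of the triplet loss and a nearest-neighbour lookup over image descriptors. The descriptors are plain lists of floats standing in for network outputs, and the margin value and function names are assumptions, not taken from the paper.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def recognize_place(query, database):
    """Return the index of the database descriptor closest to the query."""
    return min(range(len(database)), key=lambda i: euclidean(query, database[i]))
```

In the seasonal setting, the anchor and positive would be descriptors of the same place in different seasons and the negative a different place, so that appearance change alone does not push matching images apart.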

SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks

Title SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
Authors Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, Junjie Yan
Abstract Siamese network based trackers formulate tracking as convolutional feature cross-correlation between target template and searching region. However, Siamese trackers still have accuracy gap compared with state-of-the-art algorithms and they cannot take advantage of feature from deep networks, such as ResNet-50 or deeper. In this work we prove the core reason comes from the lack of strict translation invariance. By comprehensive theoretical analysis and experimental validations, we break this restriction through a simple yet effective spatial aware sampling strategy and successfully train a ResNet-driven Siamese tracker with significant performance gain. Moreover, we propose a new model architecture to perform depth-wise and layer-wise aggregations, which not only further improves the accuracy but also reduces the model size. We conduct extensive ablation studies to demonstrate the effectiveness of the proposed tracker, which obtains currently the best results on four large tracking benchmarks, including OTB2015, VOT2018, UAV123, and LaSOT. Our model will be released to facilitate further studies based on this problem.
Tasks Visual Object Tracking, Visual Tracking
Published 2018-12-31
URL http://arxiv.org/abs/1812.11703v1
PDF http://arxiv.org/pdf/1812.11703v1.pdf
PWC https://paperswithcode.com/paper/siamrpn-evolution-of-siamese-visual-tracking
Repo https://github.com/STVIR/pysot
Framework pytorch
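
The cross-correlation that Siamese trackers rely on, and the depth-wise variant named in the abstract, can be sketched without any deep learning library. The version below works on nested lists rather than tensors and is only an illustration of the operation; it is not the pysot implementation, and the function names are invented here.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (valid positions only), summing products."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    return [
        [
            sum(search[i + u][j + v] * template[u][v]
                for u in range(th) for v in range(tw))
            for j in range(sw - tw + 1)
        ]
        for i in range(sh - th + 1)
    ]

def depthwise_cross_correlate(search_feats, template_feats):
    """Correlate channel c of the search features with channel c of the template only."""
    return [cross_correlate(s, t) for s, t in zip(search_feats, template_feats)]

search = [[0, 0, 0], [0, 1, 2], [0, 3, 4]]
template = [[1, 2], [3, 4]]
print(cross_correlate(search, template))  # [[4, 11], [14, 30]]
```

The response peaks at the offset where the template actually sits in the search region. The depth-wise form produces one response map per channel instead of summing over channels.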

TallyQA: Answering Complex Counting Questions

Title TallyQA: Answering Complex Counting Questions
Authors Manoj Acharya, Kushal Kafle, Christopher Kanan
Abstract Most counting questions in visual question answering (VQA) datasets are simple and require no more than object detection. Here, we study algorithms for complex counting questions that involve relationships between objects, attribute identification, reasoning, and more. To do this, we created TallyQA, the world’s largest dataset for open-ended counting. We propose a new algorithm for counting that uses relation networks with region proposals. Our method lets relation networks be efficiently used with high-resolution imagery. It yields state-of-the-art results compared to baseline and recent systems on both TallyQA and the HowMany-QA benchmark.
Tasks Object Detection, Question Answering, Visual Question Answering
Published 2018-10-29
URL http://arxiv.org/abs/1810.12440v2
PDF http://arxiv.org/pdf/1810.12440v2.pdf
PWC https://paperswithcode.com/paper/tallyqa-answering-complex-counting-questions
Repo https://github.com/manoja328/tallyqa
Framework none

3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration

Title 3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
Authors Zi Jian Yew, Gim Hee Lee
Abstract In this paper, we propose the 3DFeat-Net which learns both 3D feature detector and descriptor for point cloud matching using weak supervision. Unlike many existing works, we do not require manual annotation of matching point clusters. Instead, we leverage on alignment and attention mechanisms to learn feature correspondences from GPS/INS tagged 3D point clouds without explicitly specifying them. We create training and benchmark outdoor Lidar datasets, and experiments show that 3DFeat-Net obtains state-of-the-art performance on these gravity-aligned datasets.
Tasks Point Cloud Registration
Published 2018-07-25
URL http://arxiv.org/abs/1807.09413v1
PDF http://arxiv.org/pdf/1807.09413v1.pdf
PWC https://paperswithcode.com/paper/3dfeat-net-weakly-supervised-local-3d
Repo https://github.com/yewzijian/3DFeatNet
Framework tf

Robustness May Be at Odds with Accuracy

Title Robustness May Be at Odds with Accuracy
Authors Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry
Abstract We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
Tasks
Published 2018-05-30
URL https://arxiv.org/abs/1805.12152v5
PDF https://arxiv.org/pdf/1805.12152v5.pdf
PWC https://paperswithcode.com/paper/robustness-may-be-at-odds-with-accuracy
Repo https://github.com/louis2889184/pytorch-adversarial-training
Framework pytorch

Crowdsourcing Semantic Label Propagation in Relation Classification

Title Crowdsourcing Semantic Label Propagation in Relation Classification
Authors Anca Dumitrache, Lora Aroyo, Chris Welty
Abstract Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that indicates still more would be better. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth methodology for crowdsourcing, that captures ambiguity in annotations by measuring inter-annotator disagreement. Our approach propagates annotations to sentences that are similar in a low dimensional embedding space, expanding the number of labels by two orders of magnitude. Our experiments show significant improvement in a sentence-level multi-class relation classifier.
Tasks Relation Classification, Relation Extraction
Published 2018-09-03
URL http://arxiv.org/abs/1809.00537v1
PDF http://arxiv.org/pdf/1809.00537v1.pdf
PWC https://paperswithcode.com/paper/crowdsourcing-semantic-label-propagation-in
Repo https://github.com/CrowdTruth/Open-Domain-Relation-Extraction
Framework none
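
A rough sketch of propagating annotation scores to nearby sentences in an embedding space is given below. It assumes a nearest-neighbour rule with a cosine-similarity threshold; the rule, the threshold value, and the function names are illustrative assumptions and may differ from the procedure used in the paper.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def propagate_labels(labeled, unlabeled, threshold=0.9):
    """Copy the score of the most similar labeled vector when it is similar enough.

    labeled: list of (embedding, score) pairs; unlabeled: list of embeddings.
    Returns one score per unlabeled vector, or None if nothing passes the threshold.
    """
    out = []
    for vec in unlabeled:
        best_sim, best_score = max((cosine(vec, emb), score) for emb, score in labeled)
        out.append(best_score if best_sim >= threshold else None)
    return out
```

Keeping the crowd score as a real number, rather than a hard label, preserves the disagreement signal that the CrowdTruth methodology is said to capture.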

Coarse-to-Fine Decoding for Neural Semantic Parsing

Title Coarse-to-Fine Decoding for Neural Semantic Parsing
Authors Li Dong, Mirella Lapata
Abstract Semantic parsing aims at mapping natural language utterances into structured meaning representations. In this work, we propose a structure-aware neural architecture which decomposes the semantic parsing process into two stages. Given an input utterance, we first generate a rough sketch of its meaning, where low-level information (such as variable names and arguments) is glossed over. Then, we fill in missing details by taking into account the natural language input and the sketch itself. Experimental results on four datasets characteristic of different domains and meaning representations show that our approach consistently improves performance, achieving competitive results despite the use of relatively simple decoders.
Tasks Semantic Parsing
Published 2018-05-12
URL http://arxiv.org/abs/1805.04793v1
PDF http://arxiv.org/pdf/1805.04793v1.pdf
PWC https://paperswithcode.com/paper/coarse-to-fine-decoding-for-neural-semantic
Repo https://github.com/donglixp/coarse2fine
Framework pytorch

MPST: A Corpus of Movie Plot Synopses with Tags

Title MPST: A Corpus of Movie Plot Synopses with Tags
Authors Sudipta Kar, Suraj Maharjan, A. Pastor López-Monroy, Thamar Solorio
Abstract Social tagging of movies reveals a wide range of heterogeneous information about movies, like the genre, plot structure, soundtracks, metadata, visual and emotional experiences. Such information can be valuable in building automatic systems to create tags for movies. Automatic tagging systems can help recommendation engines to improve the retrieval of similar movies as well as help viewers to know what to expect from a movie in advance. In this paper, we set out to the task of collecting a corpus of movie plot synopses and tags. We describe a methodology that enabled us to build a fine-grained set of around 70 tags exposing heterogeneous characteristics of movie plots and the multi-label associations of these tags with some 14K movie plot synopses. We investigate how these tags correlate with movies and the flow of emotions throughout different types of movies. Finally, we use this corpus to explore the feasibility of inferring tags from plot synopses. We expect the corpus will be useful in other tasks where analysis of narratives is relevant.
Tasks
Published 2018-02-22
URL http://arxiv.org/abs/1802.07858v2
PDF http://arxiv.org/pdf/1802.07858v2.pdf
PWC https://paperswithcode.com/paper/mpst-a-corpus-of-movie-plot-synopses-with
Repo https://github.com/anandborad/MPST
Framework tf

SpectralNet: Spectral Clustering using Deep Neural Networks

Title SpectralNet: Spectral Clustering using Deep Neural Networks
Authors Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, Yuval Kluger
Abstract Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample-extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, allow us to keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet .
Tasks Stochastic Optimization
Published 2018-01-04
URL http://arxiv.org/abs/1801.01587v6
PDF http://arxiv.org/pdf/1801.01587v6.pdf
PWC https://paperswithcode.com/paper/spectralnet-spectral-clustering-using-deep
Repo https://github.com/chenjs12/ML
Framework none
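
For readers unfamiliar with the objects named in the abstract, the sketch below builds the standard pairwise Gaussian affinities and the unnormalized graph Laplacian whose eigenspace classical spectral clustering embeds into. It covers only this classical starting point, not SpectralNet's network or training procedure; the function names and the choice of the unnormalized Laplacian are assumptions.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """W[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)), with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-sq / (2.0 * sigma ** 2))
    return w

def graph_laplacian(w):
    """Unnormalized Laplacian L = D - W, where D holds the row sums of W."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Both matrices are n by n for n data points, which is one way to see the scalability limitation the abstract refers to.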

Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace

Title Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace
Authors Yoonho Lee, Seungjin Choi
Abstract Gradient-based meta-learning methods leverage gradient descent to learn the commonalities among various tasks. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during meta-testing. Our primary contribution is the MT-net, which enables the meta-learner to learn on each layer’s activation space a subspace that the task-specific learner performs gradient descent on. Additionally, a task-specific learner of an MT-net performs gradient descent with respect to a meta-learned distance metric, which warps the activation space to be more sensitive to task identity. We demonstrate that the dimension of this learned subspace reflects the complexity of the task-specific learner’s adaptation task, and also that our model is less sensitive to the choice of initial learning rates than previous gradient-based meta-learning methods. Our method achieves state-of-the-art or comparable performance on few-shot classification and regression tasks.
Tasks Few-Shot Image Classification, Meta-Learning
Published 2018-01-17
URL http://arxiv.org/abs/1801.05558v3
PDF http://arxiv.org/pdf/1801.05558v3.pdf
PWC https://paperswithcode.com/paper/gradient-based-meta-learning-with-learned
Repo https://github.com/yoonholee/MT-net
Framework tf

Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development

Title Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development
Authors Jacopo Gobbi, Evgeny Stepanov, Giuseppe Riccardi
Abstract Concept tagging is a type of structured learning needed for natural language understanding (NLU) systems. In this task, meaning labels from a domain ontology are assigned to word sequences. In this paper, we review the algorithms developed over the last twenty-five years. We perform a comparative evaluation of generative, discriminative and deep learning methods on two public datasets. We report on the statistical variability of the performance measurements. The third contribution is the release of a repository of the algorithms, datasets and recipes for NLU evaluation.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10661v1
PDF http://arxiv.org/pdf/1807.10661v1.pdf
PWC https://paperswithcode.com/paper/concept-tagging-for-natural-language
Repo https://github.com/fruttasecca/concept-tagging-with-neural-networks
Framework pytorch

Road User Abnormal Trajectory Detection using a Deep Autoencoder

Title Road User Abnormal Trajectory Detection using a Deep Autoencoder
Authors Pankaj Raj Roy, Guillaume-Alexandre Bilodeau
Abstract In this paper, we focus on the development of a method that detects abnormal trajectories of road users at traffic intersections. The main difficulty with this is the fact that there are very few abnormal data and the normal ones are insufficient for the training of any kind of machine learning model. To tackle these problems, we proposed the solution of using a deep autoencoder network trained solely through augmented data considered as normal. By generating artificial abnormal trajectories, our method is tested on four different outdoor urban user scenes and performs better compared to some classical outlier detection methods.
Tasks Outlier Detection
Published 2018-08-25
URL http://arxiv.org/abs/1809.00957v1
PDF http://arxiv.org/pdf/1809.00957v1.pdf
PWC https://paperswithcode.com/paper/road-user-abnormal-trajectory-detection-using
Repo https://github.com/proy3/Abnormal_Trajectory_Classifier
Framework tf

You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery

Title You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery
Authors Adam Van Etten
Abstract Detection of small objects in large swaths of imagery is one of the primary problems in satellite imagery analytics. While object detection in ground-based imagery has benefited from research into new deep learning approaches, transitioning such technology to overhead imagery is nontrivial. Among the challenges is the sheer number of pixels and geographic extent per image: a single DigitalGlobe satellite image encompasses >64 km² and over 250 million pixels. Another challenge is that objects of interest are minuscule (often only ~10 pixels in extent), which complicates traditional computer vision techniques. To address these issues, we propose a pipeline (You Only Look Twice, or YOLT) that evaluates satellite images of arbitrary size at a rate of >0.5 km²/s. The proposed approach can rapidly detect objects of vastly different scales with relatively little training data over multiple sensors. We evaluate large test images at native resolution, and yield scores of F1 > 0.8 for vehicle localization. We further explore resolution and object size requirements by systematically testing the pipeline at decreasing resolution, and conclude that objects only ~5 pixels in size can still be localized with high confidence. Code is available at https://github.com/CosmiQ/yolt.
Tasks Object Detection
Published 2018-05-24
URL http://arxiv.org/abs/1805.09512v1
PDF http://arxiv.org/pdf/1805.09512v1.pdf
PWC https://paperswithcode.com/paper/you-only-look-twice-rapid-multi-scale-object
Repo https://github.com/avanetten/yolt
Framework none
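
Evaluating images of arbitrary size usually means cutting them into fixed-size windows that a detector can handle. The sketch below computes overlapping window positions for such a tiling. The window size, the overlap fraction, and the function names are illustrative assumptions and are not taken from the YOLT code.

```python
def tile_origins(length, window, overlap):
    """Start offsets of windows of size `window` that cover [0, length)."""
    if length <= window:
        return [0]
    stride = max(1, int(window * (1.0 - overlap)))
    origins = list(range(0, length - window, stride))
    origins.append(length - window)  # final window sits flush with the edge
    return origins

def tile_image(width, height, window=416, overlap=0.15):
    """(x, y, w, h) boxes for every window needed to cover a width x height image."""
    return [
        (x, y, min(window, width), min(window, height))
        for y in tile_origins(height, window, overlap)
        for x in tile_origins(width, window, overlap)
    ]

print(tile_origins(1000, 416, 0.15))  # [0, 353, 584]
```

The overlap keeps an object that straddles one window's border fully visible in a neighbouring window. Detections from overlapping windows then need to be merged, for example with non-maximum suppression, which is not shown here.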