October 20, 2019


Paper Group AWR 184


Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification


Title	Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification
Authors	Cheng-Yaw Low, Jaewoo Park, Andrew Beng-Jin Teoh
Abstract	Stacking-based deep neural network (S-DNN) is aggregated with pluralities of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. Contrary to the DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is to be trained decisively and independently without BP intervention. In this paper, a ridge regression-based S-DNN, dubbed deep analytic network (DAN), along with its kernelization (K-DAN), are devised for multilayer feature re-learning from the pre-extracted baseline features and the structured features. Our theoretical formulation demonstrates that DAN/K-DAN re-learn by perturbing the intra/inter-class variations, apart from diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern classification on datasets of varying domains - faces, handwritten digits, generic objects, to name a few. Unlike the typical BP-optimized DNNs to be trained from gigantic datasets by GPU, we disclose that DAN/K-DAN are trainable using only CPU even for small-scale training sets. Our experimental results disclose that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including multilayer perceptron, deep belief network, etc., without data augmentation applied.
Tasks	Data Augmentation
Published	2018-11-17
URL	https://arxiv.org/abs/1811.07184v2
PDF	https://arxiv.org/pdf/1811.07184v2.pdf
PWC	https://paperswithcode.com/paper/stacking-based-deep-neural-network-deep
Repo	https://github.com/chengyawlow/DAN
Framework	none
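
The abstract above describes layers that are fit analytically by ridge regression rather than by backpropagation. As a rough illustration of that building block only, here is a minimal sketch of a closed-form ridge fit in plain Python. It is not the authors' DAN implementation; the function names (`solve`, `ridge_fit`, `predict`) and the toy data are assumptions made for this example.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    # Build the augmented matrix [a | b] so row operations update both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def ridge_fit(xs, ys, lam):
    """Closed-form ridge weights: w = (X^T X + lam * I)^-1 X^T y."""
    d = len(xs[0])
    gram = [
        [sum(x[i] * x[j] for x in xs) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(gram, xty)


def predict(w, x):
    """Linear prediction for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Toy data (hypothetical): targets follow y = 2*x0 + 3*x1 exactly.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [2.0, 3.0, 5.0]
weights = ridge_fit(features, targets, lam=1.0)
print(weights, predict(weights, [1.0, 1.0]))
```

Because the weights come from a single linear solve, a layer of this kind needs no gradient steps, which fits the abstract's point about CPU-only training. How the paper stacks such layers and re-learns features is not reproduced here.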

Sequence-Aware Recommender Systems


Title	Sequence-Aware Recommender Systems
Authors	Massimo Quadrana, Paolo Cremonesi, Dietmar Jannach
Abstract	Recommender systems are one of the most successful applications of data mining and machine learning technology in practice. Academic research in the field is historically often based on the matrix completion problem formulation, where for each user-item-pair only one interaction (e.g., a rating) is considered. In many application domains, however, multiple user-item interactions of different types can be recorded over time. And, a number of recent works have shown that this information can be used to build richer individual user models and to discover additional behavioral patterns that can be leveraged in the recommendation process. In this work we review existing works that consider information from such sequentially-ordered user-item interaction logs in the recommendation process. Based on this review, we propose a categorization of the corresponding recommendation tasks and goals, summarize existing algorithmic solutions, discuss methodological approaches when benchmarking what we call sequence-aware recommender systems, and outline open challenges in the area.
Tasks	Matrix Completion, Recommendation Systems
Published	2018-02-23
URL	https://arxiv.org/abs/1802.08452v1
PDF	https://arxiv.org/pdf/1802.08452v1.pdf
PWC	https://paperswithcode.com/paper/sequence-aware-recommender-systems
Repo	https://github.com/taylorhawks/Recommender
Framework	none

Single-View Place Recognition under Seasonal Changes


Title	Single-View Place Recognition under Seasonal Changes
Authors	Daniel Olid, José M. Fácil, Javier Civera
Abstract	Single-view place recognition, that we can define as finding an image that corresponds to the same place as a given query image, is a key capability for autonomous navigation and mapping. Although there has been a considerable amount of research in the topic, the high degree of image variability (with viewpoint, illumination or occlusions for example) makes it a research challenge. One of the particular challenges, that we address in this work, is weather variation. Seasonal changes can produce drastic appearance changes, that classic low-level features do not model properly. Our contributions in this paper are twofold. First we pre-process and propose a partition for the Nordland dataset, frequently used for place recognition research without consensus on the partitions. And second, we evaluate several neural network architectures such as pre-trained, siamese and triplet for this problem. Our best results outperform the state of the art of the field. A video showing our results can be found at https://youtu.be/VrlxsYZoHDM. The partitioned version of the Nordland dataset is available at http://webdiis.unizar.es/~jmfacil/pr-nordland/.
Tasks	Autonomous Navigation
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06516v1
PDF	http://arxiv.org/pdf/1808.06516v1.pdf
PWC	https://paperswithcode.com/paper/single-view-place-recognition-under-seasonal
Repo	https://github.com/jmfacil/single-view-place-recognition
Framework	caffe2

SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks


Title	SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
Authors	Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, Junjie Yan
Abstract	Siamese network based trackers formulate tracking as convolutional feature cross-correlation between target template and searching region. However, Siamese trackers still have accuracy gap compared with state-of-the-art algorithms and they cannot take advantage of feature from deep networks, such as ResNet-50 or deeper. In this work we prove the core reason comes from the lack of strict translation invariance. By comprehensive theoretical analysis and experimental validations, we break this restriction through a simple yet effective spatial aware sampling strategy and successfully train a ResNet-driven Siamese tracker with significant performance gain. Moreover, we propose a new model architecture to perform depth-wise and layer-wise aggregations, which not only further improves the accuracy but also reduces the model size. We conduct extensive ablation studies to demonstrate the effectiveness of the proposed tracker, which obtains currently the best results on four large tracking benchmarks, including OTB2015, VOT2018, UAV123, and LaSOT. Our model will be released to facilitate further studies based on this problem.
Tasks	Visual Object Tracking, Visual Tracking
Published	2018-12-31
URL	http://arxiv.org/abs/1812.11703v1
PDF	http://arxiv.org/pdf/1812.11703v1.pdf
PWC	https://paperswithcode.com/paper/siamrpn-evolution-of-siamese-visual-tracking
Repo	https://github.com/STVIR/pysot
Framework	pytorch
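
The abstract frames Siamese tracking as cross-correlation between a target template and a search region. The sketch below shows that operation on raw 2D grids in plain Python, purely for illustration. Real trackers such as the one described correlate learned convolutional features, not pixel values; the function names and the toy arrays are assumptions of this example.

```python
def cross_correlate(search, template):
    """Slide the template over the search grid and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for r in range(sh - th + 1):
        row = []
        for c in range(sw - tw + 1):
            # Inner product between the template and the window at (r, c).
            row.append(
                sum(
                    search[r + i][c + j] * template[i][j]
                    for i in range(th)
                    for j in range(tw)
                )
            )
        response.append(row)
    return response


def argmax_2d(response):
    """Return the (row, col) of the highest response value."""
    best = max((v, r, c) for r, row in enumerate(response) for c, v in enumerate(row))
    return best[1], best[2]


# Hypothetical 4x4 search region containing the 2x2 target at row 1, column 2.
template = [[1, 2], [3, 4]]
search = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
    [0, 0, 0, 0],
]
print(argmax_2d(cross_correlate(search, template)))
```

The peak of the response map is taken as the target location. Unnormalized correlation like this favors bright regions, which is one reason practical trackers operate on learned features instead.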

TallyQA: Answering Complex Counting Questions


Title	TallyQA: Answering Complex Counting Questions
Authors	Manoj Acharya, Kushal Kafle, Christopher Kanan
Abstract	Most counting questions in visual question answering (VQA) datasets are simple and require no more than object detection. Here, we study algorithms for complex counting questions that involve relationships between objects, attribute identification, reasoning, and more. To do this, we created TallyQA, the world’s largest dataset for open-ended counting. We propose a new algorithm for counting that uses relation networks with region proposals. Our method lets relation networks be efficiently used with high-resolution imagery. It yields state-of-the-art results compared to baseline and recent systems on both TallyQA and the HowMany-QA benchmark.
Tasks	Object Detection, Question Answering, Visual Question Answering
Published	2018-10-29
URL	http://arxiv.org/abs/1810.12440v2
PDF	http://arxiv.org/pdf/1810.12440v2.pdf
PWC	https://paperswithcode.com/paper/tallyqa-answering-complex-counting-questions
Repo	https://github.com/manoja328/tallyqa
Framework	none

3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration


Title	3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
Authors	Zi Jian Yew, Gim Hee Lee
Abstract	In this paper, we propose the 3DFeat-Net which learns both 3D feature detector and descriptor for point cloud matching using weak supervision. Unlike many existing works, we do not require manual annotation of matching point clusters. Instead, we leverage on alignment and attention mechanisms to learn feature correspondences from GPS/INS tagged 3D point clouds without explicitly specifying them. We create training and benchmark outdoor Lidar datasets, and experiments show that 3DFeat-Net obtains state-of-the-art performance on these gravity-aligned datasets.
Tasks	Point Cloud Registration
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09413v1
PDF	http://arxiv.org/pdf/1807.09413v1.pdf
PWC	https://paperswithcode.com/paper/3dfeat-net-weakly-supervised-local-3d
Repo	https://github.com/yewzijian/3DFeatNet
Framework	tf

Robustness May Be at Odds with Accuracy


Title	Robustness May Be at Odds with Accuracy
Authors	Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry
Abstract	We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
Tasks
Published	2018-05-30
URL	https://arxiv.org/abs/1805.12152v5
PDF	https://arxiv.org/pdf/1805.12152v5.pdf
PWC	https://paperswithcode.com/paper/robustness-may-be-at-odds-with-accuracy
Repo	https://github.com/louis2889184/pytorch-adversarial-training
Framework	pytorch

Crowdsourcing Semantic Label Propagation in Relation Classification


Title	Crowdsourcing Semantic Label Propagation in Relation Classification
Authors	Anca Dumitrache, Lora Aroyo, Chris Welty
Abstract	Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that indicates still more would be better. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth methodology for crowdsourcing, that captures ambiguity in annotations by measuring inter-annotator disagreement. Our approach propagates annotations to sentences that are similar in a low dimensional embedding space, expanding the number of labels by two orders of magnitude. Our experiments show significant improvement in a sentence-level multi-class relation classifier.
Tasks	Relation Classification, Relation Extraction
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00537v1
PDF	http://arxiv.org/pdf/1809.00537v1.pdf
PWC	https://paperswithcode.com/paper/crowdsourcing-semantic-label-propagation-in
Repo	https://github.com/CrowdTruth/Open-Domain-Relation-Extraction
Framework	none
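
The abstract describes propagating annotations to sentences that lie close together in a low-dimensional embedding space. A minimal sketch of that idea, assuming cosine similarity and a hard similarity threshold, follows. The paper's actual propagation rule and its use of CrowdTruth scores are not reproduced; the vectors, relation names, and threshold here are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate(labeled, unlabeled, threshold):
    """Copy the label of the most similar labeled vector when it is close enough.

    labeled: list of (vector, label) pairs
    unlabeled: list of vectors
    Returns one label (or None) per unlabeled vector.
    """
    result = []
    for vec in unlabeled:
        best_sim, best_label = max((cosine(vec, lv), lab) for lv, lab in labeled)
        result.append(best_label if best_sim >= threshold else None)
    return result


# Hypothetical 2-D sentence embeddings with crowd-annotated relations.
labeled = [([1.0, 0.0], "born_in"), ([0.0, 1.0], "works_for")]
unlabeled = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.0]]
print(propagate(labeled, unlabeled, threshold=0.9))
```

Sentences that are not similar enough to any annotated example stay unlabeled, which limits how much noise the propagation adds.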

Coarse-to-Fine Decoding for Neural Semantic Parsing


Title	Coarse-to-Fine Decoding for Neural Semantic Parsing
Authors	Li Dong, Mirella Lapata
Abstract	Semantic parsing aims at mapping natural language utterances into structured meaning representations. In this work, we propose a structure-aware neural architecture which decomposes the semantic parsing process into two stages. Given an input utterance, we first generate a rough sketch of its meaning, where low-level information (such as variable names and arguments) is glossed over. Then, we fill in missing details by taking into account the natural language input and the sketch itself. Experimental results on four datasets characteristic of different domains and meaning representations show that our approach consistently improves performance, achieving competitive results despite the use of relatively simple decoders.
Tasks	Semantic Parsing
Published	2018-05-12
URL	http://arxiv.org/abs/1805.04793v1
PDF	http://arxiv.org/pdf/1805.04793v1.pdf
PWC	https://paperswithcode.com/paper/coarse-to-fine-decoding-for-neural-semantic
Repo	https://github.com/donglixp/coarse2fine
Framework	pytorch

MPST: A Corpus of Movie Plot Synopses with Tags


Title	MPST: A Corpus of Movie Plot Synopses with Tags
Authors	Sudipta Kar, Suraj Maharjan, A. Pastor López-Monroy, Thamar Solorio
Abstract	Social tagging of movies reveals a wide range of heterogeneous information about movies, like the genre, plot structure, soundtracks, metadata, visual and emotional experiences. Such information can be valuable in building automatic systems to create tags for movies. Automatic tagging systems can help recommendation engines to improve the retrieval of similar movies as well as help viewers to know what to expect from a movie in advance. In this paper, we set out to the task of collecting a corpus of movie plot synopses and tags. We describe a methodology that enabled us to build a fine-grained set of around 70 tags exposing heterogeneous characteristics of movie plots and the multi-label associations of these tags with some 14K movie plot synopses. We investigate how these tags correlate with movies and the flow of emotions throughout different types of movies. Finally, we use this corpus to explore the feasibility of inferring tags from plot synopses. We expect the corpus will be useful in other tasks where analysis of narratives is relevant.
Tasks
Published	2018-02-22
URL	http://arxiv.org/abs/1802.07858v2
PDF	http://arxiv.org/pdf/1802.07858v2.pdf
PWC	https://paperswithcode.com/paper/mpst-a-corpus-of-movie-plot-synopses-with
Repo	https://github.com/anandborad/MPST
Framework	tf

SpectralNet: Spectral Clustering using Deep Neural Networks


Title	SpectralNet: Spectral Clustering using Deep Neural Networks
Authors	Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, Yuval Kluger
Abstract	Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample-extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, allow us to keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet .
Tasks	Stochastic Optimization
Published	2018-01-04
URL	http://arxiv.org/abs/1801.01587v6
PDF	http://arxiv.org/pdf/1801.01587v6.pdf
PWC	https://paperswithcode.com/paper/spectralnet-spectral-clustering-using-deep
Repo	https://github.com/chenjs12/ML
Framework	none
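
The abstract refers to pairwise Gaussian affinities and the graph Laplacian whose eigenspace the network learns to embed into. The sketch below builds both for a handful of points in plain Python, as background for classical spectral clustering only. It does not implement SpectralNet; the function names, the unnormalized Laplacian L = D - W, and the toy points are assumptions of this example.

```python
import math


def gaussian_affinity(points, sigma):
    """Pairwise affinities w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w


def laplacian(w):
    """Unnormalized graph Laplacian L = D - W, where D holds the row sums of W."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: two well-separated pairs of points.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
w = gaussian_affinity(points, sigma=1.0)
lap = laplacian(w)
print([round(v, 3) for v in lap[0]])
```

Points within a pair get affinities near 1 and points across pairs get affinities near 0, so the Laplacian is close to block diagonal. That block structure is what the leading eigenvectors expose for clustering.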

Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace


Title	Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace
Authors	Yoonho Lee, Seungjin Choi
Abstract	Gradient-based meta-learning methods leverage gradient descent to learn the commonalities among various tasks. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during meta-testing. Our primary contribution is the MT-net, which enables the meta-learner to learn on each layer’s activation space a subspace that the task-specific learner performs gradient descent on. Additionally, a task-specific learner of an MT-net performs gradient descent with respect to a meta-learned distance metric, which warps the activation space to be more sensitive to task identity. We demonstrate that the dimension of this learned subspace reflects the complexity of the task-specific learner’s adaptation task, and also that our model is less sensitive to the choice of initial learning rates than previous gradient-based meta-learning methods. Our method achieves state-of-the-art or comparable performance on few-shot classification and regression tasks.
Tasks	Few-Shot Image Classification, Meta-Learning
Published	2018-01-17
URL	http://arxiv.org/abs/1801.05558v3
PDF	http://arxiv.org/pdf/1801.05558v3.pdf
PWC	https://paperswithcode.com/paper/gradient-based-meta-learning-with-learned
Repo	https://github.com/yoonholee/MT-net
Framework	tf

Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development


Title	Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development
Authors	Jacopo Gobbi, Evgeny Stepanov, Giuseppe Riccardi
Abstract	Concept tagging is a type of structured learning needed for natural language understanding (NLU) systems. In this task, meaning labels from a domain ontology are assigned to word sequences. In this paper, we review the algorithms developed over the last twenty five years. We perform a comparative evaluation of generative, discriminative and deep learning methods on two public datasets. We report on the statistical variability performance measurements. The third contribution is the release of a repository of the algorithms, datasets and recipes for NLU evaluation.
Tasks
Published	2018-07-27
URL	http://arxiv.org/abs/1807.10661v1
PDF	http://arxiv.org/pdf/1807.10661v1.pdf
PWC	https://paperswithcode.com/paper/concept-tagging-for-natural-language
Repo	https://github.com/fruttasecca/concept-tagging-with-neural-networks
Framework	pytorch

Road User Abnormal Trajectory Detection using a Deep Autoencoder


Title	Road User Abnormal Trajectory Detection using a Deep Autoencoder
Authors	Pankaj Raj Roy, Guillaume-Alexandre Bilodeau
Abstract	In this paper, we focus on the development of a method that detects abnormal trajectories of road users at traffic intersections. The main difficulty with this is the fact that there are very few abnormal data and the normal ones are insufficient for the training of any kind of machine learning model. To tackle these problems, we proposed the solution of using a deep autoencoder network trained solely through augmented data considered as normal. By generating artificial abnormal trajectories, our method is tested on four different outdoor urban users scenes and performs better compared to some classical outlier detection methods.
Tasks	Outlier Detection
Published	2018-08-25
URL	http://arxiv.org/abs/1809.00957v1
PDF	http://arxiv.org/pdf/1809.00957v1.pdf
PWC	https://paperswithcode.com/paper/road-user-abnormal-trajectory-detection-using
Repo	https://github.com/proy3/Abnormal_Trajectory_Classifier
Framework	tf

You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery


Title	You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery
Authors	Adam Van Etten
Abstract	Detection of small objects in large swaths of imagery is one of the primary problems in satellite imagery analytics. While object detection in ground-based imagery has benefited from research into new deep learning approaches, transitioning such technology to overhead imagery is nontrivial. Among the challenges is the sheer number of pixels and geographic extent per image: a single DigitalGlobe satellite image encompasses >64 km² and over 250 million pixels. Another challenge is that objects of interest are minuscule (often only ~10 pixels in extent), which complicates traditional computer vision techniques. To address these issues, we propose a pipeline (You Only Look Twice, or YOLT) that evaluates satellite images of arbitrary size at a rate of >0.5 km²/s. The proposed approach can rapidly detect objects of vastly different scales with relatively little training data over multiple sensors. We evaluate large test images at native resolution, and yield scores of F1 > 0.8 for vehicle localization. We further explore resolution and object size requirements by systematically testing the pipeline at decreasing resolution, and conclude that objects only ~5 pixels in size can still be localized with high confidence. Code is available at https://github.com/CosmiQ/yolt.
Tasks	Object Detection
Published	2018-05-24
URL	http://arxiv.org/abs/1805.09512v1
PDF	http://arxiv.org/pdf/1805.09512v1.pdf
PWC	https://paperswithcode.com/paper/you-only-look-twice-rapid-multi-scale-object
Repo	https://github.com/avanetten/yolt
Framework	none
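
The abstract reports localization quality as an F1 score. For readers unfamiliar with the metric, here is a minimal sketch of how F1 follows from detection counts. The counts are hypothetical and are not taken from the paper, and the rule for matching detections to ground truth is outside the scope of this example.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 is the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 vehicles found, 2 spurious boxes, 2 vehicles missed.
print(f1_score(true_pos=8, false_pos=2, false_neg=2))
```

F1 penalizes both spurious detections and misses, so a detector cannot score well by simply emitting many boxes or very few.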