Paper Group AWR 184
Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification
Title | Stacking-Based Deep Neural Network: Deep Analytic Network for Pattern Classification |
Authors | Cheng-Yaw Low, Jaewoo Park, Andrew Beng-Jin Teoh |
Abstract | Stacking-based deep neural network (S-DNN) is aggregated with pluralities of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. Contrary to the DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is to be trained decisively and independently without BP intervention. In this paper, a ridge regression-based S-DNN, dubbed deep analytic network (DAN), along with its kernelization (K-DAN), are devised for multilayer feature re-learning from the pre-extracted baseline features and the structured features. Our theoretical formulation demonstrates that DAN/K-DAN re-learn by perturbing the intra/inter-class variations, apart from diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern classification on datasets of varying domains - faces, handwritten digits, generic objects, to name a few. Unlike the typical BP-optimized DNNs to be trained from gigantic datasets by GPU, we disclose that DAN/K-DAN are trainable using only CPU even for small-scale training sets. Our experimental results disclose that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including multilayer perceptron, deep belief network, etc., without data augmentation applied. |
Tasks | Data Augmentation |
Published | 2018-11-17 |
URL | https://arxiv.org/abs/1811.07184v2, https://arxiv.org/pdf/1811.07184v2.pdf |
PWC | https://paperswithcode.com/paper/stacking-based-deep-neural-network-deep |
Repo | https://github.com/chengyawlow/DAN |
Framework | none |
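The abstract describes stacking ridge-regression modules that are each trained in closed form, without backpropagation. The block below is a minimal sketch of that general idea only, not the DAN formulation from the paper. The helper names (`ridge_fit`, `stack_fit`, `stack_predict`), the regularizer `lam`, and the choice to feed each layer the original features plus the previous layer's rectified scores are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam*I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def stack_fit(X, Y, n_layers=3, lam=1.0):
    """Train layers one after another; no gradients flow between them."""
    weights, feats = [], X
    for _ in range(n_layers):
        W = ridge_fit(feats, Y, lam)
        weights.append(W)
        # Assumption: the next layer sees the original features plus
        # the current layer's rectified class scores.
        feats = np.hstack([X, np.maximum(feats @ W, 0.0)])
    return weights

def stack_predict(X, weights):
    """Replay the layer-by-layer feature construction, then take the argmax."""
    feats = X
    for W in weights[:-1]:
        feats = np.hstack([X, np.maximum(feats @ W, 0.0)])
    return (feats @ weights[-1]).argmax(axis=1)
```

Here `Y` is expected to be a one-hot label matrix. Each layer costs one linear solve, which is why a scheme of this kind can run on a CPU with small training sets.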
Sequence-Aware Recommender Systems
Title | Sequence-Aware Recommender Systems |
Authors | Massimo Quadrana, Paolo Cremonesi, Dietmar Jannach |
Abstract | Recommender systems are one of the most successful applications of data mining and machine learning technology in practice. Academic research in the field is historically often based on the matrix completion problem formulation, where for each user-item-pair only one interaction (e.g., a rating) is considered. In many application domains, however, multiple user-item interactions of different types can be recorded over time. And, a number of recent works have shown that this information can be used to build richer individual user models and to discover additional behavioral patterns that can be leveraged in the recommendation process. In this work we review existing works that consider information from such sequentially-ordered user-item interaction logs in the recommendation process. Based on this review, we propose a categorization of the corresponding recommendation tasks and goals, summarize existing algorithmic solutions, discuss methodological approaches when benchmarking what we call sequence-aware recommender systems, and outline open challenges in the area. |
Tasks | Matrix Completion, Recommendation Systems |
Published | 2018-02-23 |
URL | https://arxiv.org/abs/1802.08452v1, https://arxiv.org/pdf/1802.08452v1.pdf |
PWC | https://paperswithcode.com/paper/sequence-aware-recommender-systems |
Repo | https://github.com/taylorhawks/Recommender |
Framework | none |
Single-View Place Recognition under Seasonal Changes
Title | Single-View Place Recognition under Seasonal Changes |
Authors | Daniel Olid, José M. Fácil, Javier Civera |
Abstract | Single-view place recognition, which we can define as finding an image that corresponds to the same place as a given query image, is a key capability for autonomous navigation and mapping. Although there has been a considerable amount of research on the topic, the high degree of image variability (with viewpoint, illumination or occlusions for example) makes it a research challenge. One of the particular challenges, which we address in this work, is weather variation. Seasonal changes can produce drastic appearance changes that classic low-level features do not model properly. Our contributions in this paper are twofold. First, we pre-process and propose a partition for the Nordland dataset, frequently used for place recognition research without consensus on the partitions. Second, we evaluate several neural network architectures such as pre-trained, siamese and triplet for this problem. Our best results outperform the state of the art of the field. A video showing our results can be found at https://youtu.be/VrlxsYZoHDM. The partitioned version of the Nordland dataset is available at http://webdiis.unizar.es/~jmfacil/pr-nordland/. |
Tasks | Autonomous Navigation |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06516v1, http://arxiv.org/pdf/1808.06516v1.pdf |
PWC | https://paperswithcode.com/paper/single-view-place-recognition-under-seasonal |
Repo | https://github.com/jmfacil/single-view-place-recognition |
Framework | caffe2 |
SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
Title | SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks |
Authors | Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, Junjie Yan |
Abstract | Siamese network based trackers formulate tracking as convolutional feature cross-correlation between target template and searching region. However, Siamese trackers still have an accuracy gap compared with state-of-the-art algorithms and they cannot take advantage of features from deep networks, such as ResNet-50 or deeper. In this work we prove the core reason comes from the lack of strict translation invariance. By comprehensive theoretical analysis and experimental validations, we break this restriction through a simple yet effective spatial aware sampling strategy and successfully train a ResNet-driven Siamese tracker with significant performance gain. Moreover, we propose a new model architecture to perform depth-wise and layer-wise aggregations, which not only further improves the accuracy but also reduces the model size. We conduct extensive ablation studies to demonstrate the effectiveness of the proposed tracker, which obtains currently the best results on four large tracking benchmarks, including OTB2015, VOT2018, UAV123, and LaSOT. Our model will be released to facilitate further studies based on this problem. |
Tasks | Visual Object Tracking, Visual Tracking |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1812.11703v1, http://arxiv.org/pdf/1812.11703v1.pdf |
PWC | https://paperswithcode.com/paper/siamrpn-evolution-of-siamese-visual-tracking |
Repo | https://github.com/STVIR/pysot |
Framework | pytorch |
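The abstract frames Siamese tracking as cross-correlation between template features and search-region features, and mentions depth-wise aggregation. A minimal NumPy sketch of channel-wise (depth-wise) cross-correlation might look as follows. The function name `depthwise_xcorr`, the array shapes, and the loop-based implementation are illustrative assumptions and are not taken from the pysot repository.

```python
import numpy as np

def depthwise_xcorr(search, template):
    """Correlate each channel of `search` with the same channel of `template`.

    search:   (C, Hs, Ws) feature map of the search region
    template: (C, Ht, Wt) feature map of the target template
    returns:  (C, Hs-Ht+1, Ws-Wt+1) response map, one plane per channel
    """
    C, Hs, Ws = search.shape
    _, Ht, Wt = template.shape
    out = np.zeros((C, Hs - Ht + 1, Ws - Wt + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            # Slide the template over the search map, channel by channel.
            window = search[:, i:i + Ht, j:j + Wt]
            out[:, i, j] = (window * template).sum(axis=(1, 2))
    return out
```

Summing the response over channels and taking the argmax gives a crude location estimate for the template inside the search region. A real tracker would follow this with learned prediction heads.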
TallyQA: Answering Complex Counting Questions
Title | TallyQA: Answering Complex Counting Questions |
Authors | Manoj Acharya, Kushal Kafle, Christopher Kanan |
Abstract | Most counting questions in visual question answering (VQA) datasets are simple and require no more than object detection. Here, we study algorithms for complex counting questions that involve relationships between objects, attribute identification, reasoning, and more. To do this, we created TallyQA, the world’s largest dataset for open-ended counting. We propose a new algorithm for counting that uses relation networks with region proposals. Our method lets relation networks be efficiently used with high-resolution imagery. It yields state-of-the-art results compared to baseline and recent systems on both TallyQA and the HowMany-QA benchmark. |
Tasks | Object Detection, Question Answering, Visual Question Answering |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12440v2, http://arxiv.org/pdf/1810.12440v2.pdf |
PWC | https://paperswithcode.com/paper/tallyqa-answering-complex-counting-questions |
Repo | https://github.com/manoja328/tallyqa |
Framework | none |
3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
Title | 3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration |
Authors | Zi Jian Yew, Gim Hee Lee |
Abstract | In this paper, we propose the 3DFeat-Net which learns both a 3D feature detector and a descriptor for point cloud matching using weak supervision. Unlike many existing works, we do not require manual annotation of matching point clusters. Instead, we leverage alignment and attention mechanisms to learn feature correspondences from GPS/INS tagged 3D point clouds without explicitly specifying them. We create training and benchmark outdoor Lidar datasets, and experiments show that 3DFeat-Net obtains state-of-the-art performance on these gravity-aligned datasets. |
Tasks | Point Cloud Registration |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09413v1, http://arxiv.org/pdf/1807.09413v1.pdf |
PWC | https://paperswithcode.com/paper/3dfeat-net-weakly-supervised-local-3d |
Repo | https://github.com/yewzijian/3DFeatNet |
Framework | tf |
Robustness May Be at Odds with Accuracy
Title | Robustness May Be at Odds with Accuracy |
Authors | Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry |
Abstract | We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception. |
Tasks | |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.12152v5, https://arxiv.org/pdf/1805.12152v5.pdf |
PWC | https://paperswithcode.com/paper/robustness-may-be-at-odds-with-accuracy |
Repo | https://github.com/louis2889184/pytorch-adversarial-training |
Framework | pytorch |
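To make "robustness to adversarial perturbations" concrete, the sketch below computes the worst-case margin of a linear classifier under an l-infinity perturbation budget. It is a textbook illustration rather than the setting analysed in the paper. The function name `worst_case_margin` and the toy numbers in the usage note are assumptions.

```python
import numpy as np

def worst_case_margin(w, x, y, eps):
    """Smallest value of y * (w @ x_adv) over all x_adv with ||x_adv - x||_inf <= eps.

    For a linear score the minimizer moves every coordinate by eps against
    the label, so the margin drops by exactly eps * ||w||_1.
    """
    x_adv = x - eps * y * np.sign(w)
    return y * (w @ x_adv), x_adv
```

With `w = [2, -1, 0.5]`, `x = [1, 1, 1]` and `y = 1`, the clean margin is 1.5. A budget of `eps = 0.5` lowers it by `0.5 * 3.5 = 1.75` to -0.25, so a correctly classified point becomes misclassified.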
Crowdsourcing Semantic Label Propagation in Relation Classification
Title | Crowdsourcing Semantic Label Propagation in Relation Classification |
Authors | Anca Dumitrache, Lora Aroyo, Chris Welty |
Abstract | Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that indicates still more would be better. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth methodology for crowdsourcing, that captures ambiguity in annotations by measuring inter-annotator disagreement. Our approach propagates annotations to sentences that are similar in a low dimensional embedding space, expanding the number of labels by two orders of magnitude. Our experiments show significant improvement in a sentence-level multi-class relation classifier. |
Tasks | Relation Classification, Relation Extraction |
Published | 2018-09-03 |
URL | http://arxiv.org/abs/1809.00537v1, http://arxiv.org/pdf/1809.00537v1.pdf |
PWC | https://paperswithcode.com/paper/crowdsourcing-semantic-label-propagation-in |
Repo | https://github.com/CrowdTruth/Open-Domain-Relation-Extraction |
Framework | none |
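The abstract describes propagating annotations to sentences that are similar in a low-dimensional embedding space. One simple way to do this is nearest-neighbor propagation with a cosine-similarity cutoff, sketched below. The function name `propagate_labels`, the threshold `min_sim`, and the use of -1 for "no label" are assumptions for illustration. The paper's actual propagation rule and its use of CrowdTruth scores are not reproduced here.

```python
import numpy as np

def propagate_labels(labeled_vecs, labels, unlabeled_vecs, min_sim=0.8):
    """Give each unlabeled sentence the label of its most similar labeled one.

    Returns an array of labels, with -1 where no labeled sentence reaches
    a cosine similarity of at least `min_sim`.
    """
    def unit(v):
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    # Cosine similarities, shape (n_unlabeled, n_labeled).
    sims = unit(np.asarray(unlabeled_vecs, dtype=float)) @ unit(np.asarray(labeled_vecs, dtype=float)).T
    nearest = sims.argmax(axis=1)
    best = sims.max(axis=1)
    return np.where(best >= min_sim, np.asarray(labels)[nearest], -1)
```

The cutoff trades coverage against label noise: lowering `min_sim` labels more sentences but copies labels across less similar pairs.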
Coarse-to-Fine Decoding for Neural Semantic Parsing
Title | Coarse-to-Fine Decoding for Neural Semantic Parsing |
Authors | Li Dong, Mirella Lapata |
Abstract | Semantic parsing aims at mapping natural language utterances into structured meaning representations. In this work, we propose a structure-aware neural architecture which decomposes the semantic parsing process into two stages. Given an input utterance, we first generate a rough sketch of its meaning, where low-level information (such as variable names and arguments) is glossed over. Then, we fill in missing details by taking into account the natural language input and the sketch itself. Experimental results on four datasets characteristic of different domains and meaning representations show that our approach consistently improves performance, achieving competitive results despite the use of relatively simple decoders. |
Tasks | Semantic Parsing |
Published | 2018-05-12 |
URL | http://arxiv.org/abs/1805.04793v1, http://arxiv.org/pdf/1805.04793v1.pdf |
PWC | https://paperswithcode.com/paper/coarse-to-fine-decoding-for-neural-semantic |
Repo | https://github.com/donglixp/coarse2fine |
Framework | pytorch |
MPST: A Corpus of Movie Plot Synopses with Tags
Title | MPST: A Corpus of Movie Plot Synopses with Tags |
Authors | Sudipta Kar, Suraj Maharjan, A. Pastor López-Monroy, Thamar Solorio |
Abstract | Social tagging of movies reveals a wide range of heterogeneous information about movies, like the genre, plot structure, soundtracks, metadata, visual and emotional experiences. Such information can be valuable in building automatic systems to create tags for movies. Automatic tagging systems can help recommendation engines to improve the retrieval of similar movies as well as help viewers to know what to expect from a movie in advance. In this paper, we set out to collect a corpus of movie plot synopses and tags. We describe a methodology that enabled us to build a fine-grained set of around 70 tags exposing heterogeneous characteristics of movie plots and the multi-label associations of these tags with some 14K movie plot synopses. We investigate how these tags correlate with movies and the flow of emotions throughout different types of movies. Finally, we use this corpus to explore the feasibility of inferring tags from plot synopses. We expect the corpus will be useful in other tasks where analysis of narratives is relevant. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07858v2, http://arxiv.org/pdf/1802.07858v2.pdf |
PWC | https://paperswithcode.com/paper/mpst-a-corpus-of-movie-plot-synopses-with |
Repo | https://github.com/anandborad/MPST |
Framework | tf |
SpectralNet: Spectral Clustering using Deep Neural Networks
Title | SpectralNet: Spectral Clustering using Deep Neural Networks |
Authors | Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, Yuval Kluger |
Abstract | Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample-extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, allow us to keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet . |
Tasks | Stochastic Optimization |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01587v6, http://arxiv.org/pdf/1801.01587v6.pdf |
PWC | https://paperswithcode.com/paper/spectralnet-spectral-clustering-using-deep |
Repo | https://github.com/chenjs12/ML |
Framework | none |
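The abstract mentions a special-purpose output layer that keeps the network output orthogonal. As a rough illustration of what such a constraint does to a batch of outputs, the sketch below orthonormalizes the columns of a matrix with a QR decomposition. Using QR, and the helper name `orthonormalize`, are assumptions made here. The abstract does not say how the layer is implemented.

```python
import numpy as np

def orthonormalize(Y):
    """Return a matrix with the same column space as Y and orthonormal columns.

    Y has shape (m, k) with m >= k: one row per data point in the batch,
    one column per embedding dimension.
    """
    Q, _ = np.linalg.qr(Y)  # reduced QR, so Q also has shape (m, k)
    return Q
```

After this step `Q.T @ Q` is the identity up to floating-point error, so no two embedding dimensions carry redundant information within the batch.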
Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace
Title | Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace |
Authors | Yoonho Lee, Seungjin Choi |
Abstract | Gradient-based meta-learning methods leverage gradient descent to learn the commonalities among various tasks. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during meta-testing. Our primary contribution is the MT-net, which enables the meta-learner to learn on each layer’s activation space a subspace that the task-specific learner performs gradient descent on. Additionally, a task-specific learner of an MT-net performs gradient descent with respect to a meta-learned distance metric, which warps the activation space to be more sensitive to task identity. We demonstrate that the dimension of this learned subspace reflects the complexity of the task-specific learner’s adaptation task, and also that our model is less sensitive to the choice of initial learning rates than previous gradient-based meta-learning methods. Our method achieves state-of-the-art or comparable performance on few-shot classification and regression tasks. |
Tasks | Few-Shot Image Classification, Meta-Learning |
Published | 2018-01-17 |
URL | http://arxiv.org/abs/1801.05558v3, http://arxiv.org/pdf/1801.05558v3.pdf |
PWC | https://paperswithcode.com/paper/gradient-based-meta-learning-with-learned |
Repo | https://github.com/yoonholee/MT-net |
Framework | tf |
Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development
Title | Concept Tagging for Natural Language Understanding: Two Decadelong Algorithm Development |
Authors | Jacopo Gobbi, Evgeny Stepanov, Giuseppe Riccardi |
Abstract | Concept tagging is a type of structured learning needed for natural language understanding (NLU) systems. In this task, meaning labels from a domain ontology are assigned to word sequences. In this paper, we review the algorithms developed over the last twenty-five years. We perform a comparative evaluation of generative, discriminative and deep learning methods on two public datasets. We report on the statistical variability of the performance measurements. The third contribution is the release of a repository of the algorithms, datasets and recipes for NLU evaluation. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10661v1, http://arxiv.org/pdf/1807.10661v1.pdf |
PWC | https://paperswithcode.com/paper/concept-tagging-for-natural-language |
Repo | https://github.com/fruttasecca/concept-tagging-with-neural-networks |
Framework | pytorch |
Road User Abnormal Trajectory Detection using a Deep Autoencoder
Title | Road User Abnormal Trajectory Detection using a Deep Autoencoder |
Authors | Pankaj Raj Roy, Guillaume-Alexandre Bilodeau |
Abstract | In this paper, we focus on the development of a method that detects abnormal trajectories of road users at traffic intersections. The main difficulty with this is the fact that there are very few abnormal data and the normal ones are insufficient for the training of any kind of machine learning model. To tackle these problems, we propose the solution of using a deep autoencoder network trained solely through augmented data considered as normal. By generating artificial abnormal trajectories, our method is tested on four different outdoor urban road user scenes and performs better compared to some classical outlier detection methods. |
Tasks | Outlier Detection |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1809.00957v1, http://arxiv.org/pdf/1809.00957v1.pdf |
PWC | https://paperswithcode.com/paper/road-user-abnormal-trajectory-detection-using |
Repo | https://github.com/proy3/Abnormal_Trajectory_Classifier |
Framework | tf |
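An autoencoder trained only on normal data is commonly used for outlier detection by flagging inputs that it reconstructs poorly. The sketch below shows that pattern with a linear autoencoder fitted by SVD, which stands in for the deep network in the paper. The helper names, the 99% quantile threshold, and the assumption that trajectories are already encoded as fixed-length vectors are all illustrative choices. The abstract does not state the paper's actual scoring rule.

```python
import numpy as np

def fit_linear_autoencoder(X_normal, n_components=2):
    """Fit a linear encoder/decoder on normal data only (equivalent to PCA)."""
    mean = X_normal.mean(axis=0)
    # The top right-singular vectors span the best linear code space.
    _, _, Vt = np.linalg.svd(X_normal - mean, full_matrices=False)
    return mean, Vt[:n_components]

def reconstruction_error(X, mean, components):
    """Mean squared error between each row and its encode/decode round trip."""
    centered = X - mean
    recon = centered @ components.T @ components
    return np.mean((centered - recon) ** 2, axis=1)

def fit_threshold(X_normal, mean, components, quantile=0.99):
    """Choose the cutoff so that about 1% of normal training rows exceed it."""
    return np.quantile(reconstruction_error(X_normal, mean, components), quantile)
```

A trajectory vector `x` would then be flagged as abnormal when `reconstruction_error(x[None, :], mean, components)[0]` exceeds the fitted threshold.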
You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery
Title | You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery |
Authors | Adam Van Etten |
Abstract | Detection of small objects in large swaths of imagery is one of the primary problems in satellite imagery analytics. While object detection in ground-based imagery has benefited from research into new deep learning approaches, transitioning such technology to overhead imagery is nontrivial. Among the challenges is the sheer number of pixels and geographic extent per image: a single DigitalGlobe satellite image encompasses >64 km² and over 250 million pixels. Another challenge is that objects of interest are minuscule (often only ~10 pixels in extent), which complicates traditional computer vision techniques. To address these issues, we propose a pipeline (You Only Look Twice, or YOLT) that evaluates satellite images of arbitrary size at a rate of >0.5 km²/s. The proposed approach can rapidly detect objects of vastly different scales with relatively little training data over multiple sensors. We evaluate large test images at native resolution, and yield scores of F1 > 0.8 for vehicle localization. We further explore resolution and object size requirements by systematically testing the pipeline at decreasing resolution, and conclude that objects only ~5 pixels in size can still be localized with high confidence. Code is available at https://github.com/CosmiQ/yolt. |
Tasks | Object Detection |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09512v1, http://arxiv.org/pdf/1805.09512v1.pdf |
PWC | https://paperswithcode.com/paper/you-only-look-twice-rapid-multi-scale-object |
Repo | https://github.com/avanetten/yolt |
Framework | none |