July 27, 2019

Paper Group ANR 604

Learning Large-Scale Topological Maps Using Sum-Product Networks

Title Learning Large-Scale Topological Maps Using Sum-Product Networks
Authors Kaiyu Zheng
Abstract In order to perform complex actions in human environments, an autonomous robot needs the ability to understand the environment, that is, to gather and maintain spatial knowledge. Topological maps are commonly used for representing large-scale, global maps such as floor plans. Although much work has been done in topological map extraction, we have found little previous work on the problem of learning the topological map using a probabilistic model. Learning a topological map means learning the structure of the large-scale space and dependency between places, for example, how the evidence of a group of places influences the attributes of other places. This is an important step towards planning complex actions in the environment. In this thesis, we consider the problem of using a probabilistic deep learning model to learn the topological map, which is essentially a sparse undirected graph where nodes represent places annotated with their semantic attributes (e.g. place category). We propose to use a novel probabilistic deep model, Sum-Product Networks (SPNs), due to their unique properties. We present two methods for learning topological maps using SPNs: the place grid method and the template-based method. We contribute an algorithm that builds SPNs for graphs using template models. Our experiments evaluate the ability of our models to enable robots to infer semantic attributes and detect maps with novel semantic attribute arrangements. Our results demonstrate their understanding of the topological map structure and spatial relations between places.
Tasks
Published 2017-06-11
URL http://arxiv.org/abs/1706.03416v2
PDF http://arxiv.org/pdf/1706.03416v2.pdf
PWC https://paperswithcode.com/paper/learning-large-scale-topological-maps-using
Repo
Framework

Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension

Title Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension
Authors Todor Mihaylov, Zornitsa Kozareva, Anette Frank
Abstract Reading comprehension is a challenging task in natural language processing and requires a set of skills to be solved. While current approaches focus on solving the task as a whole, in this paper, we propose to use a neural network 'skill' transfer approach. We transfer knowledge from several lower-level language tasks (skills) including textual entailment, named entity recognition, paraphrase detection and question type classification into the reading comprehension model. We conduct an empirical evaluation and show that transferring language skill knowledge leads to significant improvements for the task with much fewer steps compared to the baseline model. We also show that the skill transfer approach is effective even with small amounts of training data. Another finding of this work is that using token-wise deep label supervision for text classification improves the performance of transfer learning.
Tasks Named Entity Recognition, Natural Language Inference, Reading Comprehension, Text Classification, Transfer Learning
Published 2017-11-10
URL http://arxiv.org/abs/1711.03754v1
PDF http://arxiv.org/pdf/1711.03754v1.pdf
PWC https://paperswithcode.com/paper/neural-skill-transfer-from-supervised
Repo
Framework

Distributed Robust Subspace Recovery

Title Distributed Robust Subspace Recovery
Authors Vahan Huroyan, Gilad Lerman
Abstract We propose distributed solutions to the problem of Robust Subspace Recovery (RSR). Our setting assumes a huge dataset in an ad hoc network without a central processor, where each node has access only to one chunk of the dataset. Furthermore, part of the whole dataset lies around a low-dimensional subspace and the other part is composed of outliers that lie away from that subspace. The goal is to recover the underlying subspace for the whole dataset, without transferring the data itself between the nodes. We first apply the Consensus-Based Gradient method to the Geometric Median Subspace algorithm for RSR. For this purpose, we propose an iterative solution for the local dual minimization problem and establish its r-linear convergence. We then explain how to distributedly implement the Reaper and Fast Median Subspace algorithms for RSR. The proposed algorithms display competitive performance on both synthetic and real data.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09382v3
PDF http://arxiv.org/pdf/1705.09382v3.pdf
PWC https://paperswithcode.com/paper/distributed-robust-subspace-recovery
Repo
Framework

SCDA: School Compatibility Decomposition Algorithm for Solving the Multi-School Bus Routing and Scheduling Problem

Title SCDA: School Compatibility Decomposition Algorithm for Solving the Multi-School Bus Routing and Scheduling Problem
Authors Zhongxiang Wang, Ali Shafahi, Ali Haghani
Abstract Safely serving the school transportation demand with the minimum number of buses is one of the highest financial goals of school transportation directors. To achieve that objective, a good and efficient way to solve the routing and scheduling problem is required. With the growth of computing power, attention has turned to solving the combined school bus routing and scheduling problem. We show that an integrated multi-school bus routing and scheduling problem can be formulated with the help of trip compatibility. A novel decomposition algorithm is proposed to solve the integrated model. The merit of this integrated model and the decomposition method is that, with the consideration of trip compatibility, the interrelationship between the routing and scheduling sub-problems is not lost in the process of decomposition. Results show that the proposed decomposed problem can provide solutions using the same number of buses as the integrated model in much shorter time (as little as 0.6%) and that the proposed method can use up to 26% fewer buses than existing research.
Tasks
Published 2017-11-01
URL http://arxiv.org/abs/1711.00532v3
PDF http://arxiv.org/pdf/1711.00532v3.pdf
PWC https://paperswithcode.com/paper/scda-school-compatibility-decomposition
Repo
Framework

25 Tweets to Know You: A New Model to Predict Personality with Social Media

Title 25 Tweets to Know You: A New Model to Predict Personality with Social Media
Authors Pierre-Hadrien Arnoux, Anbang Xu, Neil Boyette, Jalal Mahmud, Rama Akkiraju, Vibha Sinha
Abstract Predicting personality is essential for social applications supporting human-centered activities, yet prior modeling methods based on user-written text require too much input data to be realistically used in the context of social media. In this work, we aim to drastically reduce the data requirement for personality modeling and develop a model that is applicable to most users on Twitter. Our model integrates Word Embedding features with Gaussian Processes regression. Based on the evaluation of over 1.3K users on Twitter, we find that our model achieves comparable or better accuracy than state-of-the-art techniques with 8 times less data.
Tasks Gaussian Processes
Published 2017-04-18
URL http://arxiv.org/abs/1704.05513v1
PDF http://arxiv.org/pdf/1704.05513v1.pdf
PWC https://paperswithcode.com/paper/25-tweets-to-know-you-a-new-model-to-predict
Repo
Framework

Drone-based Object Counting by Spatially Regularized Regional Proposal Network

Title Drone-based Object Counting by Spatially Regularized Regional Proposal Network
Authors Meng-Ru Hsieh, Yen-Liang Lin, Winston H. Hsu
Abstract Existing counting methods often adopt regression-based approaches and cannot precisely localize the target objects, which hinders further analysis (e.g., high-level understanding and fine-grained classification). In addition, most prior work focuses mainly on counting objects in static environments with fixed cameras. Motivated by the advent of unmanned flying vehicles (i.e., drones), we are interested in detecting and counting objects in such dynamic environments. We propose Layout Proposal Networks (LPNs) and spatial kernels to simultaneously count and localize target objects (e.g., cars) in videos recorded by the drone. Different from the conventional region proposal methods, we leverage the spatial layout information (e.g., cars often park regularly) and introduce these spatially regularized constraints into our network to improve the localization accuracy. To evaluate our counting method, we present a new large-scale car parking lot dataset (CARPK) that contains nearly 90,000 cars captured from different parking lots. To the best of our knowledge, it is the first and the largest drone view dataset that supports object counting, and provides the bounding box annotations.
Tasks Object Counting
Published 2017-07-19
URL http://arxiv.org/abs/1707.05972v3
PDF http://arxiv.org/pdf/1707.05972v3.pdf
PWC https://paperswithcode.com/paper/drone-based-object-counting-by-spatially
Repo
Framework

Twin Learning for Similarity and Clustering: A Unified Kernel Approach

Title Twin Learning for Similarity and Clustering: A Unified Kernel Approach
Authors Zhao Kang, Chong Peng, Qiang Cheng
Abstract Many similarity-based clustering methods work in two separate steps including similarity matrix computation and subsequent spectral clustering. However, similarity measurement is challenging because it is usually impacted by many factors, e.g., the choice of similarity metric, neighborhood size, scale of data, noise and outliers. Thus the learned similarity matrix is often not suitable, let alone optimal, for the subsequent clustering. In addition, nonlinear similarity often exists in many real world data which, however, has not been effectively considered by most existing methods. To tackle these two challenges, we propose a model to simultaneously learn cluster indicator matrix and similarity information in kernel spaces in a principled way. We show theoretical relationships to kernel k-means, k-means, and spectral clustering methods. Then, to address the practical issue of how to select the most suitable kernel for a particular clustering task, we further extend our model with a multiple kernel learning ability. With this joint model, we can automatically accomplish three subtasks of finding the best cluster indicator matrix, the most accurate similarity relations and the optimal combination of multiple kernels. By leveraging the interactions between these three subtasks in a joint framework, each subtask can be iteratively boosted by using the results of the others towards an overall optimal solution. Extensive experiments are performed to demonstrate the effectiveness of our method.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00678v2
PDF http://arxiv.org/pdf/1705.00678v2.pdf
PWC https://paperswithcode.com/paper/twin-learning-for-similarity-and-clustering-a
Repo
Framework

Context Embedding Networks

Title Context Embedding Networks
Authors Kun Ho Kim, Oisin Mac Aodha, Pietro Perona
Abstract Low dimensional embeddings that capture the main variations of interest in collections of data are important for many applications. One way to construct these embeddings is to acquire estimates of similarity from the crowd. However, similarity is a multi-dimensional concept that varies from individual to individual. Existing models for learning embeddings from the crowd typically make simplifying assumptions such as all individuals estimate similarity using the same criteria, the list of criteria is known in advance, or that the crowd workers are not influenced by the data that they see. To overcome these limitations we introduce Context Embedding Networks (CENs). In addition to learning interpretable embeddings from images, CENs also model worker biases for different attributes along with the visual context i.e. the visual attributes highlighted by a set of images. Experiments on two noisy crowd annotated datasets show that modeling both worker bias and visual context results in more interpretable embeddings compared to existing approaches.
Tasks
Published 2017-09-22
URL http://arxiv.org/abs/1710.01691v3
PDF http://arxiv.org/pdf/1710.01691v3.pdf
PWC https://paperswithcode.com/paper/context-embedding-networks
Repo
Framework

Learning Pain from Action Unit Combinations: A Weakly Supervised Approach via Multiple Instance Learning

Title Learning Pain from Action Unit Combinations: A Weakly Supervised Approach via Multiple Instance Learning
Authors Zhanli Chen, Rashid Ansari, Diana J. Wilkie
Abstract Patient pain can be detected highly reliably from facial expressions using a set of facial muscle-based action units (AUs) defined by the Facial Action Coding System (FACS). A key characteristic of facial expression of pain is the simultaneous occurrence of pain-related AU combinations, whose automated detection would be highly beneficial for efficient and practical pain monitoring. Existing general Automated Facial Expression Recognition (AFER) systems prove inadequate when applied specifically for detecting pain, as they either focus on detecting individual pain-related AUs but not on combinations, or they seek to bypass AU detection by training a binary pain classifier directly on pain intensity data but are limited by a lack of enough labeled data for satisfactory training. In this paper, we propose a new approach that mimics the strategy of human coders of decoupling pain detection into two consecutive tasks: one performed at the individual video-frame level and the other at the video-sequence level. Using state-of-the-art AFER tools to detect single AUs at the frame level, we propose two novel data structures to encode AU combinations from single AU scores. Two weakly supervised learning frameworks, namely multiple instance learning (MIL) and multiple clustered instance learning (MCIL), are employed corresponding to each data structure to learn pain from video sequences. Experimental results show an 87% pain recognition accuracy with 0.94 AUC (Area Under Curve) on the UNBC-McMaster Shoulder Pain Expression dataset. Tests on long videos in a lung cancer patient video dataset demonstrate the potential value of the proposed system for pain monitoring in clinical settings.
Tasks Facial Expression Recognition, Multiple Instance Learning
Published 2017-12-05
URL http://arxiv.org/abs/1712.01496v2
PDF http://arxiv.org/pdf/1712.01496v2.pdf
PWC https://paperswithcode.com/paper/learning-pain-from-action-unit-combinations-a
Repo
Framework

A Generative Restricted Boltzmann Machine Based Method for High-Dimensional Motion Data Modeling

Title A Generative Restricted Boltzmann Machine Based Method for High-Dimensional Motion Data Modeling
Authors Siqi Nie, Ziheng Wang, Qiang Ji
Abstract Many computer vision applications involve modeling complex spatio-temporal patterns in high-dimensional motion data. Recently, restricted Boltzmann machines (RBMs) have been widely used to capture and represent spatial patterns in a single image or temporal patterns in several time slices. To model global dynamics and local spatial interactions, we propose to theoretically extend the conventional RBMs by introducing another term in the energy function to explicitly model the local spatial interactions in the input data. A learning method is then proposed to perform efficient learning for the proposed model. We further introduce a new method for multi-class classification that can effectively estimate the infeasible partition functions of different RBMs such that the RBM is treated as a generative model for classification purposes. The improved RBM model is evaluated on two computer vision applications: facial expression recognition and human action recognition. Experimental results on benchmark databases demonstrate the effectiveness of the proposed algorithm.
Tasks Facial Expression Recognition, Temporal Action Localization
Published 2017-10-21
URL http://arxiv.org/abs/1710.07831v1
PDF http://arxiv.org/pdf/1710.07831v1.pdf
PWC https://paperswithcode.com/paper/a-generative-restricted-boltzmann-machine
Repo
Framework

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

Title Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
Authors Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney
Abstract While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively-slow convergence, sensitivity to the settings of hyper-parameters such as learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute in highly non-convex settings such as those arising in neural networks. Motivated by this, there has been recent interest in second-order methods that aim to alleviate these shortcomings by capturing curvature information. In this paper, we report detailed empirical evaluations of a class of Newton-type methods, namely sub-sampled variants of trust region (TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex ML problems. In doing so, we demonstrate that these methods not only can be computationally competitive with hand-tuned SGD with momentum, obtaining comparable or better generalization performance, but also they are highly robust to hyper-parameter settings. Further, in contrast to SGD with momentum, we show that the manner in which these Newton-type methods employ curvature information allows them to seamlessly escape flat regions and saddle points.
Tasks
Published 2017-08-25
URL http://arxiv.org/abs/1708.07827v2
PDF http://arxiv.org/pdf/1708.07827v2.pdf
PWC https://paperswithcode.com/paper/second-order-optimization-for-non-convex
Repo
Framework

Unsupervised feature learning with discriminative encoder

Title Unsupervised feature learning with discriminative encoder
Authors Gaurav Pandey, Ambedkar Dukkipati
Abstract In recent years, deep discriminative models have achieved extraordinary performance on supervised learning tasks, significantly outperforming their generative counterparts. However, their success relies on the presence of a large amount of labeled data. How can one use the same discriminative models for learning useful features in the absence of labels? We address this question in this paper, by jointly modeling the distribution of data and latent features in a manner that explicitly assigns zero probability to unobserved data. Rather than maximizing the marginal probability of observed data, we maximize the joint probability of the data and the latent features using a two step EM-like procedure. To prevent the model from overfitting to our initial selection of latent features, we use adversarial regularization. Depending on the task, we allow the latent features to be one-hot or real-valued vectors and define a suitable prior on the features. For instance, one-hot features correspond to class labels and are directly used for the unsupervised and semi-supervised classification task, whereas real-valued feature vectors are fed as input to simple classifiers for auxiliary supervised discrimination tasks. The proposed model, which we dub discriminative encoder (or DisCoder), is flexible in the type of latent features that it can capture. The proposed model achieves state-of-the-art performance on several challenging tasks.
Tasks
Published 2017-09-03
URL http://arxiv.org/abs/1709.00672v1
PDF http://arxiv.org/pdf/1709.00672v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-feature-learning-with
Repo
Framework

Flexible Computing Services for Comparisons and Analyses of Classical Chinese Poetry

Title Flexible Computing Services for Comparisons and Analyses of Classical Chinese Poetry
Authors Chao-Lin Liu
Abstract We collect nine corpora of representative Chinese poetry spanning 1046 BCE to 1644 CE for studying the history of Chinese words, collocations, and patterns. By flexibly integrating our own tools, we are able to provide new perspectives for approaching our goals. We illustrate the ideas with two examples. The first example shows a new way to compare word preferences of poets, and the second example demonstrates how we can utilize our corpora in historical studies of Chinese words. We show the viability of the tools for academic research, and we hope they will also help enrich existing Chinese dictionaries.
Tasks
Published 2017-09-18
URL http://arxiv.org/abs/1709.05729v1
PDF http://arxiv.org/pdf/1709.05729v1.pdf
PWC https://paperswithcode.com/paper/flexible-computing-services-for-comparisons
Repo
Framework

An empirical study on the effectiveness of images in Multimodal Neural Machine Translation

Title An empirical study on the effectiveness of images in Multimodal Neural Machine Translation
Authors Jean-Benoit Delbrouck, Stéphane Dupont
Abstract In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation. At every step, the decoder uses this mechanism to focus on different parts of the source sentence to gather the most useful information before outputting its target word. Recently, the effectiveness of the attention mechanism has also been explored for multimodal tasks, where it becomes possible to focus both on sentence parts and image regions that they describe. In this paper, we compare several attention mechanisms on the multimodal translation task (English, image to German) and evaluate the ability of the model to make use of images to improve translation. We surpass state-of-the-art scores on the Multi30k data set, but we nevertheless identify and report various misbehaviors of the machine while translating.
Tasks Machine Translation
Published 2017-07-04
URL http://arxiv.org/abs/1707.00995v1
PDF http://arxiv.org/pdf/1707.00995v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-study-on-the-effectiveness-of
Repo
Framework

Constrained Deep Transfer Feature Learning and its Applications

Title Constrained Deep Transfer Feature Learning and its Applications
Authors Yue Wu, Qiang Ji
Abstract Feature learning with deep models has achieved impressive results for both data representation and classification for various vision tasks. Deep feature learning, however, typically requires a large amount of training data, which may not be feasible for some application domains. Transfer learning can be one of the approaches to alleviate this problem by transferring data from a data-rich source domain to a data-scarce target domain. Existing transfer learning methods typically perform one-shot transfer learning and often ignore the specific properties that the transferred data must satisfy. To address these issues, we introduce a constrained deep transfer feature learning method that performs transfer learning and feature learning simultaneously, carrying out transfer learning iteratively in a progressively improving feature space in order to better narrow the gap between the target domain and the source domain for effective transfer of the data from the source domain to the target domain. Furthermore, we propose to exploit the target domain knowledge and incorporate such prior knowledge as a constraint during transfer learning to ensure that the transferred data satisfies certain properties of the target domain. To demonstrate the effectiveness of the proposed constrained deep transfer feature learning method, we apply it to thermal feature learning for eye detection by transferring from the visible domain. We also apply the proposed method to cross-view facial expression recognition as a second application. The experimental results demonstrate the effectiveness of the proposed method for both applications.
Tasks Facial Expression Recognition, Transfer Learning
Published 2017-09-23
URL http://arxiv.org/abs/1709.08128v1
PDF http://arxiv.org/pdf/1709.08128v1.pdf
PWC https://paperswithcode.com/paper/constrained-deep-transfer-feature-learning
Repo
Framework