July 27, 2019

Paper Group ANR 636

Stratified Transfer Learning for Cross-domain Activity Recognition

Title Stratified Transfer Learning for Cross-domain Activity Recognition
Authors Jindong Wang, Yiqiang Chen, Lisha Hu, Xiaohui Peng, Philip S. Yu
Abstract In activity recognition, it is often expensive and time-consuming to acquire sufficient activity labels. To solve this problem, transfer learning leverages the labeled samples from the source domain to annotate the target domain, which has few or no labels. Existing approaches typically consider learning a global domain shift while ignoring the intra-affinity between classes, which will hinder the performance of the algorithms. In this paper, we propose a novel and general cross-domain learning framework that can exploit the intra-affinity of classes to perform intra-class knowledge transfer. The proposed framework, referred to as Stratified Transfer Learning (STL), can dramatically improve the classification accuracy for cross-domain activity recognition. Specifically, STL first obtains pseudo labels for the target domain via a majority voting technique. Then, it performs intra-class knowledge transfer iteratively to transform both domains into the same subspaces. Finally, the labels of the target domain are obtained via a second annotation. To evaluate the performance of STL, we conduct comprehensive experiments on three large public activity recognition datasets (i.e., OPPORTUNITY, PAMAP2, and UCI DSADS), which demonstrate that STL significantly outperforms other state-of-the-art methods w.r.t. classification accuracy (improvement of 7.68%). Furthermore, we extensively investigate the performance of STL across different degrees of similarity and activity levels between domains. We also discuss the potential of STL in other pervasive computing applications to provide empirical experience for future research.
Tasks Activity Recognition, Cross-Domain Activity Recognition, Transfer Learning
Published 2017-12-25
URL http://arxiv.org/abs/1801.00820v1
PDF http://arxiv.org/pdf/1801.00820v1.pdf
PWC https://paperswithcode.com/paper/stratified-transfer-learning-for-cross-domain
Repo
Framework
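
The first STL step described in the abstract, obtaining pseudo labels by majority voting, can be illustrated with a small sketch. The abstract does not say which classifiers vote or how ties are handled, so the three vote lists below, the `majority_vote` helper, and the rule of leaving a sample unlabeled without a strict majority are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one pseudo label per sample.

    `predictions` is a list of equally long label lists, one per classifier.
    A sample gets a label only if a strict majority of classifiers agree
    (an assumed tie-handling rule); otherwise it is left as None.
    """
    n_classifiers = len(predictions)
    pseudo = []
    for votes in zip(*predictions):
        label, count = Counter(votes).most_common(1)[0]
        pseudo.append(label if count > n_classifiers / 2 else None)
    return pseudo


# Hypothetical predictions of three classifiers on four target-domain samples.
votes = [
    ["walk", "run", "sit", "walk"],
    ["walk", "run", "run", "sit"],
    ["walk", "sit", "sit", "run"],
]
pseudo_labels = majority_vote(votes)
print(pseudo_labels)
```

Samples without a confident pseudo label (here the last one, where all three classifiers disagree) would be the natural candidates for the later annotation step.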

Bib2vec: An Embedding-based Search System for Bibliographic Information

Title Bib2vec: An Embedding-based Search System for Bibliographic Information
Authors Takuma Yoneda, Koki Mori, Makoto Miwa, Yutaka Sasaki
Abstract We propose a novel embedding model that represents relationships among several elements in bibliographic information with high representation ability and flexibility. Based on this model, we present a novel search system that shows the relationships among the elements in the ACL Anthology Reference Corpus. The evaluation results show that our model can achieve a high prediction ability and produce reasonable search results.
Tasks
Published 2017-06-16
URL http://arxiv.org/abs/1706.05122v3
PDF http://arxiv.org/pdf/1706.05122v3.pdf
PWC https://paperswithcode.com/paper/bib2vec-an-embedding-based-search-system-for
Repo
Framework

Compressing Recurrent Neural Network with Tensor Train

Title Compressing Recurrent Neural Network with Tensor Train
Authors Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract Recurrent Neural Networks (RNNs) are a popular choice for modeling temporal and sequential tasks and achieve state-of-the-art performance on various complex problems. However, most state-of-the-art RNNs have millions of parameters and require many computational resources for training and for predicting new data. This paper proposes an alternative RNN model that reduces the number of parameters significantly by representing the weight parameters in Tensor Train (TT) format. We implement the TT-format representation for several RNN architectures, such as the simple RNN and the Gated Recurrent Unit (GRU). We compare and evaluate our proposed RNN model against an uncompressed RNN model on sequence classification and sequence prediction tasks. Our proposed RNNs with TT-format are able to preserve the performance while reducing the number of RNN parameters significantly, by up to 40 times.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08052v1
PDF http://arxiv.org/pdf/1705.08052v1.pdf
PWC https://paperswithcode.com/paper/compressing-recurrent-neural-network-with
Repo
Framework
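
To make the source of the parameter savings concrete, the sketch below counts the entries of a dense weight matrix and of a TT-matrix factorization of the same shape. It uses the usual TT-matrix layout, in which core k has shape r_{k-1} x m_k x n_k x r_k with boundary ranks equal to 1. The mode sizes and ranks are hypothetical and are not taken from the paper, so the resulting ratio is not comparable to the 40x figure in the abstract.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Entries of a dense matrix with prod(in_modes) rows and prod(out_modes) columns."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Entries of a TT-matrix: core k has shape ranks[k] x m_k x n_k x ranks[k+1]."""
    assert len(ranks) == len(in_modes) + 1 and ranks[0] == ranks[-1] == 1
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical 1024 x 1024 weight matrix factorized into four cores.
in_modes = [4, 8, 8, 4]
out_modes = [4, 8, 8, 4]
ranks = [1, 4, 4, 4, 1]

dense = dense_params(in_modes, out_modes)
tt = tt_params(in_modes, out_modes, ranks)
print(dense, tt, dense / tt)
```

The TT count grows with the sum of the core sizes rather than with the product of the mode sizes, which is why small ranks give large savings on wide layers.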

A Comparative Study of the Clinical use of Motion Analysis from Kinect Skeleton Data

Title A Comparative Study of the Clinical use of Motion Analysis from Kinect Skeleton Data
Authors Sean Maudsley-Barton, Jamie McPheey, Anthony Bukowski, Daniel Leightley, Moi Hoon Yap
Abstract The analysis of human motion as a clinical tool can bring many benefits, such as the early detection of disease and the monitoring of recovery, in turn helping people to lead independent lives. However, it is currently underused. Developments in depth cameras, such as Kinect, have opened up the use of motion analysis in settings such as GP surgeries, care homes and private homes. To provide an insight into the use of Kinect in the healthcare domain, we present a review of the current state of the art. We then propose a method that can represent human motions from time-series data of arbitrary length as a single vector. Finally, we demonstrate the utility of this method by extracting a set of clinically significant features and using them to detect the age-related changes in the motions of a set of 54 individuals with a high degree of certainty (F1-score between 0.9 and 1.0), indicating its potential application in the detection of a range of age-related motion impairments.
Tasks Time Series
Published 2017-07-27
URL http://arxiv.org/abs/1707.08813v2
PDF http://arxiv.org/pdf/1707.08813v2.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-of-the-clinical-use-of
Repo
Framework

A Tale of Two Animats: What does it take to have goals?

Title A Tale of Two Animats: What does it take to have goals?
Authors Larissa Albantakis
Abstract What does it take for a system, biological or not, to have goals? Here, this question is approached in the context of in silico artificial evolution. By examining the informational and causal properties of artificial organisms (‘animats’) controlled by small, adaptive neural networks (Markov Brains), this essay discusses necessary requirements for intrinsic information, autonomy, and meaning. The focus lies on comparing two types of Markov Brains that evolved in the same simple environment: one with purely feedforward connections between its elements, the other with an integrated set of elements that causally constrain each other. While both types of brains ‘process’ information about their environment and are equally fit, only the integrated one forms a causally autonomous entity above a background of external influences. This suggests that to assess whether goals are meaningful for a system itself, it is important to understand what the system is, rather than what it does.
Tasks
Published 2017-05-30
URL http://arxiv.org/abs/1705.10854v1
PDF http://arxiv.org/pdf/1705.10854v1.pdf
PWC https://paperswithcode.com/paper/a-tale-of-two-animats-what-does-it-take-to
Repo
Framework

Finding Statistically Significant Interactions between Continuous Features

Title Finding Statistically Significant Interactions between Continuous Features
Authors Mahito Sugiyama, Karsten Borgwardt
Abstract The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as Genetics or Healthcare, but the combinatorial explosion of the candidate space makes this problem extremely challenging in terms of computational efficiency and proper correction for multiple testing. While recent progress has been made regarding this challenge for binary features, we here present the first solution for continuous features. We propose an algorithm which overcomes the combinatorial explosion of the search space of higher-order interactions by deriving a lower bound on the p-value for each interaction, which enables us to massively prune interactions that can never reach significance and to thereby gain more statistical power. In our experiments, our approach efficiently detects all significant interactions in a variety of synthetic and real-world datasets.
Tasks
Published 2017-02-28
URL https://arxiv.org/abs/1702.08694v3
PDF https://arxiv.org/pdf/1702.08694v3.pdf
PWC https://paperswithcode.com/paper/finding-significant-combinations-of
Repo
Framework

Deep Multimodal Representation Learning from Temporal Data

Title Deep Multimodal Representation Learning from Temporal Data
Authors Xitong Yang, Palghat Ramesh, Radha Chitta, Sriganesh Madhvanath, Edgar A. Bernal, Jiebo Luo
Abstract In recent years, Deep Learning has been successfully applied to multimodal learning problems, with the aim of learning useful joint representations in data fusion applications. When the available modalities consist of time series data such as video, audio and sensor signals, it becomes imperative to consider their temporal structure during the fusion process. In this paper, we propose the Correlational Recurrent Neural Network (CorrRNN), a novel temporal fusion model for fusing multiple input modalities that are inherently temporal in nature. Key features of our proposed model include: (i) simultaneous learning of the joint representation and temporal dependencies between modalities, (ii) use of multiple loss terms in the objective function, including a maximum correlation loss term to enhance learning of cross-modal information, and (iii) the use of an attention model to dynamically adjust the contribution of different input modalities to the joint representation. We validate our model via experimentation on two different tasks: video- and sensor-based activity classification, and audio-visual speech recognition. We empirically analyze the contributions of different components of the proposed CorrRNN model, and demonstrate its robustness, effectiveness and state-of-the-art performance on multiple datasets.
Tasks Audio-Visual Speech Recognition, Representation Learning, Speech Recognition, Time Series, Visual Speech Recognition
Published 2017-04-11
URL http://arxiv.org/abs/1704.03152v1
PDF http://arxiv.org/pdf/1704.03152v1.pdf
PWC https://paperswithcode.com/paper/deep-multimodal-representation-learning-from
Repo
Framework
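
The abstract mentions a maximum correlation loss term but does not give its form. One simple way to express the idea, shown here purely as an assumed illustration, is the negative Pearson correlation between two one-dimensional representations, so that minimizing the loss pushes the two modalities toward correlated values. The names `pearson`, `correlation_loss`, `audio` and `video` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def correlation_loss(rep_a, rep_b):
    """Negative correlation: lower is better, -1 means perfectly correlated."""
    return -pearson(rep_a, rep_b)


# Toy representations of two modalities over four time steps.
audio = [0.1, 0.4, 0.35, 0.8]
video = [2 * x + 1 for x in audio]  # an exact linear function of `audio`
loss = correlation_loss(audio, video)
print(loss)
```

In a trained model this term would be one of several loss terms and would be computed on learned, multi-dimensional representations rather than on raw scalars.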

Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing

Title Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Authors Prashant Singh, Ekta Vats, Anders Hast
Abstract Computation of document image quality metrics often depends upon the availability of a ground truth image corresponding to the document. This limits the applicability of quality metrics in applications such as hyperparameter optimization of image processing algorithms that operate on-the-fly on unseen documents. This work proposes the use of surrogate models to learn the behavior of a given document quality metric on existing datasets where ground truth images are available. The trained surrogate model can later be used to predict the metric value on previously unseen document images without requiring access to ground truth images. The surrogate model is empirically evaluated on the Document Image Binarization Competition (DIBCO) and the Handwritten Document Image Binarization Competition (H-DIBCO) datasets.
Tasks Hyperparameter Optimization
Published 2017-12-11
URL http://arxiv.org/abs/1712.03738v1
PDF http://arxiv.org/pdf/1712.03738v1.pdf
PWC https://paperswithcode.com/paper/learning-surrogate-models-of-document-image
Repo
Framework

Vertebral body segmentation with GrowCut: Initial experience, workflow and practical application

Title Vertebral body segmentation with GrowCut: Initial experience, workflow and practical application
Authors Jan Egger, Christopher Nimsky, Xiaojun Chen
Abstract In this contribution, we used the GrowCut segmentation algorithm publicly available in 3D Slicer for three-dimensional segmentation of vertebral bodies. To the best of our knowledge, this is the first time that the GrowCut method has been studied for vertebral body segmentation. In brief, we found that the GrowCut segmentation times were consistently shorter than the manual segmentation times. Hence, GrowCut provides an alternative to a manual slice-by-slice segmentation process.
Tasks
Published 2017-11-13
URL http://arxiv.org/abs/1711.04592v1
PDF http://arxiv.org/pdf/1711.04592v1.pdf
PWC https://paperswithcode.com/paper/vertebral-body-segmentation-with-growcut
Repo
Framework

Learning Infinite RBMs with Frank-Wolfe

Title Learning Infinite RBMs with Frank-Wolfe
Authors Wei Ping, Qiang Liu, Alexander Ihler
Abstract In this work, we propose an infinite restricted Boltzmann machine (RBM), whose maximum likelihood estimation (MLE) corresponds to a constrained convex optimization. We consider the Frank-Wolfe algorithm to solve the program, which provides a sparse solution that can be interpreted as inserting a hidden unit at each iteration, so that the optimization process takes the form of a sequence of finite models of increasing complexity. As a side benefit, this can be used to easily and efficiently identify an appropriate number of hidden units during the optimization. The resulting model can also be used as an initialization for typical state-of-the-art RBM training algorithms such as contrastive divergence, leading to models with consistently higher test likelihood than random initialization.
Tasks
Published 2017-10-15
URL http://arxiv.org/abs/1710.05270v1
PDF http://arxiv.org/pdf/1710.05270v1.pdf
PWC https://paperswithcode.com/paper/learning-infinite-rbms-with-frank-wolfe
Repo
Framework
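
The sparsity property mentioned in the abstract (each Frank-Wolfe iteration adds at most one new component) can be seen on a toy problem. The sketch below runs generic Frank-Wolfe on a quadratic objective over the probability simplex. It is not the RBM likelihood from the paper; the objective, the `2 / (t + 2)` step size, and all names are assumptions chosen to keep the example self-contained.

```python
def frank_wolfe_simplex(target, x0, n_steps):
    """Minimize 0.5 * ||x - target||^2 over the probability simplex."""
    x = list(x0)
    for t in range(n_steps):
        grad = [xi - bi for xi, bi in zip(x, target)]
        # Linear minimization over the simplex picks a single vertex e_i.
        i = min(range(len(x)), key=grad.__getitem__)
        gamma = 2.0 / (t + 2.0)
        # Convex combination of the current iterate and the chosen vertex.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i] += gamma
    return x


def objective(x, target):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, target))


target = [0.5, 0.3, 0.2] + [0.0] * 7
start = [0.0] * 9 + [1.0]  # a vertex of the simplex

x3 = frank_wolfe_simplex(target, start, 3)
x50 = frank_wolfe_simplex(target, start, 50)
print(sum(1 for v in x3 if v > 0), objective(x50, target))
```

Because every update mixes in exactly one vertex, the iterate after t steps has at most t + 1 nonzero coordinates, which is the toy analogue of adding one hidden unit per iteration.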

Automatic Face Image Quality Prediction

Title Automatic Face Image Quality Prediction
Authors Lacey Best-Rowden, Anil K. Jain
Abstract Face image quality can be defined as a measure of the utility of a face image to automatic face recognition. In this work, we propose (and compare) two methods for automatic face image quality based on target face quality values from (i) human assessments of face image quality (matcher-independent), and (ii) quality values computed from similarity scores (matcher-dependent). A support vector regression model trained on face features extracted using a deep convolutional neural network (ConvNet) is used to predict the quality of a face image. The proposed methods are evaluated on two unconstrained face image databases, LFW and IJB-A, which both contain facial variations with multiple quality factors. Evaluation of the proposed automatic face image quality measures shows we are able to reduce the FNMR at 1% FMR by at least 13% for two face matchers (a COTS matcher and a ConvNet matcher) by using the proposed face quality to select subsets of face images and video frames for matching templates (i.e., multiple faces per subject) in the IJB-A protocol. To our knowledge, this is the first work to utilize human assessments of face image quality in designing a predictor of unconstrained face quality that is shown to be effective in cross-database evaluation.
Tasks Face Recognition
Published 2017-06-29
URL http://arxiv.org/abs/1706.09887v1
PDF http://arxiv.org/pdf/1706.09887v1.pdf
PWC https://paperswithcode.com/paper/automatic-face-image-quality-prediction
Repo
Framework
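
The evaluation metric quoted in the abstract, FNMR at a fixed FMR, can be computed from two lists of similarity scores. In the sketch below, FMR is the fraction of impostor comparisons scoring at or above the threshold and FNMR is the fraction of genuine comparisons scoring below it. The scores are made up, a 10% target is used instead of 1% so that ten impostor scores suffice, and the threshold search over observed scores is one simple assumed procedure.

```python
def fmr(impostor, threshold):
    """False match rate: impostor scores accepted at this threshold."""
    return sum(s >= threshold for s in impostor) / len(impostor)


def fnmr(genuine, threshold):
    """False non-match rate: genuine scores rejected at this threshold."""
    return sum(s < threshold for s in genuine) / len(genuine)


def fnmr_at_fmr(genuine, impostor, target_fmr):
    """Lowest observed threshold whose FMR meets the target, and its FNMR."""
    for threshold in sorted(set(genuine) | set(impostor)):
        # FMR is non-increasing in the threshold, so the first hit is the lowest.
        if fmr(impostor, threshold) <= target_fmr:
            return threshold, fnmr(genuine, threshold)
    return float("inf"), 1.0


# Hypothetical similarity scores.
impostor = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7]
genuine = [0.5, 0.65, 0.8, 0.9, 0.95]

threshold, rate = fnmr_at_fmr(genuine, impostor, target_fmr=0.1)
print(threshold, rate)
```

Filtering low-quality images before matching, as the abstract describes, would change the score lists and therefore the FNMR read off at the same FMR target.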

Automated Body Structure Extraction from Arbitrary 3D Mesh

Title Automated Body Structure Extraction from Arbitrary 3D Mesh
Authors Yong Khoo, Sang Chung
Abstract This paper presents an automated method for 3D character skeleton extraction that can be applied to generic 3D shapes. Our work is motivated by skeleton-based prior work on automatic rigging, focuses on skeleton extraction, and can automatically align the extracted structure to fit the 3D shape of the given 3D mesh. The body mesh can subsequently be skinned based on the extracted skeleton, which enables the rigging process. In the experiment, we use a public dataset to drive the estimated skeleton from different body shapes, as well as real data obtained from 3D scanning systems. Satisfactory results are obtained compared to the existing approaches.
Tasks
Published 2017-05-16
URL http://arxiv.org/abs/1705.05508v1
PDF http://arxiv.org/pdf/1705.05508v1.pdf
PWC https://paperswithcode.com/paper/automated-body-structure-extraction-from
Repo
Framework

Discussion quality diffuses in the digital public square

Title Discussion quality diffuses in the digital public square
Authors George Berry, Sean J. Taylor
Abstract Studies of online social influence have demonstrated that friends have important effects on many types of behavior in a wide variety of settings. However, we know much less about how influence works among relative strangers in digital public squares, despite important conversations happening in such spaces. We present the results of a study on large public Facebook pages where we randomly used two different methods (most recent and social feedback) to order comments on posts. We find that the social feedback condition results in higher-quality viewed comments and response comments. After measuring the average quality of comments written by users before the study, we find that social feedback has a positive effect on response quality for both low- and high-quality commenters. We draw on a theoretical framework of social norms to explain this empirical result. In order to examine the influence mechanism further, we measure the similarity between comments viewed and written during the study, finding that similarity increases for the highest-quality contributors under the social feedback condition. This suggests that, in addition to norms, some individuals may respond with increased relevance to high-quality comments.
Tasks
Published 2017-02-22
URL http://arxiv.org/abs/1702.06677v1
PDF http://arxiv.org/pdf/1702.06677v1.pdf
PWC https://paperswithcode.com/paper/discussion-quality-diffuses-in-the-digital
Repo
Framework

Frustratingly Short Attention Spans in Neural Language Modeling

Title Frustratingly Short Attention Spans in Neural Language Modeling
Authors Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
Abstract Neural language models predict the next token using a latent representation of the immediate token history. Recently, various methods for augmenting neural language models with an attention mechanism over a differentiable memory have been proposed. For predicting the next token, these models query information from a memory of the recent history which can facilitate learning mid- and long-range dependencies. However, conventional attention mechanisms used in memory-augmented neural language models produce a single output vector per time step. This vector is used both for predicting the next token as well as for the key and value of a differentiable memory of a token history. In this paper, we propose a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution. This model outperforms existing memory-augmented neural language models on two corpora. Yet, we found that our method mainly utilizes a memory of the five most recent output representations. This led to the unexpected main finding that a much simpler model based only on the concatenation of recent output representations from previous time steps is on par with more sophisticated memory-augmented neural language models.
Tasks Language Modelling
Published 2017-02-15
URL http://arxiv.org/abs/1702.04521v1
PDF http://arxiv.org/pdf/1702.04521v1.pdf
PWC https://paperswithcode.com/paper/frustratingly-short-attention-spans-in-neural
Repo
Framework
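
The idea of giving each output vector separate key, value and prediction parts, and attending only over a short window of recent outputs, can be sketched as follows. The abstract does not specify the scoring function, so plain dot-product scoring is used here as a simplifying assumption. The equal three-way split, the window of five, and all names are likewise hypothetical.

```python
import math


def split_kvp(h):
    """Split one output vector into equal key, value and predict parts."""
    n = len(h) // 3
    return h[:n], h[n:2 * n], h[2 * n:]


def attend(history, current, window=5):
    """Attend over the keys of the last `window` outputs using the current key."""
    key_t, _, predict_t = split_kvp(current)
    recent = history[-window:]
    scores = [
        sum(a * b for a, b in zip(split_kvp(h)[0], key_t)) for h in recent
    ]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    values = [split_kvp(h)[1] for h in recent]
    context = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, context, predict_t


# Seven past output vectors of size 6 (key, value and predict parts of size 2).
history = [[0.1 * (t + i) for i in range(6)] for t in range(7)]
current = [0.5, -0.2, 0.3, 0.1, 0.9, 0.4]

weights, context, predict_part = attend(history, current)
print(weights, context, predict_part)
```

The simpler alternative highlighted in the abstract's main finding would skip the attention weights entirely and concatenate the recent output representations.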

Semantic Foggy Scene Understanding with Synthetic Data

Title Semantic Foggy Scene Understanding with Synthetic Data
Authors Christos Sakaridis, Dengxin Dai, Luc Van Gool
Abstract This work addresses the problem of semantic foggy scene understanding (SFSU). Although extensive research has been performed on image dehazing and on semantic scene understanding with clear-weather images, little attention has been paid to SFSU. Due to the difficulty of collecting and annotating foggy images, we choose to generate synthetic fog on real images that depict clear-weather outdoor scenes, and then leverage these partially synthetic data for SFSU by employing state-of-the-art convolutional neural networks (CNN). In particular, a complete pipeline to add synthetic fog to real, clear-weather images using incomplete depth information is developed. We apply our fog synthesis on the Cityscapes dataset and generate Foggy Cityscapes with 20550 images. SFSU is tackled in two ways: 1) with typical supervised learning, and 2) with a novel type of semi-supervised learning, which combines 1) with an unsupervised supervision transfer from clear-weather images to their synthetic foggy counterparts. In addition, we carefully study the usefulness of image dehazing for SFSU. For evaluation, we present Foggy Driving, a dataset with 101 real-world images depicting foggy driving scenes, which come with ground truth annotations for semantic segmentation and object detection. Extensive experiments show that 1) supervised learning with our synthetic data significantly improves the performance of state-of-the-art CNN for SFSU on Foggy Driving; 2) our semi-supervised learning strategy further improves performance; and 3) image dehazing marginally advances SFSU with our learning strategy. The datasets, models and code are made publicly available.
Tasks Image Dehazing, Object Detection, Scene Understanding, Semantic Segmentation
Published 2017-08-25
URL https://arxiv.org/abs/1708.07819v3
PDF https://arxiv.org/pdf/1708.07819v3.pdf
PWC https://paperswithcode.com/paper/semantic-foggy-scene-understanding-with
Repo
Framework
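
The abstract does not state the fog model used in the pipeline. A common choice in the dehazing literature, assumed here for illustration, blends each pixel with a constant atmospheric light according to a transmittance that decays exponentially with depth: I = R * t + L * (1 - t), with t = exp(-beta * d). The `add_fog` helper and the values of `beta` and `atmospheric_light` are hypothetical.

```python
import math


def add_fog(pixel, depth_m, beta=0.01, atmospheric_light=1.0):
    """Apply the assumed fog model to one RGB pixel with values in [0, 1].

    `depth_m` is the distance of the scene point from the camera in meters,
    and `beta` is the attenuation coefficient (larger means denser fog).
    """
    transmittance = math.exp(-beta * depth_m)
    return tuple(
        c * transmittance + atmospheric_light * (1.0 - transmittance)
        for c in pixel
    )


clear = (0.2, 0.4, 0.6)
near = add_fog(clear, depth_m=10.0)
far = add_fog(clear, depth_m=200.0)
print(near, far)
```

Under this model, distant pixels drift toward the atmospheric light while nearby pixels stay close to their clear-weather color, which is why a depth estimate is needed for each pixel.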