May 5, 2019

Paper Group ANR 515

Motifs in Temporal Networks

Title Motifs in Temporal Networks
Authors Ashwin Paranjape, Austin R. Benson, Jure Leskovec
Abstract Networks are a fundamental tool for modeling complex systems in a variety of domains including social and communication networks as well as biology and neuroscience. Small subgraph patterns in networks, called network motifs, are crucial to understanding the structure and function of these systems. However, the role of network motifs in temporal networks, which contain many timestamped links between the nodes, is not yet well understood. Here we develop a notion of a temporal network motif as an elementary unit of temporal networks and provide a general methodology for counting such motifs. We define temporal network motifs as induced subgraphs on sequences of temporal edges, design fast algorithms for counting temporal motifs, and prove their runtime complexity. Our fast algorithms achieve up to 56.5x speedup compared to a baseline method. Furthermore, we use our algorithms to count temporal motifs in a variety of networks. Results show that networks from different domains have significantly different motif counts, whereas networks from the same domain tend to have similar motif counts. We also find that different motifs occur at different time scales, which provides further insights into structure and function of temporal networks.
Tasks
Published 2016-12-29
URL http://arxiv.org/abs/1612.09259v1
PDF http://arxiv.org/pdf/1612.09259v1.pdf
PWC https://paperswithcode.com/paper/motifs-in-temporal-networks
Repo
Framework
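
To make the idea of counting temporal motifs concrete, here is a minimal Python sketch. It is not the authors' algorithm and does not cover their general motif definition; it assumes edges are given as hypothetical `(source, target, timestamp)` tuples and counts only the simplest two-node pattern, an edge u→v followed by a reply v→u within a time window `delta`. The function name `count_reply_motifs` is an illustrative choice.

```python
from bisect import bisect_right
from collections import defaultdict


def count_reply_motifs(edges, delta):
    """Count pairs (u->v at t1, v->u at t2) with t1 < t2 <= t1 + delta.

    `edges` is an iterable of (source, target, timestamp) tuples.
    Self-loops are ignored. This is a simple illustrative counter,
    not the fast algorithm described in the paper.
    """
    # Group timestamps by directed node pair and sort them once.
    times = defaultdict(list)
    for u, v, t in edges:
        if u != v:
            times[(u, v)].append(t)
    for stamps in times.values():
        stamps.sort()

    total = 0
    for (u, v), forward in times.items():
        backward = times.get((v, u))
        if not backward:
            continue
        for t1 in forward:
            # Replies strictly after t1 and no later than t1 + delta.
            first = bisect_right(backward, t1)
            last = bisect_right(backward, t1 + delta)
            total += last - first
    return total


edges = [("a", "b", 1), ("b", "a", 3), ("a", "b", 4), ("b", "a", 20)]
print(count_reply_motifs(edges, delta=5))
```

Sorting each pair's timestamps once and using binary search keeps the inner step logarithmic; widening `delta` can only increase the count, which is one way to see how motif counts depend on the time scale.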

Innovated scalable efficient estimation in ultra-large Gaussian graphical models

Title Innovated scalable efficient estimation in ultra-large Gaussian graphical models
Authors Yingying Fan, Jinchi Lv
Abstract Large-scale precision matrix estimation is of fundamental importance yet challenging in many contemporary applications for recovering Gaussian graphical models. In this paper, we suggest a new approach of innovated scalable efficient estimation (ISEE) for estimating a large precision matrix. Motivated by the innovated transformation, we convert the original problem into that of large covariance matrix estimation. The suggested method combines the strengths of recent advances in high-dimensional sparse modeling and large covariance matrix estimation. Compared to existing approaches, our method is scalable and can deal with much larger precision matrices with simple tuning. Under mild regularity conditions, we establish that this procedure can recover the underlying graphical structure with significant probability and provide efficient estimation of link strengths. Both computational and theoretical advantages of the procedure are evidenced through simulation and real data examples.
Tasks
Published 2016-05-11
URL http://arxiv.org/abs/1605.03313v1
PDF http://arxiv.org/pdf/1605.03313v1.pdf
PWC https://paperswithcode.com/paper/innovated-scalable-efficient-estimation-in
Repo
Framework

Alleviating Overfitting for Polysemous Words for Word Representation Estimation Using Lexicons

Title Alleviating Overfitting for Polysemous Words for Word Representation Estimation Using Lexicons
Authors Yuanzhi Ke, Masafumi Hagiwara
Abstract Though there are some works on improving distributed word representations using lexicons, the improper overfitting of words that have multiple meanings remains an issue that deteriorates learning when lexicons are used, and it needs to be solved. An alternative method is to allocate a vector per sense instead of a vector per word. However, the word representations estimated in the former way are not as easy to use as the latter one. Our previous work uses a probabilistic method to alleviate the overfitting, but it is not robust with a small corpus. In this paper, we propose a new neural network to estimate distributed word representations using a lexicon and a corpus. We add a lexicon layer in the continuous bag-of-words model and a threshold node after the output of the lexicon layer. The threshold rejects the unreliable outputs of the lexicon layer that are less likely to be the same as their inputs. In this way, it alleviates the overfitting of the polysemous words. The proposed neural network can be trained using negative sampling, which maximizes the log probabilities of target words given the context words by distinguishing the target words from random noise. We compare the proposed neural network with the continuous bag-of-words model, the other works improving it, and the previous works estimating distributed word representations using both a lexicon and a corpus. The experimental results show that the proposed neural network is more efficient and balanced for both semantic tasks and syntactic tasks than the previous works, and robust to the size of the corpus.
Tasks
Published 2016-12-02
URL http://arxiv.org/abs/1612.00584v2
PDF http://arxiv.org/pdf/1612.00584v2.pdf
PWC https://paperswithcode.com/paper/alleviating-overfitting-for-polysemous-words
Repo
Framework

An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

Title An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data
Authors Sijie Song, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jiaying Liu
Abstract Human action recognition is an important task in computer vision. Extracting discriminative spatial and temporal features to model the spatial and temporal evolutions of different actions plays a key role in accomplishing this task. In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data. We build our model on top of the Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM), which learns to selectively focus on discriminative joints of the skeleton within each frame of the inputs and pays different levels of attention to the outputs of different frames. Furthermore, to ensure effective training of the network, we propose a regularized cross-entropy loss to drive the model learning process and develop a joint training strategy accordingly. Experimental results demonstrate the effectiveness of the proposed model, both on the small human action recognition data set of SBU and the currently largest NTU dataset.
Tasks Temporal Action Localization
Published 2016-11-18
URL http://arxiv.org/abs/1611.06067v1
PDF http://arxiv.org/pdf/1611.06067v1.pdf
PWC https://paperswithcode.com/paper/an-end-to-end-spatio-temporal-attention-model
Repo
Framework

Evaluating the Performance of a Speech Recognition based System

Title Evaluating the Performance of a Speech Recognition based System
Authors Vinod Kumar Pandey, Sunil Kumar Kopparapu
Abstract Speech based solutions have taken center stage with growth in the services industry, where there is a need to cater to a very large number of people from all strata of society. While natural language speech interfaces are the talk of the research community, in practice menu based speech solutions thrive. Typically, in a menu based speech solution the user is required to respond by speaking from a closed set of words when prompted by the system. A sequence of human speech responses to the IVR prompts results in the completion of a transaction. A transaction is deemed successful if the speech solution can correctly recognize all the spoken utterances of the user whenever prompted by the system. The usual mechanism to evaluate the performance of a speech solution is to test the system extensively by putting it to use by actual people and then evaluating the performance by analyzing the logs for successful transactions. This kind of evaluation could lead to dissatisfied test users, especially if the performance of the system were to result in a poor transaction completion rate. To avoid this, the Wizard of Oz approach is adopted during evaluation of a speech system. Overall, this kind of evaluation is an expensive proposition in terms of both time and cost. In this paper, we propose a method to evaluate the performance of a speech solution without actually putting it to use by people. We first describe the methodology and then show experimentally that it can be used to identify the performance bottlenecks of the speech solution even before the system is actually used, thus saving evaluation time and expense.
Tasks Speech Recognition
Published 2016-01-11
URL http://arxiv.org/abs/1601.02543v1
PDF http://arxiv.org/pdf/1601.02543v1.pdf
PWC https://paperswithcode.com/paper/evaluating-the-performance-of-a-speech
Repo
Framework

Integrating Topic Models and Latent Factors for Recommendation

Title Integrating Topic Models and Latent Factors for Recommendation
Authors Danis J. Wilson, Wei Zhang
Abstract Research on personalized recommendation techniques today has mostly split into two mainstream directions, i.e., factorization-based approaches and topic models. In practice, they aim to benefit from numerical ratings and textual reviews, respectively, which constitute two major information sources in various real-world systems. However, although the two approaches are supposed to be correlated because they share the goal of accurate recommendation, a clear theoretical understanding is still lacking of how their objective functions can be mathematically bridged to leverage the numerical ratings and textual reviews collectively, and why such a bridge is intuitively reasonable to match up their learning procedures for the rating prediction and top-N recommendation tasks, respectively. In this work, we show with mathematical analysis that the vector-level randomization functions to coordinate the optimization objectives of factorizational and topic models unfortunately do not exist at all, although they are usually pre-assumed and intuitively designed in the literature. Fortunately, we also point out that one can avoid seeking such a randomization function by optimizing a Joint Factorizational Topic (JFT) model directly. We apply our JFT model to restaurant recommendation, and study its performance in both normal and cross-city recommendation scenarios, where the latter is an extremely difficult task because of its inherent cold-start nature. Experimental results on real-world datasets verified the appealing performance of our approach against previous methods, on both rating prediction and top-N recommendation tasks.
Tasks Topic Models
Published 2016-10-28
URL http://arxiv.org/abs/1610.09077v2
PDF http://arxiv.org/pdf/1610.09077v2.pdf
PWC https://paperswithcode.com/paper/integrating-topic-models-and-latent-factors
Repo
Framework

Fast Stability Scanning for Future Grid Scenario Analysis

Title Fast Stability Scanning for Future Grid Scenario Analysis
Authors Ruidong Liu, Gregor Verbic, Jin Ma
Abstract Future grid scenario analysis requires a major departure from conventional power system planning, where only a handful of the most critical conditions are typically analyzed. Capturing the inter-seasonal variations in renewable generation of a future grid scenario necessitates the use of computationally intensive time-series analysis. In this paper, we propose a planning framework for fast stability scanning of future grid scenarios using a novel feature selection algorithm and a novel self-adaptive PSO-k-means clustering algorithm. To achieve the computational speed-up, the stability analysis is performed only on a small number of representative cluster centroids instead of on the full set of operating conditions. As a case study, we perform small-signal stability and steady-state voltage stability scanning of a simplified model of the Australian National Electricity Market with significant penetration of renewable generation. The simulation results show the effectiveness of the proposed approach. Compared to an exhaustive time series scanning, the proposed framework reduced the computational burden by up to ten times, with an acceptable level of accuracy.
Tasks Feature Selection, Time Series, Time Series Analysis
Published 2016-12-14
URL http://arxiv.org/abs/1701.03436v1
PDF http://arxiv.org/pdf/1701.03436v1.pdf
PWC https://paperswithcode.com/paper/fast-stability-scanning-for-future-grid
Repo
Framework
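
The speed-up described in the abstract comes from analysing a few cluster centroids instead of every operating condition. The sketch below illustrates that reduction step only, using plain k-means (Lloyd's algorithm) in Python. It is an assumption-laden stand-in: the paper's self-adaptive PSO-k-means and its feature selection are not reproduced, the initialisation simply takes the first `k` points, and the two-dimensional sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Plain k-means; returns (centroids, clusters).

    Initial centroids are the first k points (a naive, deterministic
    choice made here for illustration only).
    """
    centroids = [list(map(float, p)) for p in points[:k]]
    clusters = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            [sum(coord) / len(members) for coord in zip(*members)]
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical operating conditions described by two features each.
conditions = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, clusters = kmeans(conditions, k=2)
print(centroids)
```

An expensive analysis would then be run once per centroid rather than once per operating condition, so the cost scales with `k` instead of with the length of the time series.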

A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning

Title A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning
Authors T. Nathan Mundhenk, Goran Konjevod, Wesam A. Sakla, Kofi Boakye
Abstract We have created a large diverse set of cars from overhead images, which are useful for training a deep learner to binary classify, detect and count them. The dataset and all related material will be made publicly available. The set contains contextual matter to aid in identification of difficult targets. We demonstrate classification and detection on this dataset using a neural network we call ResCeption. This network combines residual learning with Inception-style layers and is used to count cars in one look. This is a new way to count objects rather than by localization or density estimation. It is fairly accurate, fast and easy to implement. Additionally, the counting method is not car or scene specific. It would be easy to train this method to count other kinds of objects, and counting over new scenes requires no extra setup or assumptions about object locations.
Tasks Density Estimation
Published 2016-09-14
URL http://arxiv.org/abs/1609.04453v1
PDF http://arxiv.org/pdf/1609.04453v1.pdf
PWC https://paperswithcode.com/paper/a-large-contextual-dataset-for-classification
Repo
Framework

Modelling depth for nonparametric foreground segmentation using RGBD devices

Title Modelling depth for nonparametric foreground segmentation using RGBD devices
Authors Gabriel Moyà-Alcover, Ahmed Elgammal, Antoni Jaume-i-Capó, Javier Varona
Abstract The problem of detecting changes in a scene and segmenting the foreground from background is still challenging, despite previous work. Moreover, new RGBD capturing devices include depth cues, which could be incorporated to improve foreground segmentation. In this work, we present a new nonparametric approach where a unified model mixes the device's multiple information cues. In order to unify all the device channel cues, a new probabilistic depth data model is also proposed, where we show how to handle the inaccurate data to improve foreground segmentation. A new RGBD video dataset is presented in order to introduce a new standard for comparing this kind of algorithm. Results show that the proposed approach can handle several practical situations and obtain good results in all cases.
Tasks
Published 2016-09-29
URL http://arxiv.org/abs/1609.09240v1
PDF http://arxiv.org/pdf/1609.09240v1.pdf
PWC https://paperswithcode.com/paper/modelling-depth-for-nonparametric-foreground
Repo
Framework

ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks using Angle Sensitive Pixels

Title ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks using Angle Sensitive Pixels
Authors Huaijin Chen, Suren Jayasuriya, Jiyue Yang, Judy Stephen, Sriram Sivaramakrishnan, Ashok Veeraraghavan, Alyosha Molnar
Abstract Deep learning using convolutional neural networks (CNNs) is quickly becoming the state-of-the-art for challenging computer vision applications. However, deep learning’s power consumption and bandwidth requirements currently limit its application in embedded and mobile systems with tight energy budgets. In this paper, we explore the energy savings of optically computing the first layer of CNNs. To do so, we utilize bio-inspired Angle Sensitive Pixels (ASPs), custom CMOS diffractive image sensors which act similarly to Gabor filter banks in the V1 layer of the human visual cortex. ASPs replace both image sensing and the first layer of a conventional CNN by directly performing optical edge filtering, saving sensing energy, data bandwidth, and CNN FLOPS to compute. Our experimental results (both on synthetic data and a hardware prototype) for a variety of vision tasks such as digit recognition, object recognition, and face identification demonstrate the use of ASPs while achieving performance similar to traditional deep learning pipelines.
Tasks Face Identification, Object Recognition
Published 2016-05-11
URL http://arxiv.org/abs/1605.03621v3
PDF http://arxiv.org/pdf/1605.03621v3.pdf
PWC https://paperswithcode.com/paper/asp-vision-optically-computing-the-first
Repo
Framework

Improved Eigenfeature Regularization for Face Identification

Title Improved Eigenfeature Regularization for Face Identification
Authors Bappaditya Mandal
Abstract In this work, we propose to divide each class (a person) into subclasses using spatial partition trees which helps in better capturing the intra-personal variances arising from the appearances of the same individual. We perform a comprehensive analysis on within-class and within-subclass eigenspectrums of face images and propose a novel method of eigenspectrum modeling which extracts discriminative features of faces from both within-subclass and total or between-subclass scatter matrices. Effective low-dimensional face discriminative features are extracted for face recognition (FR) after performing discriminant evaluation in the entire eigenspace. Experimental results on popular face databases (AR, FERET) and the challenging unconstrained YouTube Face database show the superiority of our proposed approach on all three databases.
Tasks Face Identification, Face Recognition
Published 2016-02-10
URL http://arxiv.org/abs/1602.03256v1
PDF http://arxiv.org/pdf/1602.03256v1.pdf
PWC https://paperswithcode.com/paper/improved-eigenfeature-regularization-for-face
Repo
Framework

Automatic recognition of child speech for robotic applications in noisy environments

Title Automatic recognition of child speech for robotic applications in noisy environments
Authors Samuel Fernando, Roger K. Moore, David Cameron, Emily C. Collins, Abigail Millings, Amanda J. Sharkey, Tony J. Prescott
Abstract Automatic speech recognition (ASR) allows a natural and intuitive interface for robotic educational applications for children. However, there are a number of challenges to overcome to allow such an interface to operate robustly in realistic settings, including the intrinsic difficulties of recognising child speech and high levels of background noise often present in classrooms. As part of the EU EASEL project we have provided several contributions to address these challenges, implementing our own ASR module for use in robotics applications. We used the latest deep neural network algorithms, which provide a leap in performance over the traditional GMM approach, and applied data augmentation methods to improve robustness to noise and speaker variation. We provide a close integration between the ASR module and the rest of the dialogue system, allowing the ASR to receive in real-time the language models relevant to the current section of the dialogue, greatly improving the accuracy. We integrated our ASR module into an interactive, multimodal system using a small humanoid robot to help children learn about exercise and energy. The system was installed at a public museum event as part of a research study where 320 children (aged 3 to 14) interacted with the robot, with our ASR achieving 90% accuracy for fluent and near-fluent speech.
Tasks Data Augmentation, Speech Recognition
Published 2016-11-08
URL http://arxiv.org/abs/1611.02695v1
PDF http://arxiv.org/pdf/1611.02695v1.pdf
PWC https://paperswithcode.com/paper/automatic-recognition-of-child-speech-for
Repo
Framework

A New Method for Classification of Datasets for Data Mining

Title A New Method for Classification of Datasets for Data Mining
Authors Singh Vijendra, Hemjyotsana Parashar, Nisha Vasudeva
Abstract The decision tree is an important method for both induction research and data mining, and is mainly used for model classification and prediction. The ID3 algorithm is the most widely used decision tree algorithm so far. In this paper, the shortcoming of ID3's tendency to choose attributes with many values is discussed, and then a new decision tree algorithm, an improved version of ID3, is proposed. In our proposed algorithm, attributes are divided into groups and then we apply the selection measure 5 for these groups. If the information gain is not good, the attribute values are again divided into groups. These steps are repeated until we get a good classification/misclassification ratio. The proposed algorithm classifies the data sets more accurately and efficiently.
Tasks
Published 2016-12-01
URL http://arxiv.org/abs/1612.00151v1
PDF http://arxiv.org/pdf/1612.00151v1.pdf
PWC https://paperswithcode.com/paper/a-new-method-for-classification-of-datasets
Repo
Framework
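
The shortcoming of ID3 mentioned in the abstract, its preference for attributes with many values, can be seen directly from the information gain measure that standard ID3 uses. The Python sketch below computes entropy and information gain for a tiny hypothetical dataset; the attribute names (`row_id`, `weather`) and the data are invented for illustration, and the paper's grouping procedure is not reproduced here.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting `labels` by attribute `values`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
row_id = [1, 2, 3, 4]                      # unique per row, carries no real signal
weather = ["sun", "sun", "rain", "sun"]    # a genuine two-valued attribute

print(information_gain(row_id, labels))
print(information_gain(weather, labels))
```

Because every `row_id` value isolates a single example, each subset is pure and the gain equals the full entropy of the labels, so plain information gain would pick this uninformative attribute over `weather`. Grouping attribute values, as the abstract describes, is one way to reduce the number of branches such an attribute can create.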

Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

Title Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Authors Junhua Mao, Jiajing Xu, Yushi Jing, Alan Yuille
Abstract In this paper, we focus on training and evaluating effective word embeddings with both text and visual information. More specifically, we introduce a large-scale dataset with 300 million sentences describing over 40 million images crawled and downloaded from publicly available Pins (i.e. an image with sentence descriptions uploaded by users) on Pinterest. This dataset is more than 200 times larger than MS COCO, the standard large-scale image dataset with sentence descriptions. In addition, we construct an evaluation dataset to directly assess the effectiveness of word embeddings in terms of finding semantically similar or related words and phrases. The word/phrase pairs in this evaluation dataset are collected from the click data with millions of users in an image search system, thus contain rich semantic relationships. Based on these datasets, we propose and compare several Recurrent Neural Networks (RNNs) based multimodal (text and image) models. Experiments show that our model benefits from incorporating the visual information into the word embeddings, and a weight sharing strategy is crucial for learning such multimodal embeddings. The project page is: http://www.stat.ucla.edu/~junhua.mao/multimodal_embedding.html
Tasks Image Retrieval, Word Embeddings
Published 2016-11-24
URL http://arxiv.org/abs/1611.08321v1
PDF http://arxiv.org/pdf/1611.08321v1.pdf
PWC https://paperswithcode.com/paper/training-and-evaluating-multimodal-word
Repo
Framework

A Distributional Semantics Approach to Implicit Language Learning

Title A Distributional Semantics Approach to Implicit Language Learning
Authors Dimitrios Alikaniotis, John N. Williams
Abstract In the present paper we show that distributional information is particularly important when considering concept availability under implicit language learning conditions. Based on results from different behavioural experiments we argue that the implicit learnability of semantic regularities depends on the degree to which the relevant concept is reflected in language use. In our simulations, we train a Vector-Space model on either an English or a Chinese corpus and then feed the resulting representations to a feed-forward neural network. The task of the neural network was to find a mapping between the word representations and the novel words. Using datasets from four behavioural experiments, which used different semantic manipulations, we were able to obtain learning patterns very similar to those obtained by humans.
Tasks
Published 2016-06-29
URL http://arxiv.org/abs/1606.09058v1
PDF http://arxiv.org/pdf/1606.09058v1.pdf
PWC https://paperswithcode.com/paper/a-distributional-semantics-approach-to
Repo
Framework