Paper Group ANR 12
Temporal-Spatial Mapping for Action Recognition
Title | Temporal-Spatial Mapping for Action Recognition |
Authors | Xiaolin Song, Cuiling Lan, Wenjun Zeng, Junliang Xing, Jingyu Yang, Xiaoyan Sun |
Abstract | Deep learning models have enjoyed great success for image related computer vision tasks like image classification and object detection. For video related tasks like human action recognition, however, the advancements are not as significant yet. The main challenge is the lack of effective and efficient models in modeling the rich temporal spatial information in a video. We introduce a simple yet effective operation, termed Temporal-Spatial Mapping (TSM), for capturing the temporal evolution of the frames by jointly analyzing all the frames of a video. We propose a video level 2D feature representation by transforming the convolutional features of all frames to a 2D feature map, referred to as VideoMap. With each row being the vectorized feature representation of a frame, the temporal-spatial features are compactly represented, while the temporal dynamic evolution is also well embedded. Based on the VideoMap representation, we further propose a temporal attention model within a shallow convolutional neural network to efficiently exploit the temporal-spatial dynamics. The experiment results show that the proposed scheme achieves the state-of-the-art performance, with 4.2% accuracy gain over Temporal Segment Network (TSN), a competing baseline method, on the challenging human action benchmark dataset HMDB51. |
Tasks | Action Recognition In Videos, Image Classification, Object Detection, Temporal Action Localization |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03669v1 |
http://arxiv.org/pdf/1809.03669v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-spatial-mapping-for-action |
Repo | |
Framework | |
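The VideoMap idea described in the abstract above (one row per frame, each row a vectorized frame feature) can be illustrated with a minimal sketch. The function name `build_video_map`, the toy 2x2 feature maps, and the row-major flattening order are assumptions made for illustration; the abstract does not specify the backbone network or the vectorization order.

```python
from typing import List

FeatureMap = List[List[float]]


def build_video_map(frame_features: List[FeatureMap]) -> List[List[float]]:
    """Flatten each frame's feature map into one row and stack rows in temporal order."""
    video_map = []
    for feature_map in frame_features:
        # Row-major flattening is an assumption; any fixed order would work.
        row = [value for feature_row in feature_map for value in feature_row]
        video_map.append(row)
    return video_map


# Three hypothetical frames, each with a toy 2x2 feature map.
frames = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
    [[0.9, 1.0], [1.1, 1.2]],
]
video_map = build_video_map(frames)
# Rows correspond to frames, columns to per-frame features.
print(len(video_map), len(video_map[0]))
```

The resulting 2D array has time along one axis and features along the other, which is what lets a 2D convolutional network see temporal evolution as spatial structure.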
Towards the Design of Aerostat Wind Turbine Arrays through AI
Title | Towards the Design of Aerostat Wind Turbine Arrays through AI |
Authors | Larry Bull, Neil Phillips |
Abstract | A new form of aerostat wind generation system which contains an array of interacting turbines is proposed. The design of the balloon turbine components is undertaken through the combination of artificial intelligence and rapid prototyping techniques such that the need for highly accurate models/simulations of the lift and wake dynamics is removed/reduced. Initial small-scale wind tunnel testing to determine design and algorithmic fundamentals will be presented. |
Tasks | |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05290v1 |
http://arxiv.org/pdf/1811.05290v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-the-design-of-aerostat-wind-turbine |
Repo | |
Framework | |
Greedy Active Learning Algorithm for Logistic Regression Models
Title | Greedy Active Learning Algorithm for Logistic Regression Models |
Authors | Hsiang-Ling Hsu, Yuan-Chin Ivan Chang, Ray-Bing Chen |
Abstract | We study a logistic model-based active learning procedure for binary classification problems, in which we adopt a batch subject selection strategy with a modified sequential experimental design method. Alongside the proposed subject selection scheme, we simultaneously conduct a greedy variable selection procedure so that we can update the classification model with all labeled training subjects. The proposed algorithm repeatedly performs both subject and variable selection steps until a predefined stopping criterion is reached. Our numerical results show that the proposed procedure has competitive performance, with a smaller training size and a more compact model, compared with a classifier trained with all variables and the full data set. We also apply the proposed procedure to a well-known wave data set (Breiman et al., 1984) to confirm the performance of our method. |
Tasks | Active Learning |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00243v1 |
http://arxiv.org/pdf/1802.00243v1.pdf | |
PWC | https://paperswithcode.com/paper/greedy-active-learning-algorithm-for-logistic |
Repo | |
Framework | |
A novel hybrid score level and decision level fusion scheme for cancelable multi-biometric verification
Title | A novel hybrid score level and decision level fusion scheme for cancelable multi-biometric verification |
Authors | Rudresh Dwivedi, Somnath Dey |
Abstract | In spite of the benefits of biometric-based authentication systems, a few concerns have been raised because of the sensitivity of biometric data to outliers, low performance caused by intra-class variations, and privacy invasion caused by information leakage. To address these issues, we propose a hybrid fusion framework where only the protected modalities are combined to fulfill the requirement of secrecy and performance improvement. This paper presents a method to integrate cancelable modalities utilizing mean-closure weighting (MCW) score level and Dempster-Shafer (DS) theory based decision level fusion for iris and fingerprint to mitigate the limitations in the individual score or decision fusion mechanisms. The proposed hybrid fusion scheme incorporates the similarity scores from different matchers corresponding to each protected modality. The individual scores obtained from different matchers for each modality are combined using the MCW score fusion method. The MCW technique achieves the optimal weight for each matcher involved in the score computation. Further, DS theory is applied to the induced scores to output the final decision. Rigorous experimental evaluations on three virtual databases indicate that the proposed hybrid fusion framework outperforms the component level or individual fusion methods (score level and decision level fusion). As a result, we achieve (48%,66%), (72%,86%) and (49%,38%) of performance improvement over unimodal cancelable iris and unimodal cancelable fingerprint verification systems for Virtual_A, Virtual_B and Virtual_C databases, respectively. Also, the proposed method is robust to the variability of scores and outliers, satisfying the requirement of secure authentication. |
Tasks | |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.10433v1 |
http://arxiv.org/pdf/1805.10433v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-hybrid-score-level-and-decision-level |
Repo | |
Framework | |
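The decision-level step in the abstract above relies on Dempster-Shafer theory. A minimal sketch of Dempster's rule of combination over a two-hypothesis frame follows. The mass values for the iris and fingerprint matchers are hypothetical, and the MCW weighting step is not shown because the abstract does not define it; this only illustrates the generic combination rule, not the authors' full pipeline.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions with Dempster's rule of combination."""
    combined: Mass = {}
    conflict = 0.0
    for (set_b, mass_b), (set_c, mass_c) in product(m1.items(), m2.items()):
        intersection = set_b & set_c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise so the remaining masses sum to one.
    return {key: value / (1.0 - conflict) for key, value in combined.items()}


genuine = frozenset({"genuine"})
impostor = frozenset({"impostor"})
either = genuine | impostor  # mass left uncommitted between the two hypotheses

# Hypothetical belief masses derived from two protected modalities.
iris_mass = {genuine: 0.7, impostor: 0.1, either: 0.2}
fingerprint_mass = {genuine: 0.6, impostor: 0.2, either: 0.2}

fused = dempster_combine(iris_mass, fingerprint_mass)
print(fused[genuine], fused[impostor], fused[either])
```

When both sources lean towards the same hypothesis, the fused mass on that hypothesis is higher than either source's own mass, which is the behaviour a decision-level fusion scheme exploits.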
Sharpen Focus: Learning with Attention Separability and Consistency
Title | Sharpen Focus: Learning with Attention Separability and Consistency |
Authors | Lezi Wang, Ziyan Wu, Srikrishna Karanam, Kuan-Chuan Peng, Rajat Vikram Singh, Bo Liu, Dimitris N. Metaxas |
Abstract | Recent developments in gradient-based attention modeling have seen attention maps emerge as a powerful tool for interpreting convolutional neural networks. Despite good localization for an individual class of interest, these techniques produce attention maps with substantially overlapping responses among different classes, leading to the problem of visual confusion and the need for discriminative attention. In this paper, we address this problem by means of a new framework that makes class-discriminative attention a principled part of the learning process. Our key innovations include new learning objectives for attention separability and cross-layer consistency, which result in improved attention discriminability and reduced visual confusion. Extensive experiments on image classification benchmarks show the effectiveness of our approach in terms of improved classification accuracy, including CIFAR-100 (+3.33%), Caltech-256 (+1.64%), ILSVRC2012 (+0.92%), CUB-200-2011 (+4.8%) and PASCAL VOC2012 (+5.73%). |
Tasks | Image Classification |
Published | 2018-11-19 |
URL | https://arxiv.org/abs/1811.07484v3 |
https://arxiv.org/pdf/1811.07484v3.pdf | |
PWC | https://paperswithcode.com/paper/reducing-visual-confusion-with-discriminative |
Repo | |
Framework | |
Progressive Data Science: Potential and Challenges
Title | Progressive Data Science: Potential and Challenges |
Authors | Cagatay Turkay, Nicola Pezzotti, Carsten Binnig, Hendrik Strobelt, Barbara Hammer, Daniel A. Keim, Jean-Daniel Fekete, Themis Palpanas, Yunhai Wang, Florin Rusu |
Abstract | Data science requires time-consuming iterative manual activities. In particular, activities such as data selection, preprocessing, transformation, and mining depend heavily on iterative trial-and-error processes that could be sped up significantly by providing quick feedback on the impact of changes. The idea of progressive data science is to compute the results of changes in a progressive manner, returning a first approximation of results quickly and allowing iterative refinements until converging to a final result. Enabling the user to interact with the intermediate results allows early detection of erroneous or suboptimal choices, the guided definition of modifications to the pipeline, and their quick assessment. In this paper, we discuss the progressiveness challenges arising in different steps of the data science pipeline. We describe how changes in each step of the pipeline impact the subsequent steps and outline why progressive data science will help to make the process more effective. Computing progressive approximations of outcomes resulting from changes creates numerous research challenges, especially if the changes are made in the early steps of the pipeline. We discuss these challenges and outline first steps towards progressiveness, which, we argue, will ultimately help to significantly speed up the overall data science process. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.08032v2 |
https://arxiv.org/pdf/1812.08032v2.pdf | |
PWC | https://paperswithcode.com/paper/progressive-data-science-potential-and |
Repo | |
Framework | |
Complementary Attributes: A New Clue to Zero-Shot Learning
Title | Complementary Attributes: A New Clue to Zero-Shot Learning |
Authors | Xiaofeng Xu, Ivor W. Tsang, Chuancai Liu |
Abstract | Zero-shot learning (ZSL) aims to recognize unseen objects using disjoint seen objects via shared attributes. The generalization performance of ZSL is governed by the attributes, which transfer semantic information from seen classes to unseen classes. To take full advantage of the knowledge transferred by attributes, in this paper, we introduce the notion of complementary attributes (CA), as a supplement to the original attributes, to enhance the semantic representation ability. Theoretical analyses demonstrate that complementary attributes can improve the PAC-style generalization bound of the original ZSL model. Since the proposed CA focuses on enhancing the semantic representation, CA can be easily applied to any existing attribute-based ZSL method, including the label-embedding strategy based ZSL (LEZSL) and the probability-prediction strategy based ZSL (PPZSL). In PPZSL, there is a strong assumption that all the attributes are independent of each other, which is arguably unrealistic in practice. To solve this problem, a novel rank aggregation framework is proposed to circumvent the assumption. Extensive experiments on five ZSL benchmark datasets and the large-scale ImageNet dataset demonstrate that the proposed complementary attributes and rank aggregation can significantly and robustly improve existing ZSL methods and achieve state-of-the-art performance. |
Tasks | Style Generalization, Zero-Shot Learning |
Published | 2018-04-17 |
URL | https://arxiv.org/abs/1804.06505v2 |
https://arxiv.org/pdf/1804.06505v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-learning-with-complementary |
Repo | |
Framework | |
A Note about: Local Explanation Methods for Deep Neural Networks lack Sensitivity to Parameter Values
Title | A Note about: Local Explanation Methods for Deep Neural Networks lack Sensitivity to Parameter Values |
Authors | Mukund Sundararajan, Ankur Taly |
Abstract | Local explanation methods, also known as attribution methods, attribute a deep network’s prediction to its input (cf. Baehrens et al. (2010)). We respond to the claim from Adebayo et al. (2018) that local explanation methods lack sensitivity, i.e., DNNs with randomly-initialized weights produce explanations that are both visually and quantitatively similar to those produced by DNNs with learned weights. Further investigation reveals that their findings are due to two choices in their analysis: (a) ignoring the signs of the attributions; and (b) for integrated gradients (IG), including pixels in their analysis that have zero attributions by choice of the baseline (an auxiliary input relative to which the attributions are computed). When both factors are accounted for, IG attributions for a random network and the actual network are uncorrelated. Our investigation also sheds light on how these issues affect visualizations, although we note that more work is needed to understand how viewers interpret the difference between the random and the actual attributions. |
Tasks | |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.04205v1 |
http://arxiv.org/pdf/1806.04205v1.pdf | |
PWC | https://paperswithcode.com/paper/a-note-about-local-explanation-methods-for |
Repo | |
Framework | |
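The role of the baseline in integrated gradients (IG), mentioned in the abstract above, can be made concrete with a small numerical sketch. The toy model, its hand-written gradient, and the midpoint Riemann approximation are assumptions chosen for illustration; real use would rely on automatic differentiation of an actual network.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 100,
) -> List[float]:
    """Approximate IG along the straight path from baseline to x (midpoint rule)."""
    average_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            average_grad[i] += g / steps
    # Scale the path-averaged gradient by the input's distance from the baseline.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, average_grad)]


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical differentiable model standing in for a network output."""
    return x[0] * x[1] + x[2] ** 2


def toy_model_grad(x: Sequence[float]) -> List[float]:
    """Analytic gradient of toy_model."""
    return [x[1], x[0], 2.0 * x[2]]


inputs = [1.0, 2.0, 0.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model_grad, inputs, baseline)
print(attributions)
```

The third feature equals its baseline value, so its attribution is exactly zero regardless of the model. Features like this are the "zero attributions by choice of the baseline" that the note says should be excluded when comparing attributions. The attributions also sum to `toy_model(inputs) - toy_model(baseline)`, the completeness property of IG, and they keep their signs, which matters for the note's first point.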
Complex and Quaternionic Principal Component Pursuit and Its Application to Audio Separation
Title | Complex and Quaternionic Principal Component Pursuit and Its Application to Audio Separation |
Authors | Tak-Shing T. Chan, Yi-Hsuan Yang |
Abstract | Recently, the principal component pursuit has received increasing attention in signal processing research ranging from source separation to video surveillance. So far, all existing formulations are real-valued and lack the concept of phase, which is inherent in inputs such as complex spectrograms or color images. Thus, in this letter, we extend principal component pursuit to the complex and quaternionic cases to account for the missing phase information. Specifically, we present both complex and quaternionic proximity operators for the $\ell_1$- and trace-norm regularizers. These operators can be used in conjunction with proximal minimization methods such as the inexact augmented Lagrange multiplier algorithm. The new algorithms are then applied to the singing voice separation problem, which aims to separate the singing voice from the instrumental accompaniment. Results on the iKala and MSD100 datasets confirmed the usefulness of phase information in principal component pursuit. |
Tasks | |
Published | 2018-01-09 |
URL | http://arxiv.org/abs/1801.03816v1 |
http://arxiv.org/pdf/1801.03816v1.pdf | |
PWC | https://paperswithcode.com/paper/complex-and-quaternionic-principal-component |
Repo | |
Framework | |
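For the complex case, the proximity operator of the $\ell_1$ norm is commonly written as a soft threshold that shrinks the magnitude and keeps the phase. The sketch below shows that standard elementwise form; whether it matches the paper's exact formulation is an assumption, and the quaternionic and trace-norm operators are not shown.

```python
def complex_soft_threshold(z: complex, lam: float) -> complex:
    """Shrink the magnitude of z by lam while preserving its phase."""
    magnitude = abs(z)
    if magnitude <= lam:
        # Entries no larger than the threshold are set to zero.
        return 0j
    return z * (1.0 - lam / magnitude)


# Hypothetical spectrogram bin with magnitude 5; thresholding by 1 leaves magnitude 4.
shrunk = complex_soft_threshold(3 + 4j, 1.0)
print(shrunk, abs(shrunk))
```

Because the result is a real, non-negative multiple of the input, the phase is untouched, which is the information a purely real-valued formulation on magnitudes would discard.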
MEETING BOT: Reinforcement Learning for Dialogue Based Meeting Scheduling
Title | MEETING BOT: Reinforcement Learning for Dialogue Based Meeting Scheduling |
Authors | Vishwanath D, Lovekesh Vig, Gautam Shroff, Puneet Agarwal |
Abstract | In this paper we present Meeting Bot, a reinforcement learning based conversational system that interacts with multiple users to schedule meetings. The system is able to interpret user utterances and map them to preferred time slots, which are then fed to a reinforcement learning (RL) system with the goal of converging on an agreeable time slot. The RL system is able to adapt to user preferences and environmental changes in meeting arrival rate while still scheduling effectively. Learning is performed via policy gradient with exploration, by utilizing an MLP as an approximator of the policy function. Results demonstrate that the system outperforms standard scheduling algorithms in terms of overall scheduling efficiency. Additionally, the system is able to adapt its strategy to situations when users consistently reject or accept meetings in certain slots (such as Friday afternoon versus Thursday morning), or when the meeting is called by members who are at a more senior designation. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.11158v1 |
http://arxiv.org/pdf/1812.11158v1.pdf | |
PWC | https://paperswithcode.com/paper/meeting-bot-reinforcement-learning-for |
Repo | |
Framework | |
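The policy-gradient learning mentioned in the abstract above can be illustrated with a single REINFORCE-style update. This sketch swaps the paper's MLP policy for a plain softmax over per-slot logits, and the three slots, reward, and learning rate are hypothetical; it shows the update rule only, not the authors' system.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(
    logits: List[float], action: int, reward: float, learning_rate: float = 0.1
) -> List[float]:
    """One gradient-ascent step on reward * log pi(action) for a softmax policy."""
    probs = softmax(logits)
    # d/d logit_i of log pi(action) is one_hot(action)_i - pi_i.
    return [
        logit + learning_rate * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Three hypothetical time slots, initially equally likely.
slot_logits = [0.0, 0.0, 0.0]
# Suppose proposing slot 1 was accepted (reward +1).
updated_logits = reinforce_step(slot_logits, action=1, reward=1.0)
print(softmax(updated_logits))
```

A positive reward raises the probability of the proposed slot and a negative reward lowers it, which is how a policy of this kind could drift away from slots that users consistently reject.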
Analysing Results from AI Benchmarks: Key Indicators and How to Obtain Them
Title | Analysing Results from AI Benchmarks: Key Indicators and How to Obtain Them |
Authors | Fernando Martínez-Plumed, José Hernández-Orallo |
Abstract | Item response theory (IRT) can be applied to the analysis of the evaluation of results from AI benchmarks. The two-parameter IRT model provides two indicators (difficulty and discrimination) on the side of the item (or AI problem) while only one indicator (ability) on the side of the respondent (or AI agent). In this paper we analyse how to make this set of indicators dual, by adding a fourth indicator, generality, on the side of the respondent. Generality is meant to be dual to discrimination, and it is based on difficulty. Namely, generality is defined as a new metric that evaluates whether an agent is consistently good at easy problems and bad at difficult ones. With the addition of generality, we see that this set of four key indicators can give us more insight on the results of AI benchmarks. In particular, we explore two popular benchmarks in AI, the Arcade Learning Environment (Atari 2600 games) and the General Video Game AI competition. We provide some guidelines to estimate and interpret these indicators for other AI benchmarks and competitions. |
Tasks | Atari Games |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08186v2 |
http://arxiv.org/pdf/1811.08186v2.pdf | |
PWC | https://paperswithcode.com/paper/analysing-results-from-ai-benchmarks-key |
Repo | |
Framework | |
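The two-parameter IRT model referred to in the abstract above is usually written as a logistic function of ability, difficulty, and discrimination. A minimal sketch, with hypothetical parameter values and a function name chosen for illustration:

```python
import math


def two_pl_probability(ability: float, difficulty: float, discrimination: float) -> float:
    """Probability that a respondent (AI agent) succeeds on an item (AI problem)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# An agent whose ability equals the item's difficulty succeeds half the time.
print(two_pl_probability(ability=0.0, difficulty=0.0, discrimination=1.0))
# Higher discrimination makes the curve steeper around the difficulty.
print(two_pl_probability(ability=1.0, difficulty=0.0, discrimination=2.0))
```

Difficulty and discrimination belong to the item, while ability belongs to the respondent; the generality indicator proposed in the paper is not part of this standard model and is not sketched here.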
On the Generalizability of Linear and Non-Linear Region of Interest-Based Multivariate Regression Models for fMRI Data
Title | On the Generalizability of Linear and Non-Linear Region of Interest-Based Multivariate Regression Models for fMRI Data |
Authors | Ethan C. Jackson, James Alexander Hughes, Mark Daley |
Abstract | In contrast to conventional, univariate analysis, various types of multivariate analysis have been applied to functional magnetic resonance imaging (fMRI) data. In this paper, we compare two contemporary approaches for multivariate regression on task-based fMRI data: linear regression with ridge regularization and non-linear symbolic regression using genetic programming. The data for this project is representative of a contemporary fMRI experimental design for visual stimuli. Linear and non-linear models were generated for 10 subjects, with another 4 withheld for validation. Model quality is evaluated by comparing $R$ scores (Pearson product-moment correlation) in various contexts, including single run self-fit, within-subject generalization, and between-subject generalization. Propensity for modelling strategies to overfit is estimated using a separate resting state scan. Results suggest that neither method is objectively or inherently better than the other. |
Tasks | |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.02423v1 |
http://arxiv.org/pdf/1802.02423v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-generalizability-of-linear-and-non |
Repo | |
Framework | |
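The evaluation described in the abstract above scores models by the Pearson correlation $R$ between predictions and measurements. The sketch below computes $R$ and, as a deliberately simplified stand-in for the multivariate case, a single-feature ridge slope without an intercept. The data and the penalty value are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def ridge_slope(xs: Sequence[float], ys: Sequence[float], penalty: float) -> float:
    """Closed-form ridge solution for one feature and no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


signal = [1.0, 2.0, 3.0]
response = [2.0, 4.0, 6.0]
slope = ridge_slope(signal, response, penalty=14.0)
predictions = [slope * x for x in signal]
print(slope, pearson_r(predictions, response))
```

The penalty shrinks the slope towards zero, yet $R$ is unchanged by a positive rescaling of the predictions, so correlation-based scores do not by themselves reveal how strongly a model has been regularized.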
Generating Attention from Classifier Activations for Fine-grained Recognition
Title | Generating Attention from Classifier Activations for Fine-grained Recognition |
Authors | Wei Shen, Rujie Liu |
Abstract | Recent advances in fine-grained recognition utilize attention maps to localize objects of interest. Although there are many ways to generate attention maps, most of them rely on sophisticated loss functions or complex training processes. In this work, we propose a simple and straightforward attention generation model based on the output activations of classifiers. The advantage of our model is that it can be easily trained with image level labels and softmax loss functions. More specifically, multiple linear local classifiers are firstly adopted to perform fine-grained classification at each location of high level CNN feature maps. The attention map is generated by aggregating and max-pooling the output activations. Then the attention map serves as a surrogate target object mask to train those local classifiers, similar to training models for semantic segmentation. Our model achieves state-of-the-art results on three heavily benchmarked datasets, i.e. 87.9% on CUB-200-2011 dataset, 94.1% on Stanford Cars dataset and 92.1% on FGVC-Aircraft dataset, demonstrating its effectiveness on fine-grained recognition tasks. |
Tasks | Semantic Segmentation |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10770v1 |
http://arxiv.org/pdf/1811.10770v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-attention-from-classifier |
Repo | |
Framework | |
An Initial Attempt of Combining Visual Selective Attention with Deep Reinforcement Learning
Title | An Initial Attempt of Combining Visual Selective Attention with Deep Reinforcement Learning |
Authors | Liu Yuezhang, Ruohan Zhang, Dana H. Ballard |
Abstract | Visual attention serves as a feature selection mechanism in the perceptual system. Motivated by Broadbent’s leaky filter model of selective attention, we evaluate how such a mechanism could be implemented and how it affects the learning process of deep reinforcement learning. We visualize and analyze the feature maps of DQN on a toy problem, Catch, and propose an approach to combine visual selective attention with deep reinforcement learning. We experiment with optical flow-based attention and A2C on Atari games. Experimental results show that visual selective attention could lead to improvements in sample efficiency on the tested games. An intriguing relation between attention and batch normalization is also discovered. |
Tasks | Atari Games, Feature Selection, Optical Flow Estimation |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04407v1 |
http://arxiv.org/pdf/1811.04407v1.pdf | |
PWC | https://paperswithcode.com/paper/an-initial-attempt-of-combining-visual |
Repo | |
Framework | |
A Survey on Data Collection for Machine Learning: a Big Data – AI Integration Perspective
Title | A Survey on Data Collection for Machine Learning: a Big Data – AI Integration Perspective |
Authors | Yuji Roh, Geon Heo, Steven Euijong Whang |
Abstract | Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are largely two reasons data collection has recently become a critical issue. First, as machine learning is becoming more widely-used, we are seeing new applications that do not necessarily have enough labeled data. Second, unlike traditional machine learning, deep learning techniques automatically generate features, which saves feature engineering costs, but in return may require larger amounts of labeled data. Interestingly, recent research in data collection comes not only from the machine learning, natural language, and computer vision communities, but also from the data management community due to the importance of handling large amounts of data. In this survey, we perform a comprehensive study of data collection from a data management point of view. Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models. We provide a research landscape of these operations, provide guidelines on which technique to use when, and identify interesting research challenges. The integration of machine learning and data management for data collection is part of a larger trend of Big data and Artificial Intelligence (AI) integration and opens many opportunities for new research. |
Tasks | Feature Engineering |
Published | 2018-11-08 |
URL | https://arxiv.org/abs/1811.03402v2 |
https://arxiv.org/pdf/1811.03402v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-data-collection-for-machine |
Repo | |
Framework | |