Paper Group ANR 152
Video Salient Object Detection Using Spatiotemporal Deep Features. Discovering Visual Concept Structure with Sparse and Incomplete Tags. Multi-modal analysis of genetically-related subjects using SIFT descriptors in brain MRI. Mining Object Parts from CNNs via Active Question-Answering. Pre-training Attention Mechanisms. Learning to Detect Human-Ob …
Video Salient Object Detection Using Spatiotemporal Deep Features
Title | Video Salient Object Detection Using Spatiotemporal Deep Features |
Authors | Trung-Nghia Le, Akihiro Sugimoto |
Abstract | This paper presents a method for detecting salient objects in videos where temporal information in addition to spatial information is fully taken into account. Following recent reports on the advantage of deep features over conventional hand-crafted features, we propose a new set of SpatioTemporal Deep (STD) features that utilize local and global contexts over frames. We also propose new SpatioTemporal Conditional Random Field (STCRF) to compute saliency from STD features. STCRF is our extension of CRF to the temporal domain and describes the relationships among neighboring regions both in a frame and over frames. STCRF leads to temporally consistent saliency maps over frames, contributing to the accurate detection of salient objects’ boundaries and noise reduction during detection. Our proposed method first segments an input video into multiple scales and then computes a saliency map at each scale level using STD features with STCRF. The final saliency map is computed by fusing saliency maps at different scale levels. Our experiments, using publicly available benchmark datasets, confirm that the proposed method significantly outperforms state-of-the-art methods. We also applied our saliency computation to the video object segmentation task, showing that our method outperforms existing video object segmentation methods. |
Tasks | Object Detection, Salient Object Detection, Semantic Segmentation, Video Object Segmentation, Video Salient Object Detection, Video Semantic Segmentation |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01447v3 |
PDF | http://arxiv.org/pdf/1708.01447v3.pdf |
PWC | https://paperswithcode.com/paper/video-salient-object-detection-using |
Repo | |
Framework | |
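The final fusion step described in the abstract (per-scale saliency maps combined into one map) can be illustrated with a minimal numpy sketch. The uniform weighting, bilinear upsampling, and toy map sizes below are assumptions for illustration only, not the paper's STCRF pipeline.

```python
import numpy as np
from scipy.ndimage import zoom

def fuse_multiscale_saliency(maps, target_shape, weights=None):
    """Resize per-scale saliency maps to a common resolution and blend them."""
    if weights is None:
        weights = np.full(len(maps), 1.0 / len(maps))   # uniform fusion weights
    fused = np.zeros(target_shape, dtype=np.float64)
    for w, m in zip(weights, maps):
        zy, zx = target_shape[0] / m.shape[0], target_shape[1] / m.shape[1]
        up = zoom(m, (zy, zx), order=1)                 # bilinear upsampling
        fused += w * up[:target_shape[0], :target_shape[1]]
    return np.clip(fused, 0.0, 1.0)

# toy usage: three segmentation scales of the same frame (integer scale ratios assumed)
maps = [np.random.rand(30, 40), np.random.rand(60, 80), np.random.rand(120, 160)]
final_map = fuse_multiscale_saliency(maps, target_shape=(120, 160))
```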
Discovering Visual Concept Structure with Sparse and Incomplete Tags
Title | Discovering Visual Concept Structure with Sparse and Incomplete Tags |
Authors | Jingya Wang, Xiatian Zhu, Shaogang Gong |
Abstract | Automatically discovering the semantic structure of tagged visual data (e.g. web videos and images) is important for visual data analysis and interpretation, enabling machine intelligence to effectively process the fast-growing amount of multi-media data. However, this is non-trivial because it requires jointly learning the underlying correlations between heterogeneous visual and tag data, and the task is made more challenging by inherently sparse and incomplete tags. In this work, we develop a method for modelling the inherent visual data concept structures based on a novel Hierarchical-Multi-Label Random Forest model capable of correlating structured visual and tag information so as to more accurately interpret the visual semantics, e.g. disclosing meaningful visual groups with similar high-level concepts, and recovering missing tags for individual visual data samples. Specifically, our model exploits hierarchically structured tags of different semantic abstractness and multiple tag statistical correlations in addition to modelling visual and tag interactions. As a result, our model discovers more accurate semantic correlations between textual tags and visual features, and provides favourable visual semantics interpretation even with highly sparse and incomplete tags. We demonstrate the advantages of our proposed approach in two fundamental applications, visual data clustering and missing tag completion, on benchmark video (i.e. TRECVID MED 2011) and image (i.e. NUS-WIDE) datasets. |
Tasks | |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10659v1 |
PDF | http://arxiv.org/pdf/1705.10659v1.pdf |
PWC | https://paperswithcode.com/paper/discovering-visual-concept-structure-with |
Repo | |
Framework | |
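The paper's Hierarchical-Multi-Label Random Forest is not reproduced here, but the missing-tag-completion application can be sketched with a plain multi-label random forest as a stand-in. Everything below (feature matrix, tag matrix, split sizes) is a synthetic placeholder.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))                  # visual features (placeholder)
Y = (rng.random((200, 10)) < 0.2).astype(int)   # sparse binary tag matrix (placeholder)

# Fit a multi-label random forest on samples with observed tags, then rank
# candidate tags for the remaining samples by predicted probability.
# (Assumes every tag occurs at least once in the training split.)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:150], Y[:150])
proba = rf.predict_proba(X[150:])               # one (n_samples, 2) array per tag
scores = np.column_stack([p[:, 1] for p in proba])
top2_missing_tags = np.argsort(-scores, axis=1)[:, :2]
```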
Multi-modal analysis of genetically-related subjects using SIFT descriptors in brain MRI
Title | Multi-modal analysis of genetically-related subjects using SIFT descriptors in brain MRI |
Authors | Kuldeep Kumar, Laurent Chauvin, Mathew Toews, Olivier Colliot, Christian Desrosiers |
Abstract | So far, fingerprinting studies have focused on identifying features from single-modality MRI data, which capture individual characteristics in terms of brain structure, function, or white matter microstructure. However, due to the lack of a framework for comparing across multiple modalities, studies based on multi-modal data remain elusive. This paper presents a multi-modal analysis of genetically-related subjects to compare and contrast the information provided by various MRI modalities. The proposed framework represents MRI scans as bags of SIFT features, and uses these features in a nearest-neighbor graph to measure subject similarity. Experiments using the T1/T2-weighted MRI and diffusion MRI data of 861 Human Connectome Project subjects demonstrate strong links between the proposed similarity measure and genetic proximity. |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06151v1 |
PDF | http://arxiv.org/pdf/1709.06151v1.pdf |
PWC | https://paperswithcode.com/paper/multi-modal-analysis-of-genetically-related |
Repo | |
Framework | |
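A minimal sketch of the similarity pipeline from the abstract: each subject becomes a bag-of-features histogram, and subjects are linked through a nearest-neighbor graph. The descriptors here are random stand-ins for SIFT features extracted from MRI volumes, and the codebook quantization is an illustrative simplification.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
# Mock local descriptors: each subject contributes a variable-size set of 128-D vectors.
subject_descs = [rng.normal(size=(int(rng.integers(80, 120)), 128)) for _ in range(30)]

# Quantize all descriptors into a small visual vocabulary.
codebook = KMeans(n_clusters=32, n_init=10, random_state=0).fit(np.vstack(subject_descs))

def bof_histogram(descs, k=32):
    hist = np.bincount(codebook.predict(descs), minlength=k).astype(float)
    return hist / hist.sum()

X = np.array([bof_histogram(d) for d in subject_descs])

# 5-nearest-neighbor graph over subjects; edge weights act as the similarity measure.
nn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(X)
subject_graph = nn.kneighbors_graph(X, mode="distance")
```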
Mining Object Parts from CNNs via Active Question-Answering
Title | Mining Object Parts from CNNs via Active Question-Answering |
Authors | Quanshi Zhang, Ruiming Cao, Ying Nian Wu, Song-Chun Zhu |
Abstract | Given a convolutional neural network (CNN) that is pre-trained for object classification, this paper proposes to use active question-answering to semanticize neural patterns in conv-layers of the CNN and mine part concepts. For each part concept, we mine neural patterns in the pre-trained CNN, which are related to the target part, and use these patterns to construct an And-Or graph (AOG) to represent a four-layer semantic hierarchy of the part. As an interpretable model, the AOG associates different CNN units with different explicit object parts. We use an active human-computer communication to incrementally grow such an AOG on the pre-trained CNN as follows. We allow the computer to actively identify objects, whose neural patterns cannot be explained by the current AOG. Then, the computer asks human about the unexplained objects, and uses the answers to automatically discover certain CNN patterns corresponding to the missing knowledge. We incrementally grow the AOG to encode new knowledge discovered during the active-learning process. In experiments, our method exhibits high learning efficiency. Our method uses about 1/6-1/3 of the part annotations for training, but achieves similar or better part-localization performance than fast-RCNN methods. |
Tasks | Active Learning, Object Classification, Question Answering |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03173v1 |
PDF | http://arxiv.org/pdf/1704.03173v1.pdf |
PWC | https://paperswithcode.com/paper/mining-object-parts-from-cnns-via-active |
Repo | |
Framework | |
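The human-in-the-loop procedure sketched in the abstract (find objects the current model cannot explain, ask, fold the answers back in) reduces to a generic active query loop. The scoring function, the `ask_human` oracle, and the model update below are placeholders, not the paper's AOG machinery.

```python
import random

def active_qa_loop(objects, explanation_score, ask_human, update_model,
                   rounds=5, batch_size=3):
    """Generic active question-answering loop over a pool of objects."""
    for _ in range(rounds):
        # Actively pick the objects the current model explains worst.
        worst = sorted(objects, key=explanation_score)[:batch_size]
        answers = [(obj, ask_human(obj)) for obj in worst]
        update_model(answers)            # e.g. grow the part model with new knowledge

# toy usage with dummy callables
objs = list(range(20))
scores = {o: random.random() for o in objs}

def update(answers):
    for o, _ in answers:
        scores[o] += 0.5                 # pretend the answer improved the explanation

active_qa_loop(objs, explanation_score=lambda o: scores[o],
               ask_human=lambda o: f"part-label-for-{o}", update_model=update)
```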
Pre-training Attention Mechanisms
Title | Pre-training Attention Mechanisms |
Authors | Jack Lindsey |
Abstract | Recurrent neural networks with differentiable attention mechanisms have had success in generative and classification tasks. We show that the classification performance of such models can be enhanced by guiding a randomly initialized model to attend to salient regions of the input in early training iterations. We further show that, if explicit heuristics for guidance are unavailable, a model that is pretrained on an unsupervised reconstruction task can discover good attention policies without supervision. We demonstrate that increased efficiency of the attention mechanism itself contributes to these performance improvements. Based on these insights, we introduce bootstrapped glimpse mimicking, a simple, theoretically task-general method of more effectively training attention models. Our work draws inspiration from and parallels results on human learning of attention. |
Tasks | |
Published | 2017-12-15 |
URL | http://arxiv.org/abs/1712.05652v1 |
PDF | http://arxiv.org/pdf/1712.05652v1.pdf |
PWC | https://paperswithcode.com/paper/pre-training-attention-mechanisms |
Repo | |
Framework | |
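A compact sketch of the glimpse-mimicking idea as read from the abstract: a randomly initialized attention network is guided to reproduce the glimpse locations of a reference network, e.g. one pretrained on unsupervised reconstruction. The architecture, sizes, and training data below are placeholders, not the paper's recurrent attention model.

```python
import torch
import torch.nn as nn

class GlimpseNet(nn.Module):
    """Predicts a 2-D glimpse location in [-1, 1] for a flattened input image."""
    def __init__(self, in_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, 2))
    def forward(self, x):
        return torch.tanh(self.net(x))

teacher = GlimpseNet()   # assumed to hold weights from reconstruction pre-training
student = GlimpseNet()   # randomly initialized model being guided
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for _ in range(100):
    x = torch.randn(32, 784)              # stand-in for a batch of input images
    with torch.no_grad():
        target_loc = teacher(x)           # where the pretrained model would attend
    loss = nn.functional.mse_loss(student(x), target_loc)
    opt.zero_grad(); loss.backward(); opt.step()
```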
Learning to Detect Human-Object Interactions
Title | Learning to Detect Human-Object Interactions |
Authors | Yu-Wei Chao, Yunfan Liu, Xieyang Liu, Huayi Zeng, Jia Deng |
Abstract | We study the problem of detecting human-object interactions (HOI) in static images, defined as predicting a human and an object bounding box with an interaction class label that connects them. HOI detection is a fundamental problem in computer vision as it provides semantic information about the interactions among the detected objects. We introduce HICO-DET, a new large benchmark for HOI detection, by augmenting the current HICO classification benchmark with instance annotations. To solve the task, we propose Human-Object Region-based Convolutional Neural Networks (HO-RCNN). At the core of our HO-RCNN is the Interaction Pattern, a novel DNN input that characterizes the spatial relations between two bounding boxes. Experiments on HICO-DET demonstrate that our HO-RCNN, by exploiting human-object spatial relations through Interaction Patterns, significantly improves the performance of HOI detection over baseline approaches. |
Tasks | Human-Object Interaction Detection |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05448v2 |
PDF | http://arxiv.org/pdf/1702.05448v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-detect-human-object-interactions |
Repo | |
Framework | |
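The Interaction Pattern (a DNN input encoding the spatial relation between the human and object boxes) can be sketched as a two-channel binary map rasterized inside the boxes' joint enclosing window. The exact normalization and resolution used by HO-RCNN may differ; this is an illustrative construction.

```python
import numpy as np

def interaction_pattern(human_box, object_box, size=64):
    """Two-channel binary map: channel 0 marks the human box, channel 1 the object box,
    both drawn inside their joint enclosing window scaled to a fixed size x size grid."""
    x1 = min(human_box[0], object_box[0]); y1 = min(human_box[1], object_box[1])
    x2 = max(human_box[2], object_box[2]); y2 = max(human_box[3], object_box[3])
    w, h = max(x2 - x1, 1e-6), max(y2 - y1, 1e-6)

    pattern = np.zeros((2, size, size), dtype=np.float32)
    for ch, (bx1, by1, bx2, by2) in enumerate([human_box, object_box]):
        c1, r1 = int((bx1 - x1) / w * size), int((by1 - y1) / h * size)
        c2, r2 = int(np.ceil((bx2 - x1) / w * size)), int(np.ceil((by2 - y1) / h * size))
        pattern[ch, r1:r2, c1:c2] = 1.0
    return pattern

ip = interaction_pattern(human_box=(10, 20, 60, 120), object_box=(50, 80, 140, 160))
print(ip.shape)   # (2, 64, 64)
```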
Analysis of Italian Word Embeddings
Title | Analysis of Italian Word Embeddings |
Authors | Rocco Tripodi, Stefano Li Pira |
Abstract | In this work we analyze the performance of two of the most widely used word embedding algorithms, skip-gram and continuous bag of words, on the Italian language. These algorithms have many hyper-parameters that have to be carefully tuned in order to obtain accurate word representations in vector space. We provide a detailed analysis and evaluation, showing which configurations of parameters work best for specific tasks. |
Tasks | Word Embeddings |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08783v4 |
PDF | http://arxiv.org/pdf/1707.08783v4.pdf |
PWC | https://paperswithcode.com/paper/analysis-of-italian-word-embeddings |
Repo | |
Framework | |
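The comparison described in the abstract (skip-gram vs. CBOW under different hyper-parameter settings) maps directly onto gensim's Word2Vec. The corpus, the parameter grid, and the quick neighbour check below are placeholders; parameter names follow gensim >= 4.

```python
from gensim.models import Word2Vec

# Placeholder corpus: each document is a list of Italian tokens.
corpus = [["il", "gatto", "dorme", "sul", "divano"],
          ["la", "gatta", "gioca", "in", "giardino"]] * 200

configs = [
    {"sg": 1, "vector_size": 100, "window": 5, "negative": 5},    # skip-gram
    {"sg": 0, "vector_size": 100, "window": 5, "negative": 5},    # CBOW
    {"sg": 1, "vector_size": 300, "window": 10, "negative": 10},  # larger skip-gram
]

models = {}
for cfg in configs:
    key = ("skip-gram" if cfg["sg"] else "cbow", cfg["vector_size"], cfg["window"])
    models[key] = Word2Vec(sentences=corpus, min_count=1, workers=2, epochs=5, **cfg)

# A real evaluation would score each configuration on analogy/similarity benchmarks;
# here we only inspect nearest neighbours of one word per model.
for key, m in models.items():
    print(key, m.wv.most_similar("gatto", topn=3))
```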
Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling
Title | Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling |
Authors | Dung Thai, Shikhar Murty, Trapit Bansal, Luke Vilnis, David Belanger, Andrew McCallum |
Abstract | In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTM) to form rich embedded representations of long-term input co-occurrence patterns. Representation of output co-occurrence patterns is typically limited to a hand-designed graphical model, such as a linear-chain CRF representing short-term Markov dependencies among successive labels. This paper presents a method that learns embedded representations of latent output structure in sequence data. Our model takes the form of a finite-state machine with a large number of latent states per label (a latent variable CRF), where the state-transition matrix is factorized—effectively forming an embedded representation of state-transitions capable of enforcing long-term label dependencies, while supporting exact Viterbi inference over output labels. We demonstrate accuracy improvements and interpretable latent structure in a synthetic but complex task based on CoNLL named entity recognition. |
Tasks | Named Entity Recognition |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00553v1 |
PDF | http://arxiv.org/pdf/1708.00553v1.pdf |
PWC | https://paperswithcode.com/paper/low-rank-hidden-state-embeddings-for-viterbi |
Repo | |
Framework | |
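The central idea (a latent-variable chain whose transition matrix is a low-rank factorization, decoded exactly with Viterbi) fits in a short numpy sketch. The scores below are random placeholders rather than learned LSTM emissions, and the latent-to-label mapping is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, rank, T = 12, 3, 8                 # latent states, factorization rank, sequence length

# Low-rank transition scores: A = U V^T instead of a full n_states x n_states matrix.
U = rng.normal(size=(n_states, rank))
V = rng.normal(size=(n_states, rank))
log_trans = U @ V.T                          # unnormalized log transition potentials
log_emit = rng.normal(size=(T, n_states))    # stand-in for per-token emission scores

def viterbi(log_emit, log_trans):
    T, S = log_emit.shape
    delta = np.zeros((T, S)); back = np.zeros((T, S), dtype=int)
    delta[0] = log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # scores[prev, next]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

print(viterbi(log_emit, log_trans))          # most likely latent state sequence
```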
Riemannian tangent space mapping and elastic net regularization for cost-effective EEG markers of brain atrophy in Alzheimer’s disease
Title | Riemannian tangent space mapping and elastic net regularization for cost-effective EEG markers of brain atrophy in Alzheimer’s disease |
Authors | Wolfgang Fruehwirt, Matthias Gerstgrasser, Pengfei Zhang, Leonard Weydemann, Markus Waser, Reinhold Schmidt, Thomas Benke, Peter Dal-Bianco, Gerhard Ransmayr, Dieter Grossegger, Heinrich Garn, Gareth W. Peters, Stephen Roberts, Georg Dorffner |
Abstract | The diagnosis of Alzheimer’s disease (AD) in routine clinical practice is most commonly based on subjective clinical interpretations. Quantitative electroencephalography (QEEG) measures have been shown to reflect neurodegenerative processes in AD and might qualify as affordable and thereby widely available markers to facilitate the objectivization of AD assessment. Here, we present a novel framework combining Riemannian tangent space mapping and elastic net regression for the development of brain atrophy markers. While most AD QEEG studies are based on small sample sizes and psychological test scores as outcome measures, here we train and test our models using data of one of the largest prospective EEG AD trials ever conducted, including MRI biomarkers of brain atrophy. |
Tasks | EEG |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08359v1 |
PDF | http://arxiv.org/pdf/1711.08359v1.pdf |
PWC | https://paperswithcode.com/paper/riemannian-tangent-space-mapping-and-elastic |
Repo | |
Framework | |
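A minimal sketch of the two stages named in the title: map covariance matrices into a tangent space and regress a target with an elastic net. For simplicity the reference point is the Euclidean mean of the covariances (the Riemannian geometric mean is more common), and the covariances and target values are synthetic placeholders rather than EEG data.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, inv
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)

def random_spd(n):                           # random symmetric positive-definite matrix
    a = rng.normal(size=(n, n))
    return a @ a.T + n * np.eye(n)

covs = np.array([random_spd(8) for _ in range(60)])   # stand-in for per-recording EEG covariances
y = rng.normal(size=60)                                # stand-in for an MRI atrophy marker

# Tangent-space mapping at the reference point C_ref:
#   S_i = logm(C_ref^{-1/2} C_i C_ref^{-1/2}), then vectorize the upper triangle.
C_ref = covs.mean(axis=0)
W = np.real(inv(sqrtm(C_ref)))
iu = np.triu_indices(8)
X = np.array([np.real(logm(W @ C @ W))[iu] for C in covs])

# Elastic net regression from tangent-space features to the atrophy target.
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(model.score(X, y))
```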
Tweeting AI: Perceptions of AI-Tweeters (AIT) vs Expert AI-Tweeters (EAIT)
Title | Tweeting AI: Perceptions of AI-Tweeters (AIT) vs Expert AI-Tweeters (EAIT) |
Authors | Lydia Manikonda, Cameron Dudley, Subbarao Kambhampati |
Abstract | With the recent advancements in Artificial Intelligence (AI), various organizations and individuals have started debating whether the progress of AI is a blessing or a curse for the future of society. This paper investigates how the public perceives the progress of AI by utilizing the data shared on Twitter. Specifically, it performs a comparative analysis of the understanding of users from two categories – general AI-Tweeters (AIT) and expert AI-Tweeters (EAIT) who share posts about AI on Twitter. Our analysis revealed that users from both categories express distinct emotions and interests towards AI. Users from both categories regard AI as positive and are optimistic about its progress, but the experts are more negative than the general AI-Tweeters. Characterization of users showed that 'London' is the most popular location from which users tweet about AI. Tweets posted by AIT are retweeted more than posts made by EAIT, which indicates greater diffusion of information from AIT. |
Tasks | |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08389v2 |
PDF | http://arxiv.org/pdf/1704.08389v2.pdf |
PWC | https://paperswithcode.com/paper/tweeting-ai-perceptions-of-ai-tweeters-ait-vs |
Repo | |
Framework | |
SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models
Title | SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models |
Authors | Robert Östling, Johannes Bjerva |
Abstract | This paper describes the Stockholm University/University of Groningen (SU-RUG) system for the SIGMORPHON 2017 shared task on morphological inflection. Our system is based on an attentional sequence-to-sequence neural network model using Long Short-Term Memory (LSTM) cells, with joint training of morphological inflection and the inverse transformation, i.e. lemmatization and morphological analysis. Our system outperforms the baseline with a large margin, and our submission ranks as the 4th best team for the track we participate in (task 1, high-resource). |
Tasks | Lemmatization, Morphological Analysis, Morphological Inflection |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03499v1 |
PDF | http://arxiv.org/pdf/1706.03499v1.pdf |
PWC | https://paperswithcode.com/paper/su-rug-at-the-conll-sigmorphon-2017-shared |
Repo | |
Framework | |
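The attention step of such an attentional sequence-to-sequence model can be isolated in a few lines. This is a generic additive (Bahdanau-style) attention in numpy with random weights, not the submitted system's exact architecture.

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, Wd, We, v):
    """Score each encoder state against the decoder state, softmax the scores,
    and return the attention-weighted context vector."""
    scores = np.tanh(encoder_states @ We + decoder_state @ Wd) @ v   # shape (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ encoder_states, weights                         # context, weights

rng = np.random.default_rng(0)
T, h = 7, 16                                  # source length (e.g. lemma characters), hidden size
enc = rng.normal(size=(T, h))                 # encoder states over the input characters
dec = rng.normal(size=h)                      # current decoder state
ctx, attn = additive_attention(dec, enc,
                               Wd=rng.normal(size=(h, h)),
                               We=rng.normal(size=(h, h)),
                               v=rng.normal(size=h))
print(attn.round(3), ctx.shape)
```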
OMNIRank: Risk Quantification for P2P Platforms with Deep Learning
Title | OMNIRank: Risk Quantification for P2P Platforms with Deep Learning |
Authors | Honglun Zhang, Haiyang Wang, Xiaming Chen, Yongkun Wang, Yaohui Jin |
Abstract | P2P lending is an innovative and flexible alternative to conventional lending institutions like banks, where lenders and borrowers transact directly and benefit each other without complicated verifications. However, due to the lack of specialized laws, delegated monitoring and effective management, P2P platforms may spawn potential risks, such as withdrawal failures, investigation involvement and even runaway bosses, which cause great losses to lenders and are especially serious and notorious in China. Although abundant public information and data related to P2P platforms are available on the Internet, challenges of multi-sourcing and heterogeneity remain. In this paper, we propose a novel deep learning model, OMNIRank, which comprehends multi-dimensional features of P2P platforms for risk quantification and produces scores for ranking. We first construct a large-scale, flexible crawling framework and obtain large amounts of multi-source heterogeneous data on domestic P2P platforms dating back to 2007. Data-cleaning steps such as duplicate and noise removal, null handling, format unification and fusion are applied to improve data quality. We then extract deep features of P2P platforms via text comprehension, topic modeling, knowledge graphs and sentiment analysis, which are delivered as inputs to OMNIRank. Finally, based on the rankings generated by OMNIRank, we build rich data visualizations and interactions, providing lenders with comprehensive information support, decision suggestions and safety guarantees. |
Tasks | Reading Comprehension, Sentiment Analysis |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1705.03497v1 |
PDF | http://arxiv.org/pdf/1705.03497v1.pdf |
PWC | https://paperswithcode.com/paper/omnirank-risk-quantification-for-p2p |
Repo | |
Framework | |
Generic Tubelet Proposals for Action Localization
Title | Generic Tubelet Proposals for Action Localization |
Authors | Jiawei He, Mostafa S. Ibrahim, Zhiwei Deng, Greg Mori |
Abstract | We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which can generate generic, class-independent, video-level tubelet proposals in videos. The generated tubelet proposals can be utilized in various video analysis tasks, including recognizing and localizing actions in videos. In particular, we integrate these generic tubelet proposals into a unified temporal deep network for action classification. Compared with other methods, our generic tubelet proposal method is accurate, general, and is fully differentiable under a smoothL1 loss function. We demonstrate the performance of our algorithm on the standard UCF-Sports, J-HMDB21, and UCF-101 datasets. Our class-independent TPN outperforms other tubelet generation methods, and our unified temporal deep network achieves state-of-the-art localization results on all three datasets. |
Tasks | Action Classification, Action Localization |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10861v1 |
PDF | http://arxiv.org/pdf/1705.10861v1.pdf |
PWC | https://paperswithcode.com/paper/generic-tubelet-proposals-for-action |
Repo | |
Framework | |
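The abstract notes that the proposals are fully differentiable under a smooth L1 loss; the standard smooth L1 (Huber-style) definition is short enough to write out. The `beta = 1` threshold is the common default, assumed here.

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic for residuals below beta, linear above it."""
    diff = np.abs(pred - target)
    return np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta).mean()

# toy usage on box-regression offsets
pred = np.array([0.1, 1.4, -2.0, 0.0])
target = np.array([0.0, 1.0, 0.5, 0.0])
print(smooth_l1(pred, target))
```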
DeepDeath: Learning to Predict the Underlying Cause of Death with Big Data
Title | DeepDeath: Learning to Predict the Underlying Cause of Death with Big Data |
Authors | Hamid Reza Hassanzadeh, Ying Sha, May D. Wang |
Abstract | Multiple cause-of-death data provides a valuable source of information that can be used to enhance health standards by predicting health related trajectories in societies with large populations. These data are often available in large quantities across U.S. states and require Big Data techniques to uncover complex hidden patterns. We design two different classes of models suitable for large-scale analysis of mortality data, a Hadoop-based ensemble of random forests trained over N-grams, and the DeepDeath, a deep classifier based on the recurrent neural network (RNN). We apply both classes to the mortality data provided by the National Center for Health Statistics and show that while both perform significantly better than the random classifier, the deep model that utilizes long short-term memory networks (LSTMs), surpasses the N-gram based models and is capable of learning the temporal aspect of the data without a need for building ad-hoc, expert-driven features. |
Tasks | |
Published | 2017-05-06 |
URL | http://arxiv.org/abs/1705.03508v1 |
PDF | http://arxiv.org/pdf/1705.03508v1.pdf |
PWC | https://paperswithcode.com/paper/deepdeath-learning-to-predict-the-underlying |
Repo | |
Framework | |
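A compact stand-in for the recurrent branch described in the abstract: an LSTM classifier over sequences of integer-coded conditions predicting the underlying cause class. Vocabulary size, sequence length, and class count are placeholders.

```python
import torch
import torch.nn as nn

class CauseOfDeathLSTM(nn.Module):
    """Embed a sequence of coded conditions and classify the underlying cause of death."""
    def __init__(self, vocab=2000, emb=64, hidden=128, n_classes=40):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, codes):                  # codes: (batch, seq_len) integer tensor
        _, (h, _) = self.lstm(self.emb(codes))
        return self.out(h[-1])                 # logits over underlying-cause classes

model = CauseOfDeathLSTM()
batch = torch.randint(1, 2000, (8, 12))        # 8 records, 12 listed conditions each
loss = nn.functional.cross_entropy(model(batch), torch.randint(0, 40, (8,)))
loss.backward()
```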
Outlier-robust moment-estimation via sum-of-squares
Title | Outlier-robust moment-estimation via sum-of-squares |
Authors | Pravesh K. Kothari, David Steurer |
Abstract | We develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers. The guarantees of our algorithms improve in many cases significantly over the best previous ones, obtained in recent works of Diakonikolas et al, Lai et al, and Charikar et al. We also show that the guarantees of our algorithms match information-theoretic lower-bounds for the class of distributions we consider. These improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers. Our algorithms are based on a standard sum-of-squares relaxation of the following conceptually-simple optimization problem: Among all distributions whose moments are bounded in the same way as for the unknown distribution, find the one that is closest in statistical distance to the empirical distribution of the adversarially-corrupted sample. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11581v2 |
PDF | http://arxiv.org/pdf/1711.11581v2.pdf |
PWC | https://paperswithcode.com/paper/outlier-robust-moment-estimation-via-sum-of |
Repo | |
Framework | |
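The "conceptually-simple optimization problem" the abstract refers to can be written out explicitly; the notation below (total-variation distance, moment bounds over multi-indices of degree at most k) is a paraphrase of the abstract, not the paper's exact formulation.

```latex
\min_{D'} \; d_{\mathrm{TV}}\!\bigl(D', \hat{D}_n\bigr)
\quad \text{subject to} \quad
\bigl|\mathbb{E}_{x \sim D'}[x^{\alpha}]\bigr| \le B_{|\alpha|}
\quad \text{for all multi-indices } |\alpha| \le k,
```

where $\hat{D}_n$ is the empirical distribution of the adversarially corrupted sample and the bounds $B_{|\alpha|}$ mirror those assumed for the unknown distribution; the algorithms then solve a standard sum-of-squares relaxation of this program.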