May 6, 2019

3344 words 16 mins read

Paper Group ANR 437

Learning Hough Regression Models via Bridge Partial Least Squares for Object Detection

Title Learning Hough Regression Models via Bridge Partial Least Squares for Object Detection
Authors Jianyu Tang, Hanzi Wang, Yan Yan
Abstract Popular Hough Transform-based object detection approaches usually construct an appearance codebook by clustering local image features. However, how to choose appropriate values for the parameters used in the clustering step remains an open problem. Moreover, some popular histogram features extracted from overlapping image blocks may cause a high degree of redundancy and multicollinearity. In this paper, we propose a novel Hough Transform-based object detection approach. First, to address the above issues, we exploit a Bridge Partial Least Squares (BPLS) technique to establish context-encoded Hough Regression Models (HRMs), which are linear regression models that cast probabilistic Hough votes to predict object locations. BPLS is an efficient variant of Partial Least Squares (PLS). PLS-based regression techniques (including BPLS) can reduce the redundancy and eliminate the multicollinearity of a feature set, and the appropriate value of the only parameter used in PLS (i.e., the number of latent components) can be determined by a cross-validation procedure. Second, to efficiently handle object scale changes, we propose a novel multi-scale voting scheme in which multiple Hough images corresponding to multiple object scales can be obtained simultaneously. Third, an object in a test image may correspond to multiple true and false positive hypotheses at different scales. Based on the proposed multi-scale voting scheme, a principled strategy is proposed to fuse hypotheses and reduce false positives by evaluating the normalized pointwise mutual information between hypotheses. In the experiments, we also compare the proposed HRM approach with several of its variants to evaluate the influence of its components on performance. Experimental results show that the proposed HRM approach achieves desirable performance on popular benchmark datasets.
Tasks Object Detection
Published 2016-03-26
URL http://arxiv.org/abs/1603.08092v1
PDF http://arxiv.org/pdf/1603.08092v1.pdf
PWC https://paperswithcode.com/paper/learning-hough-regression-models-via-bridge
Repo
Framework
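The BPLS variant itself is not spelled out here, but the PLS machinery it builds on is easy to sketch. Below is a minimal single-response PLS (NIPALS) regression in NumPy, shown on a deliberately collinear design where ordinary least squares is ill-posed; the number of latent components (the one PLS parameter the abstract mentions) would in practice be chosen by cross-validation. This is an illustrative sketch, not the authors' BPLS.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Minimal single-response PLS (NIPALS). Returns coefficients and means
    so that predictions are y_mean + (X_new - x_mean) @ B."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                     # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xk @ w                        # latent score
        tt = t @ t
        p = Xk.T @ t / tt                 # X loading
        qk = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)          # deflate: remove explained part
        yk = yk - qk * t
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original X space
    return B, x_mean, y_mean

def pls1_predict(X, B, x_mean, y_mean):
    return y_mean + (X - x_mean) @ B

# Collinear design: the 4th column duplicates the 1st, so OLS is ill-posed,
# but PLS with 3 latent components handles the redundancy.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X = np.hstack([X, X[:, :1]])
y = X @ np.array([1.0, 2.0, 3.0, 0.0])
B, xm, ym = pls1_fit(X, y, n_components=3)
err = np.max(np.abs(pls1_predict(X, B, xm, ym) - y))
```

With three latent components the redundant fourth column is absorbed into the latent scores and the training fit is essentially exact, which is the multicollinearity-handling property the abstract relies on.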

When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment

Title When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment
Authors Honglin Zheng, Tianlang Chen, Jiebo Luo
Abstract Sentiment analysis is crucial for extracting social signals from social media content. Due to the prevalence of images in social media, image sentiment analysis has received increasing attention in recent years. However, most existing systems are black boxes that do not provide insight into how image content invokes sentiment and emotion in viewers. Psychological studies have confirmed that salient objects in an image often invoke emotions. In this work, we investigate a more fine-grained and more comprehensive interaction between visual saliency and visual sentiment. In particular, we partition images along several primary scene-type dimensions: open-closed, natural-manmade, indoor-outdoor, and face-noface. Using state-of-the-art saliency detection and sentiment classification algorithms, we examine how the sentiment of the salient region(s) in an image relates to the overall sentiment of the image. Experiments on a representative image emotion dataset show interesting correlations between saliency and sentiment in different scene types, and in turn shed light on the mechanism of visual sentiment evocation.
Tasks Saliency Detection, Sentiment Analysis
Published 2016-11-14
URL http://arxiv.org/abs/1611.04636v1
PDF http://arxiv.org/pdf/1611.04636v1.pdf
PWC https://paperswithcode.com/paper/when-saliency-meets-sentiment-understanding
Repo
Framework

Learning Network of Multivariate Hawkes Processes: A Time Series Approach

Title Learning Network of Multivariate Hawkes Processes: A Time Series Approach
Authors Jalal Etesami, Negar Kiyavash, Kun Zhang, Kushagra Singhal
Abstract Learning the influence structure of multiple time series is of great interest to many disciplines. This paper studies the problem of recovering the causal structure of a network of multivariate linear Hawkes processes. In such processes, the occurrence of an event in one process affects the probability of occurrence of new events in other processes; thus, a natural notion of causality exists between the processes, captured by the support of the excitation matrix. We show that the resulting causal influence network is equivalent to the Directed Information Graph (DIG) of the processes, which encodes the causal factorization of their joint distribution. Furthermore, we present an algorithm for learning the support of the excitation matrix (or, equivalently, the DIG). The performance of the algorithm is evaluated on synthesized multivariate Hawkes networks as well as real-world stock market and MemeTracker datasets.
Tasks Time Series
Published 2016-03-14
URL http://arxiv.org/abs/1603.04319v1
PDF http://arxiv.org/pdf/1603.04319v1.pdf
PWC https://paperswithcode.com/paper/learning-network-of-multivariate-hawkes
Repo
Framework
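The paper's algorithm works through directed information; as a much simpler illustration of the underlying idea (the causal network is the support of the excitation matrix), the sketch below simulates a discrete-time approximation of a multivariate Hawkes process and recovers the support by thresholded least squares on exponentially decayed event histories. All constants (decay, threshold, horizon) are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T, beta = 3, 20000, 0.5                # processes, time steps, decay rate
mu = np.full(d, 0.05)                     # baseline intensities
A = np.array([[0.0, 0.3, 0.0],            # excitation matrix: entry (i, j) is
              [0.0, 0.0, 0.3],            # the influence of process j on i
              [0.3, 0.0, 0.0]])

# Simulate: each intensity is linear in an exponentially decayed event history.
h = np.zeros(d)
events, hist = np.zeros((T, d)), np.zeros((T, d))
for t in range(T):
    hist[t] = h
    lam = np.clip(mu + A @ h, 0.0, 1.0)   # conditional event probability
    events[t] = rng.random(d) < lam
    h = np.exp(-beta) * h + events[t]

# Estimate (mu, A) jointly by least squares of events on [1, history],
# then threshold to recover the support, i.e. the causal network.
X = np.hstack([np.ones((T, 1)), hist])
coef, *_ = np.linalg.lstsq(X, events, rcond=None)
A_hat = coef[1:].T                        # estimated excitation matrix
support = A_hat > 0.1
```

On this subcritical toy network the thresholded estimate recovers the directed edges; the paper's directed-information approach handles the continuous-time setting properly.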

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

Title Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
Authors Youbao Tang, Xiangqian Wu
Abstract This paper proposes a novel saliency detection method that combines region-level saliency estimation and pixel-level saliency prediction with CNNs (denoted CRPSD). For pixel-level saliency prediction, a fully convolutional neural network (the pixel-level CNN) is constructed by modifying the VGGNet architecture to perform multi-scale feature learning, based on which an image-to-image prediction is conducted to accomplish pixel-level saliency detection. For region-level saliency estimation, an adaptive superpixel-based region generation technique is first designed to partition an image into regions, after which the region-level saliency is estimated using a CNN model (the region-level CNN). The pixel-level and region-level saliencies are fused to form the final saliency map by another CNN (the fusion CNN), and the pixel-level CNN and fusion CNN are jointly learned. Extensive quantitative and qualitative experiments on four public benchmark datasets demonstrate that the proposed method greatly outperforms state-of-the-art saliency detection approaches.
Tasks Saliency Detection, Saliency Prediction
Published 2016-08-18
URL http://arxiv.org/abs/1608.05186v1
PDF http://arxiv.org/pdf/1608.05186v1.pdf
PWC https://paperswithcode.com/paper/saliency-detection-via-combining-region-level
Repo
Framework

Sensor-based Gait Parameter Extraction with Deep Convolutional Neural Networks

Title Sensor-based Gait Parameter Extraction with Deep Convolutional Neural Networks
Authors Julius Hannink, Thomas Kautz, Cristian F. Pasluosta, Karl-Günter Gaßmann, Jochen Klucken, Bjoern M. Eskofier
Abstract Measurement of stride-related, biomechanical parameters is the common rationale for objective gait impairment scoring. State-of-the-art double integration approaches to extract these parameters from inertial sensor data are, however, limited in their clinical applicability due to the underlying assumptions. To overcome this, we present a method to translate the abstract information provided by wearable sensors to context-related expert features based on deep convolutional neural networks. For mobile gait analysis, this enables integration-free and data-driven extraction of a set of 8 spatio-temporal stride parameters. To this end, two modelling approaches are compared: a combined network estimating all parameters of interest, and an ensemble approach that spawns less complex networks for each parameter individually. The ensemble approach outperforms the combined modelling in the current application. On a clinically relevant and publicly available benchmark dataset, we estimate stride length, width and medio-lateral change in foot angle up to ${-0.15\pm6.09}$ cm, ${-0.09\pm4.22}$ cm and ${0.13 \pm 3.78^\circ}$ respectively. Stride, swing and stance time as well as heel and toe contact times are estimated up to ${\pm 0.07}$, ${\pm0.05}$, ${\pm 0.07}$, ${\pm0.07}$ and ${\pm0.12}$ s respectively. These results are comparable to, and in parts outperform or define, the state of the art. Our results further indicate that the proposed change in methodology could substitute assumption-driven double-integration methods and enable mobile assessment of spatio-temporal stride parameters in clinically critical situations, e.g. in the case of spastic gait impairments.
Tasks
Published 2016-09-12
URL http://arxiv.org/abs/1609.03323v3
PDF http://arxiv.org/pdf/1609.03323v3.pdf
PWC https://paperswithcode.com/paper/sensor-based-gait-parameter-extraction-with
Repo
Framework

The Role of Context Selection in Object Detection

Title The Role of Context Selection in Object Detection
Authors Ruichi Yu, Xi Chen, Vlad I. Morariu, Larry S. Davis
Abstract We investigate the reasons why context in object detection has limited utility by isolating and evaluating the predictive power of different context cues under ideal conditions in which context is provided by an oracle. Based on this study, we propose a region-based context re-scoring method with dynamic context selection to remove noise and emphasize informative context. We introduce latent indicator variables to select (or ignore) potential contextual regions, and learn the selection strategy with a latent SVM. We conduct experiments to evaluate the performance of the proposed context selection method on the SUN RGB-D dataset. The method achieves a significant improvement in mean average precision (mAP) compared with both appearance-based detectors and a conventional context model without the selection scheme.
Tasks Object Detection
Published 2016-09-09
URL http://arxiv.org/abs/1609.02948v1
PDF http://arxiv.org/pdf/1609.02948v1.pdf
PWC https://paperswithcode.com/paper/the-role-of-context-selection-in-object
Repo
Framework
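The inference step of the proposed context selection can be caricatured in a few lines: given learned context weights, the latent indicators simply keep the regions whose contribution raises the detection score. The weights and features below are invented for illustration; learning them with a latent SVM is the part the paper actually addresses.

```python
import numpy as np

def rescore_detection(det_score, context_feats, w):
    """Re-score a detection using dynamically selected context regions.
    context_feats: (n_regions, n_feats); w: learned context weights.
    Latent indicators z_i in {0, 1} are set to maximize the re-scored value,
    i.e. a region is selected only if its contribution is positive."""
    contrib = context_feats @ w           # per-region context evidence
    z = contrib > 0                       # latent selection (test-time inference)
    return det_score + contrib[z].sum(), z

# Toy example: two informative regions, one noisy region that would hurt.
w = np.array([1.0, -0.5])
feats = np.array([[2.0, 1.0],            # contribution 1.5 -> selected
                  [0.5, 0.2],            # contribution 0.4 -> selected
                  [0.1, 3.0]])           # contribution -1.4 -> ignored
score, z = rescore_detection(0.8, feats, w)
```

Ignoring the noisy third region is exactly the "remove noise and emphasize informative context" behaviour the selection scheme is designed for.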

Characterizing the Language of Online Communities and its Relation to Community Reception

Title Characterizing the Language of Online Communities and its Relation to Community Reception
Authors Trang Tran, Mari Ostendorf
Abstract This work investigates the style and topic aspects of language in online communities, looking at both their utility as identifiers of the community and their correlation with community reception of content. Style is characterized using a hybrid word and part-of-speech tag n-gram language model, while topic is represented using Latent Dirichlet Allocation. Experiments with several Reddit forums show that style is a better indicator of community identity than topic, even for communities organized around specific topics. Further, there is a positive correlation between the community's reception of a contribution and the contribution's style similarity to that community, but not its topic similarity.
Tasks Language Modelling
Published 2016-09-15
URL http://arxiv.org/abs/1609.04779v1
PDF http://arxiv.org/pdf/1609.04779v1.pdf
PWC https://paperswithcode.com/paper/characterizing-the-language-of-online
Repo
Framework
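The style-as-identifier idea can be sketched with a plain word-bigram language model per community (the paper uses hybrid word/POS n-grams for style and LDA for topic): a new comment is assigned to the community whose model gives it the lowest cross-entropy. The toy corpora below are invented.

```python
import math
from collections import Counter

def bigram_lm(sentences):
    """Build add-one-smoothed bigram counts from whitespace-tokenized text."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks[:-1], toks[1:]))
    vocab_size = len(set(uni) | {"</s>"})
    return uni, bi, vocab_size

def cross_entropy(sentence, lm):
    """Per-token negative log-likelihood of the sentence under the model."""
    uni, bi, V = lm
    toks = ["<s>"] + sentence.split() + ["</s>"]
    logp = sum(math.log((bi[(a, b)] + 1) / (uni[a] + V))  # add-one smoothing
               for a, b in zip(toks[:-1], toks[1:]))
    return -logp / (len(toks) - 1)

# Two toy "communities" with distinct styles (invented examples).
corpora = {
    "formal": ["i would argue that this is correct",
               "one might consider the following evidence"],
    "casual": ["lol this is so true", "omg i love this so much"],
}
models = {c: bigram_lm(sents) for c, sents in corpora.items()}
best = min(models, key=lambda c: cross_entropy("lol i love this", models[c]))
```

The test comment shares bigrams with the casual corpus, so that community's model assigns it the lower cross-entropy.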

Image and Depth from a Single Defocused Image Using Coded Aperture Photography

Title Image and Depth from a Single Defocused Image Using Coded Aperture Photography
Authors Mina Masoudifar, Hamid Reza Pourreza
Abstract Depth from defocus and defocus deblurring from a single image are two challenging problems that arise from the finite depth of field in conventional cameras. Coded aperture imaging is one of the techniques used to improve the results of both problems. Up to now, different methods have been proposed for improving the results of either defocus deblurring or depth estimation. In this paper, a multi-objective function is proposed for evaluating and designing aperture patterns with the aim of improving the results of both depth from defocus and defocus deblurring. Pattern evaluation is performed by considering the scene illumination condition and the camera system specification. Based on the proposed criteria, a single asymmetric pattern is designed and used to restore a sharp image and a depth map from a single input. Since the designed pattern is asymmetric, defocused objects on the two sides of the focal plane can be distinguished. Depth estimation is performed by a new algorithm, which is based on image quality assessment criteria and can distinguish between blurred objects lying in front of or behind the focal plane. Extensive simulations as well as experiments on a variety of real scenes are conducted to compare our aperture with previously proposed ones.
Tasks Deblurring, Depth Estimation, Image Quality Assessment
Published 2016-03-13
URL http://arxiv.org/abs/1603.04046v1
PDF http://arxiv.org/pdf/1603.04046v1.pdf
PWC https://paperswithcode.com/paper/image-and-depth-from-a-single-defocused-image
Repo
Framework

On clustering network-valued data

Title On clustering network-valued data
Authors Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin
Abstract Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being able to cluster within a network is important, there are emerging needs to cluster multiple networks. This is largely motivated by the routine collection of network data that are generated from potentially different populations. These networks may or may not have node correspondence. When node correspondence is present, we cluster networks by summarizing each network by its graphon estimate; when node correspondence is absent, we propose a novel solution for clustering such networks by associating a computationally feasible feature vector to each network based on traces of powers of the adjacency matrix. We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency.
Tasks Community Detection
Published 2016-06-08
URL http://arxiv.org/abs/1606.02401v3
PDF http://arxiv.org/pdf/1606.02401v3.pdf
PWC https://paperswithcode.com/paper/on-clustering-network-valued-data
Repo
Framework
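The node-correspondence-free representation is straightforward to sketch: map each graph to a short vector of normalized traces of adjacency powers, then cluster those vectors with any standard method. In the toy example below, sparse and dense Erdős–Rényi graphs separate cleanly; the normalization and the tiny 2-means routine are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def trace_features(A, max_power=4):
    """Feature vector of normalized traces of adjacency powers; needs no node
    correspondence. The n**k normalization is an illustrative choice."""
    n = A.shape[0]
    feats, Ak = [], np.eye(n)
    for k in range(1, max_power + 1):
        Ak = Ak @ A
        feats.append(np.trace(Ak) / n ** k)
    return np.array(feats)

def two_means(X, iters=10):
    """Tiny 2-means with deterministic init at the smallest/largest-norm rows."""
    order = np.argsort(np.linalg.norm(X, axis=1))
    c = np.stack([X[order[0]], X[order[-1]]])
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - c[None]) ** 2).sum(-1), axis=1)
        c = np.stack([X[labels == j].mean(axis=0) for j in (0, 1)])
    return labels

def er_graph(rng, n, p):
    """Erdos-Renyi graph: undirected, no self-loops."""
    A = np.triu(rng.random((n, n)) < p, 1).astype(float)
    return A + A.T

rng = np.random.default_rng(2)
graphs = ([er_graph(rng, 40, 0.1) for _ in range(5)] +
          [er_graph(rng, 40, 0.5) for _ in range(5)])
X = np.stack([trace_features(A) for A in graphs])
labels = two_means(X)
```

The five sparse graphs and five dense graphs land in separate clusters because their trace features differ by construction (roughly powers of the edge density).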

Personalized Donor-Recipient Matching for Organ Transplantation

Title Personalized Donor-Recipient Matching for Organ Transplantation
Authors Jinsung Yoon, Ahmed M. Alaa, Martin Cadeiras, Mihaela van der Schaar
Abstract Organ transplants can improve the life expectancy and quality of life of the recipient but carry the risk of serious post-operative complications, such as septic shock and organ rejection. The probability of a successful transplant depends in a very subtle fashion on compatibility between the donor and the recipient, but current medical practice is short of domain knowledge regarding the complex nature of recipient-donor compatibility. Hence, a data-driven approach to learning compatibility has the potential for significant improvements in match quality. This paper proposes a novel system (ConfidentMatch) that is trained using data from electronic health records. ConfidentMatch predicts the success of an organ transplant (in terms of the 3-year survival rate) on the basis of clinical and demographic traits of the donor and recipient. ConfidentMatch captures the heterogeneity of the donor and recipient traits by optimally dividing the feature space into clusters and constructing a different optimal predictive model for each cluster. The system controls the complexity of the learned predictive model in a way that allows for more granular and confident predictions for a larger number of potential recipient-donor pairs, thereby ensuring that predictions are "personalized" and tailored to individual characteristics to the finest possible granularity. Experiments conducted on the UNOS heart transplant dataset show the superiority of the prognostic value of ConfidentMatch over competing benchmarks; ConfidentMatch can provide predictions of success with 95% confidence for 5,489 patients of a total population of 9,620 patients, which corresponds to 410 more patients than the most competitive benchmark algorithm (DeepBoost).
Tasks
Published 2016-11-12
URL http://arxiv.org/abs/1611.03934v1
PDF http://arxiv.org/pdf/1611.03934v1.pdf
PWC https://paperswithcode.com/paper/personalized-donor-recipient-matching-for
Repo
Framework
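The core modelling idea, partitioning the feature space and fitting a separate predictor per cluster, can be illustrated on synthetic data where a single global linear model is mis-specified but per-cluster models fit well. The data, the split point, and the linear models below are invented stand-ins for the system's learned clusters and predictors.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def mse_linear(X, y, coef):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ coef - y) ** 2))

# Heterogeneous toy cohort: the feature-outcome relationship flips sign at 0,
# so one global linear predictor is badly mis-specified.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=(200, 1))
y = np.abs(x[:, 0])

global_mse = mse_linear(x, y, fit_linear(x, y))

# Partition the feature space (a fixed split standing in for learned clusters)
# and fit one model per cluster.
left = x[:, 0] < 0
parts = [(x[left], y[left]), (x[~left], y[~left])]
cluster_mse = float(np.mean([mse_linear(Xc, yc, fit_linear(Xc, yc))
                             for Xc, yc in parts]))
```

The per-cluster models fit each regime almost exactly, while the global model cannot, which is the heterogeneity argument behind ConfidentMatch's design.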

Control of Memory, Active Perception, and Action in Minecraft

Title Control of Memory, Active Perception, and Action in Minecraft
Authors Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, Honglak Lee
Abstract In this paper, we introduce a new set of reinforcement learning (RL) tasks in Minecraft (a flexible 3D world). We then use these tasks to systematically compare and contrast existing deep reinforcement learning (DRL) architectures with our new memory-based DRL architectures. These tasks are designed to emphasize, in a controllable manner, issues that pose challenges for RL methods including partial observability (due to first-person visual observations), delayed rewards, high-dimensional visual observations, and the need to use active perception in a correct manner so as to perform well in the tasks. While these tasks are conceptually simple to describe, by virtue of having all of these challenges simultaneously they are difficult for current DRL architectures. Additionally, we evaluate the generalization performance of the architectures on environments not used during training. The experimental results show that our new architectures generalize to unseen environments better than existing DRL architectures.
Tasks
Published 2016-05-30
URL http://arxiv.org/abs/1605.09128v1
PDF http://arxiv.org/pdf/1605.09128v1.pdf
PWC https://paperswithcode.com/paper/control-of-memory-active-perception-and
Repo
Framework

Fast Optical Flow using Dense Inverse Search

Title Fast Optical Flow using Dense Inverse Search
Authors Till Kroeger, Radu Timofte, Dengxin Dai, Luc Van Gool
Abstract Most recent works on optical flow extraction focus on accuracy and neglect time complexity. However, in real-life visual applications, such as tracking, activity detection and recognition, time complexity is critical. We propose a solution with very low time complexity and competitive accuracy for the computation of dense optical flow. It consists of three parts: 1) inverse search for patch correspondences; 2) dense displacement field creation through patch aggregation along multiple scales; 3) variational refinement. At the core of our Dense Inverse Search-based method (DIS) is the efficient search of correspondences inspired by the inverse compositional image alignment proposed by Baker and Matthews in 2001. DIS runs at 300Hz up to 600Hz on a single CPU core, reaching the temporal resolution of the human visual system, and is competitive on standard optical flow benchmarks with large displacements. It is orders of magnitude faster than state-of-the-art methods in the same range of accuracy, making DIS ideal for visual applications.
Tasks Action Detection, Activity Detection, Optical Flow Estimation
Published 2016-03-11
URL http://arxiv.org/abs/1603.03590v1
PDF http://arxiv.org/pdf/1603.03590v1.pdf
PWC https://paperswithcode.com/paper/fast-optical-flow-using-dense-inverse-search
Repo
Framework
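For intuition, the correspondence step can be contrasted with the brute-force alternative: the sketch below finds a patch's displacement by exhaustive SSD search, which is exactly the kind of loop that DIS's inverse-compositional search avoids. The patch size and search radius are arbitrary choices for the example.

```python
import numpy as np

def patch_ssd_search(I0, I1, y, x, size=8, radius=5):
    """Find the displacement (dy, dx) of the patch at (y, x) in I0 that best
    matches I1, by exhaustive sum-of-squared-differences search. DIS replaces
    this brute-force loop with fast inverse-compositional alignment."""
    ref = I0[y:y + size, x:x + size]
    best, best_d = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cand = I1[y + dy:y + dy + size, x + dx:x + dx + size]
            ssd = float(((cand - ref) ** 2).sum())
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d

# Synthetic pair: the second frame is the first shifted by (2, 3) pixels.
rng = np.random.default_rng(4)
I0 = rng.random((64, 64))
I1 = np.roll(np.roll(I0, 2, axis=0), 3, axis=1)
d = patch_ssd_search(I0, I1, y=20, x=20)
```

The brute-force search is O(radius^2) per patch; inverse search reaches a comparable correspondence with a handful of Gauss-Newton-style updates, which is where the 300-600Hz throughput comes from.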

Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies

Title Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies
Authors Kevin L. Keys, Gary K. Chen, Kenneth Lange
Abstract A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume that subjects are unrelated and collected at random and that trait values are normally distributed or transformed to normality. Over the past decade, researchers have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies presents unique computational challenges. Penalized regression with LASSO or MCP penalties is capable of selecting a handful of associated SNPs from millions of potential SNPs. Unfortunately, model selection can be corrupted by false positives and false negatives, obscuring the genetic underpinning of a trait. This paper introduces the iterative hard thresholding (IHT) algorithm to the GWAS analysis of continuous traits. Our parallel implementation of IHT accommodates SNP genotype compression and exploits multiple CPU cores and graphics processing units (GPUs). This allows statistical geneticists to leverage commodity desktop computers in GWAS analysis and to avoid supercomputing. We evaluate IHT performance on both simulated and real GWAS data and conclude that it reduces false positive and false negative rates while remaining competitive in computational time with penalized regression. Source code is freely available at https://github.com/klkeys/IHT.jl.
Tasks Model Selection
Published 2016-08-04
URL http://arxiv.org/abs/1608.01398v3
PDF http://arxiv.org/pdf/1608.01398v3.pdf
PWC https://paperswithcode.com/paper/iterative-hard-thresholding-for-model
Repo
Framework
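The core update here is standard IHT: a gradient step on the least-squares objective followed by projection onto s-sparse vectors. The sketch below recovers a sparse coefficient vector from random Gaussian measurements; the paper's actual contributions (genotype compression, multicore/GPU parallelism, the Julia implementation) are omitted, and the fixed step size is an illustrative choice.

```python
import numpy as np

def iht(A, y, s, step, iters=300):
    """Iterative hard thresholding: gradient step on ||y - Ax||^2, then keep
    only the s largest-magnitude coefficients (projection onto s-sparse set)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + step * (A.T @ (y - A @ x))    # gradient step
        keep = np.argsort(np.abs(x))[-s:]     # s largest-magnitude entries
        sparse = np.zeros_like(x)
        sparse[keep] = x[keep]                # hard threshold
        x = sparse
    return x

# Recover a 5-sparse signal from 100 random linear measurements (n = 200).
rng = np.random.default_rng(5)
m, n, s = 100, 200, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)      # column-normalized-ish design
x_true = np.zeros(n)
idx = rng.choice(n, size=s, replace=False)
x_true[idx] = rng.uniform(1.0, 2.0, size=s) * rng.choice([-1.0, 1.0], size=s)
y = A @ x_true
x_hat = iht(A, y, s, step=0.15)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The hard-threshold projection is what gives IHT its exact sparsity level s, in contrast to LASSO/MCP, where sparsity is controlled indirectly by the penalty weight.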

Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments

Title Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments
Authors Geetanjali Rakshit, Sagar Sontakke, Pushpak Bhattacharyya, Gholamreza Haffari
Abstract In this paper, we attempt to solve the problem of Prepositional Phrase (PP) attachments in English. The motivation for the work comes from NLP applications like Machine Translation, for which, getting the correct attachment of prepositions is very crucial. The idea is to correct the PP-attachments for a sentence with the help of alignments from parallel data in another language. The novelty of our work lies in the formulation of the problem into a dual decomposition based algorithm that enforces agreement between the parse trees from two languages as a constraint. Experiments were performed on the English-Hindi language pair and the performance improved by 10% over the baseline, where the baseline is the attachment predicted by the MSTParser model trained for English.
Tasks Machine Translation
Published 2016-03-29
URL http://arxiv.org/abs/1603.08594v1
PDF http://arxiv.org/pdf/1603.08594v1.pdf
PWC https://paperswithcode.com/paper/prepositional-attachment-disambiguation-using
Repo
Framework

Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework

Title Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework
Authors Chun-Guang Li, Chong You, René Vidal
Abstract Subspace clustering refers to the problem of segmenting data drawn from a union of subspaces. State-of-the-art approaches for solving this problem follow a two-stage approach. In the first step, an affinity matrix is learned from the data using sparse or low-rank minimization techniques. In the second step, the segmentation is found by applying spectral clustering to this affinity. While this approach has led to state-of-the-art results in many applications, it is sub-optimal because it does not exploit the fact that the affinity and the segmentation depend on each other. In this paper, we propose a joint optimization framework — Structured Sparse Subspace Clustering (S$^3$C) — for learning both the affinity and the segmentation. The proposed S$^3$C framework is based on expressing each data point as a structured sparse linear combination of all other data points, where the structure is induced by a norm that depends on the unknown segmentation. Moreover, we extend the proposed S$^3$C framework into Constrained Structured Sparse Subspace Clustering (CS$^3$C) in which available partial side-information is incorporated into the stage of learning the affinity. We show that both the structured sparse representation and the segmentation can be found via a combination of an alternating direction method of multipliers with spectral clustering. Experiments on a synthetic data set, the Extended Yale B data set, the Hopkins 155 motion segmentation database, and three cancer data sets demonstrate the effectiveness of our approach.
Tasks Motion Segmentation
Published 2016-10-17
URL http://arxiv.org/abs/1610.05211v2
PDF http://arxiv.org/pdf/1610.05211v2.pdf
PWC https://paperswithcode.com/paper/structured-sparse-subspace-clustering-a-joint
Repo
Framework
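The two-stage baseline that S$^3$C turns into a joint optimization can be sketched directly: write each point as a sparse combination of the other points (here a lasso solved by ISTA), symmetrize the coefficient magnitudes into an affinity, and segment. On the noiseless toy example below, connected components of the affinity suffice in place of spectral clustering; all parameters are illustrative, and the structured norm that couples affinity and segmentation in S$^3$C is not included.

```python
import numpy as np

def ista_lasso(D, x, lam=0.05, iters=500):
    """min_c 0.5*||x - D c||^2 + lam*||c||_1 via iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        g = c - step * (D.T @ (D @ c - x))                        # gradient step
        c = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrinkage
    return c

def ssc_affinity(X):
    """Express each column of X as a sparse combination of the other columns;
    symmetrize the coefficient magnitudes into an affinity matrix."""
    n = X.shape[1]
    C = np.zeros((n, n))
    for i in range(n):
        D = np.delete(X, i, axis=1)
        C[i, np.arange(n) != i] = ista_lasso(D, X[:, i])
    return np.abs(C) + np.abs(C).T

def connected_components(W, tol=1e-8):
    """Segment the affinity graph (toy stand-in for spectral clustering)."""
    n = W.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        while stack:
            j = stack.pop()
            if labels[j] < 0:
                labels[j] = cur
                stack.extend(np.nonzero(W[j] > tol)[0].tolist())
        cur += 1
    return labels

# Points on two orthogonal lines (two 1-D subspaces) in the plane.
t = np.array([1.0, -2.0, 3.0, 1.5])
X = np.hstack([np.vstack([t, np.zeros(4)]),   # four points on the x-axis
               np.vstack([np.zeros(4), t])])  # four points on the y-axis
labels = connected_components(ssc_affinity(X))
```

Because the two subspaces are orthogonal, the sparse self-expression never uses cross-subspace points, so the affinity is exactly block-diagonal and the segmentation is recovered; real data needs the spectral-clustering step and, in S$^3$C, the joint refinement.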