May 6, 2019

3344 words 16 mins read

Paper Group ANR 437

Learning Hough Regression Models via Bridge Partial Least Squares for Object Detection

Title Learning Hough Regression Models via Bridge Partial Least Squares for Object Detection
Authors Jianyu Tang, Hanzi Wang, Yan Yan
Abstract Popular Hough Transform-based object detection approaches usually construct an appearance codebook by clustering local image features. However, how to choose appropriate values for the parameters used in the clustering step remains an open problem. Moreover, some popular histogram features extracted from overlapping image blocks may cause a high degree of redundancy and multicollinearity. In this paper, we propose a novel Hough Transform-based object detection approach. First, to address the above issues, we exploit a Bridge Partial Least Squares (BPLS) technique to establish context-encoded Hough Regression Models (HRMs), which are linear regression models that cast probabilistic Hough votes to predict object locations. BPLS is an efficient variant of Partial Least Squares (PLS). PLS-based regression techniques (including BPLS) can reduce the redundancy and eliminate the multicollinearity of a feature set, and the appropriate value of the only parameter used in PLS (i.e., the number of latent components) can be determined by a cross-validation procedure. Second, to efficiently handle object scale changes, we propose a novel multi-scale voting scheme in which multiple Hough images corresponding to multiple object scales can be obtained simultaneously. Third, an object in a test image may correspond to multiple true and false positive hypotheses at different scales. Based on the proposed multi-scale voting scheme, a principled strategy is proposed to fuse hypotheses and reduce false positives by evaluating the normalized pointwise mutual information between hypotheses. In the experiments, we also compare the proposed HRM approach with several of its variants to evaluate the influence of its components on performance. Experimental results show that the proposed HRM approach achieves desirable performance on popular benchmark datasets.
Tasks Object Detection
Published 2016-03-26
URL http://arxiv.org/abs/1603.08092v1
PDF http://arxiv.org/pdf/1603.08092v1.pdf
PWC https://paperswithcode.com/paper/learning-hough-regression-models-via-bridge
Repo
Framework
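The BPLS variant itself is not spelled out here, but the PLS machinery it builds on is easy to sketch. Below is a minimal single-response PLS (NIPALS) regression in NumPy, shown on a deliberately collinear design where ordinary least squares is ill-posed; the number of latent components (the one PLS parameter the abstract mentions) would in practice be chosen by cross-validation. This is an illustrative sketch, not the authors' BPLS.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Minimal single-response PLS (NIPALS). Returns coefficients and means
    so that predictions are y_mean + (X_new - x_mean) @ B."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                     # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xk @ w                        # latent score
        tt = t @ t
        p = Xk.T @ t / tt                 # X loading
        qk = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)          # deflate: remove explained part
        yk = yk - qk * t
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original X space
    return B, x_mean, y_mean

def pls1_predict(X, B, x_mean, y_mean):
    return y_mean + (X - x_mean) @ B

# Collinear design: the 4th column duplicates the 1st, so OLS is ill-posed,
# but PLS with 3 latent components handles the redundancy.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X = np.hstack([X, X[:, :1]])
y = X @ np.array([1.0, 2.0, 3.0, 0.0])
B, xm, ym = pls1_fit(X, y, n_components=3)
err = np.max(np.abs(pls1_predict(X, B, xm, ym) - y))
```

With three latent components the redundant fourth column is absorbed into the latent scores and the training fit is essentially exact, which is the multicollinearity-handling property the abstract relies on.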

When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment

Title When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment
Authors Honglin Zheng, Tianlang Chen, Jiebo Luo
Abstract Sentiment analysis is crucial for extracting social signals from social media content. Due to the prevalence of images in social media, image sentiment analysis has received increasing attention in recent years. However, most existing systems are black boxes that do not provide insight into how image content invokes sentiment and emotion in viewers. Psychological studies have confirmed that salient objects in an image often invoke emotions. In this work, we investigate a more fine-grained and more comprehensive interaction between visual saliency and visual sentiment. In particular, we partition images along several primary scene-type dimensions: open-closed, natural-manmade, indoor-outdoor, and face-noface. Using state-of-the-art saliency detection and sentiment classification algorithms, we examine how the sentiment of the salient region(s) in an image relates to the overall sentiment of the image. Experiments on a representative image emotion dataset show interesting correlations between saliency and sentiment in different scene types, and in turn shed light on the mechanism of visual sentiment evocation.
Tasks Saliency Detection, Sentiment Analysis
Published 2016-11-14
URL http://arxiv.org/abs/1611.04636v1
PDF http://arxiv.org/pdf/1611.04636v1.pdf
PWC https://paperswithcode.com/paper/when-saliency-meets-sentiment-understanding
Repo
Framework

Learning Network of Multivariate Hawkes Processes: A Time Series Approach

Title Learning Network of Multivariate Hawkes Processes: A Time Series Approach
Authors Jalal Etesami, Negar Kiyavash, Kun Zhang, Kushagra Singhal
Abstract Learning the influence structure of multiple time series is of great interest to many disciplines. This paper studies the problem of recovering the causal structure of a network of multivariate linear Hawkes processes. In such processes, the occurrence of an event in one process affects the probability of occurrence of new events in other processes; thus, a natural notion of causality exists between the processes, captured by the support of the excitation matrix. We show that the resulting causal influence network is equivalent to the Directed Information Graph (DIG) of the processes, which encodes the causal factorization of their joint distribution. Furthermore, we present an algorithm for learning the support of the excitation matrix (or, equivalently, the DIG). The performance of the algorithm is evaluated on synthesized multivariate Hawkes networks as well as real-world stock market and MemeTracker datasets.
Tasks Time Series
Published 2016-03-14
URL http://arxiv.org/abs/1603.04319v1
PDF http://arxiv.org/pdf/1603.04319v1.pdf
PWC https://paperswithcode.com/paper/learning-network-of-multivariate-hawkes
Repo
Framework
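The paper's algorithm works through directed information; as a much simpler illustration of the underlying idea (the causal network is the support of the excitation matrix), the sketch below simulates a discrete-time approximation of a multivariate Hawkes process and recovers the support by thresholded least squares on exponentially decayed event histories. All constants (decay, threshold, horizon) are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T, beta = 3, 20000, 0.5                # processes, time steps, decay rate
mu = np.full(d, 0.05)                     # baseline intensities
A = np.array([[0.0, 0.3, 0.0],            # excitation matrix: entry (i, j) is
              [0.0, 0.0, 0.3],            # the influence of process j on i
              [0.3, 0.0, 0.0]])

# Simulate: each intensity is linear in an exponentially decayed event history.
h = np.zeros(d)
events, hist = np.zeros((T, d)), np.zeros((T, d))
for t in range(T):
    hist[t] = h
    lam = np.clip(mu + A @ h, 0.0, 1.0)   # conditional event probability
    events[t] = rng.random(d) < lam
    h = np.exp(-beta) * h + events[t]

# Estimate (mu, A) jointly by least squares of events on [1, history],
# then threshold to recover the support, i.e. the causal network.
X = np.hstack([np.ones((T, 1)), hist])
coef, *_ = np.linalg.lstsq(X, events, rcond=None)
A_hat = coef[1:].T                        # estimated excitation matrix
support = A_hat > 0.1
```

On this subcritical toy network the thresholded estimate recovers the directed edges; the paper's directed-information approach handles the continuous-time setting properly.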

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

Title Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
Authors Youbao Tang, Xiangqian Wu
Abstract This paper proposes a novel saliency detection method that combines region-level saliency estimation and pixel-level saliency prediction with CNNs (denoted CRPSD). For pixel-level saliency prediction, a fully convolutional neural network (the pixel-level CNN) is constructed by modifying the VGGNet architecture to perform multi-scale feature learning, based on which an image-to-image prediction is conducted to accomplish pixel-level saliency detection. For region-level saliency estimation, an adaptive superpixel-based region generation technique is first designed to partition an image into regions, after which the region-level saliency is estimated using a CNN model (the region-level CNN). The pixel-level and region-level saliencies are fused to form the final saliency map by another CNN (the fusion CNN), and the pixel-level CNN and fusion CNN are jointly learned. Extensive quantitative and qualitative experiments on four public benchmark datasets demonstrate that the proposed method greatly outperforms state-of-the-art saliency detection approaches.
Tasks Saliency Detection, Saliency Prediction
Published 2016-08-18
URL http://arxiv.org/abs/1608.05186v1
PDF http://arxiv.org/pdf/1608.05186v1.pdf
PWC https://paperswithcode.com/paper/saliency-detection-via-combining-region-level
Repo
Framework

Sensor-based Gait Parameter Extraction with Deep Convolutional Neural Networks

Title Sensor-based Gait Parameter Extraction with Deep Convolutional Neural Networks
Authors Julius Hannink, Thomas Kautz, Cristian F. Pasluosta, Karl-Günter Gaßmann, Jochen Klucken, Bjoern M. Eskofier
Abstract Measurement of stride-related, biomechanical parameters is the common rationale for objective gait impairment scoring. State-of-the-art double integration approaches to extract these parameters from inertial sensor data are, however, limited in their clinical applicability due to the underlying assumptions. To overcome this, we present a method to translate the abstract information provided by wearable sensors to context-related expert features based on deep convolutional neural networks. For mobile gait analysis, this enables integration-free and data-driven extraction of a set of 8 spatio-temporal stride parameters. To this end, two modelling approaches are compared: a combined network estimating all parameters of interest, and an ensemble approach that spawns less complex networks for each parameter individually. The ensemble approach outperforms the combined modelling in the current application. On a clinically relevant and publicly available benchmark dataset, we estimate stride length, width and medio-lateral change in foot angle up to ${-0.15\pm6.09}$ cm, ${-0.09\pm4.22}$ cm and ${0.13 \pm 3.78^\circ}$ respectively. Stride, swing and stance time as well as heel and toe contact times are estimated up to ${\pm 0.07}$, ${\pm0.05}$, ${\pm 0.07}$, ${\pm0.07}$ and ${\pm0.12}$ s respectively. These results are comparable to, and in parts outperform or define, the state of the art. Our results further indicate that the proposed change in methodology could substitute assumption-driven double-integration methods and enable mobile assessment of spatio-temporal stride parameters in clinically critical situations, e.g. in the case of spastic gait impairments.
Tasks
Published 2016-09-12
URL http://arxiv.org/abs/1609.03323v3
PDF http://arxiv.org/pdf/1609.03323v3.pdf
PWC https://paperswithcode.com/paper/sensor-based-gait-parameter-extraction-with
Repo
Framework

The Role of Context Selection in Object Detection

Title The Role of Context Selection in Object Detection
Authors Ruichi Yu, Xi Chen, Vlad I. Morariu, Larry S. Davis
Abstract We investigate the reasons why context in object detection has limited utility by isolating and evaluating the predictive power of different context cues under ideal conditions in which context is provided by an oracle. Based on this study, we propose a region-based context re-scoring method with dynamic context selection to remove noise and emphasize informative context. We introduce latent indicator variables to select (or ignore) potential contextual regions, and learn the selection strategy with a latent SVM. We conduct experiments to evaluate the performance of the proposed context selection method on the SUN RGB-D dataset. The method achieves a significant improvement in mean average precision (mAP) compared with both appearance-based detectors and a conventional context model without the selection scheme.
Tasks Object Detection
Published 2016-09-09
URL http://arxiv.org/abs/1609.02948v1
PDF http://arxiv.org/pdf/1609.02948v1.pdf
PWC https://paperswithcode.com/paper/the-role-of-context-selection-in-object
Repo
Framework
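The inference step of the proposed context selection can be caricatured in a few lines: given learned context weights, the latent indicators simply keep the regions whose contribution raises the detection score. The weights and features below are invented for illustration; learning them with a latent SVM is the part the paper actually addresses.

```python
import numpy as np

def rescore_detection(det_score, context_feats, w):
    """Re-score a detection using dynamically selected context regions.
    context_feats: (n_regions, n_feats); w: learned context weights.
    Latent indicators z_i in {0, 1} are set to maximize the re-scored value,
    i.e. a region is selected only if its contribution is positive."""
    contrib = context_feats @ w           # per-region context evidence
    z = contrib > 0                       # latent selection (test-time inference)
    return det_score + contrib[z].sum(), z

# Toy example: two informative regions, one noisy region that would hurt.
w = np.array([1.0, -0.5])
feats = np.array([[2.0, 1.0],            # contribution 1.5 -> selected
                  [0.5, 0.2],            # contribution 0.4 -> selected
                  [0.1, 3.0]])           # contribution -1.4 -> ignored
score, z = rescore_detection(0.8, feats, w)
```

Ignoring the noisy third region is exactly the "remove noise and emphasize informative context" behaviour the selection scheme is designed for.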

Characterizing the Language of Online Communities and its Relation to Community Reception

Title Characterizing the Language of Online Communities and its Relation to Community Reception
Authors Trang Tran, Mari Ostendorf
Abstract This work investigates the style and topic aspects of language in online communities, looking at both their utility as identifiers of the community and their correlation with community reception of content. Style is characterized using a hybrid word and part-of-speech tag n-gram language model, while topic is represented using Latent Dirichlet Allocation. Experiments with several Reddit forums show that style is a better indicator of community identity than topic, even for communities organized around specific topics. Further, there is a positive correlation between the community's reception of a contribution and the contribution's style similarity to that community, but not its topic similarity.
Tasks Language Modelling
Published 2016-09-15
URL http://arxiv.org/abs/1609.04779v1
PDF http://arxiv.org/pdf/1609.04779v1.pdf
PWC https://paperswithcode.com/paper/characterizing-the-language-of-online
Repo
Framework
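The style-as-identifier idea can be sketched with a plain word-bigram language model per community (the paper uses hybrid word/POS n-grams for style and LDA for topic): a new comment is assigned to the community whose model gives it the lowest cross-entropy. The toy corpora below are invented.

```python
import math
from collections import Counter

def bigram_lm(sentences):
    """Build add-one-smoothed bigram counts from whitespace-tokenized text."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks[:-1], toks[1:]))
    vocab_size = len(set(uni) | {"</s>"})
    return uni, bi, vocab_size

def cross_entropy(sentence, lm):
    """Per-token negative log-likelihood of the sentence under the model."""
    uni, bi, V = lm
    toks = ["<s>"] + sentence.split() + ["</s>"]
    logp = sum(math.log((bi[(a, b)] + 1) / (uni[a] + V))  # add-one smoothing
               for a, b in zip(toks[:-1], toks[1:]))
    return -logp / (len(toks) - 1)

# Two toy "communities" with distinct styles (invented examples).
corpora = {
    "formal": ["i would argue that this is correct",
               "one might consider the following evidence"],
    "casual": ["lol this is so true", "omg i love this so much"],
}
models = {c: bigram_lm(sents) for c, sents in corpora.items()}
best = min(models, key=lambda c: cross_entropy("lol i love this", models[c]))
```

The test comment shares bigrams with the casual corpus, so that community's model assigns it the lower cross-entropy.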

Image and Depth from a Single Defocused Image Using Coded Aperture Photography

Title Image and Depth from a Single Defocused Image Using Coded Aperture Photography
Authors Mina Masoudifar, Hamid Reza Pourreza
Abstract Depth from defocus and defocus deblurring from a single image are two challenging problems that arise from the finite depth of field in conventional cameras. Coded aperture imaging is one of the techniques used to improve the results of both problems. Up to now, different methods have been proposed for improving the results of either defocus deblurring or depth estimation. In this paper, a multi-objective function is proposed for evaluating and designing aperture patterns with the aim of improving the results of both depth from defocus and defocus deblurring. Pattern evaluation is performed by considering the scene illumination condition and the camera system specification. Based on the proposed criteria, a single asymmetric pattern is designed and used to restore a sharp image and a depth map from a single input. Since the designed pattern is asymmetric, defocused objects on the two sides of the focal plane can be distinguished. Depth estimation is performed by a new algorithm, which is based on image quality assessment criteria and can distinguish between blurred objects lying in front of or behind the focal plane. Extensive simulations as well as experiments on a variety of real scenes are conducted to compare our aperture with previously proposed ones.
Tasks Deblurring, Depth Estimation, Image Quality Assessment
Published 2016-03-13
URL http://arxiv.org/abs/1603.04046v1
PDF http://arxiv.org/pdf/1603.04046v1.pdf
PWC https://paperswithcode.com/paper/image-and-depth-from-a-single-defocused-image
Repo
Framework

On clustering network-valued data

Title On clustering network-valued data
Authors Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin
Abstract Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being able to cluster within a network is important, there are emerging needs to cluster multiple networks. This is largely motivated by the routine collection of network data that are generated from potentially different populations. These networks may or may not have node correspondence. When node correspondence is present, we cluster networks by summarizing each network by its graphon estimate; when node correspondence is absent, we propose a novel solution for clustering such networks by associating a computationally feasible feature vector to each network based on traces of powers of the adjacency matrix. We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency.
Tasks Community Detection
Published 2016-06-08
URL http://arxiv.org/abs/1606.02401v3
PDF http://arxiv.org/pdf/1606.02401v3.pdf
PWC https://paperswithcode.com/paper/on-clustering-network-valued-data
Repo
Framework
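The node-correspondence-free representation is straightforward to sketch: map each graph to a short vector of normalized traces of adjacency powers, then cluster those vectors with any standard method. In the toy example below, sparse and dense Erdős–Rényi graphs separate cleanly; the normalization and the tiny 2-means routine are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def trace_features(A, max_power=4):
    """Feature vector of normalized traces of adjacency powers; needs no node
    correspondence. The n**k normalization is an illustrative choice."""
    n = A.shape[0]
    feats, Ak = [], np.eye(n)
    for k in range(1, max_power + 1):
        Ak = Ak @ A
        feats.append(np.trace(Ak) / n ** k)
    return np.array(feats)

def two_means(X, iters=10):
    """Tiny 2-means with deterministic init at the smallest/largest-norm rows."""
    order = np.argsort(np.linalg.norm(X, axis=1))
    c = np.stack([X[order[0]], X[order[-1]]])
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - c[None]) ** 2).sum(-1), axis=1)
        c = np.stack([X[labels == j].mean(axis=0) for j in (0, 1)])
    return labels

def er_graph(rng, n, p):
    """Erdos-Renyi graph: undirected, no self-loops."""
    A = np.triu(rng.random((n, n)) < p, 1).astype(float)
    return A + A.T

rng = np.random.default_rng(2)
graphs = ([er_graph(rng, 40, 0.1) for _ in range(5)] +
          [er_graph(rng, 40, 0.5) for _ in range(5)])
X = np.stack([trace_features(A) for A in graphs])
labels = two_means(X)
```

The five sparse graphs and five dense graphs land in separate clusters because their trace features differ by construction (roughly powers of the edge density).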

Personalized Donor-Recipient Matching for Organ Transplantation

Title Personalized Donor-Recipient Matching for Organ Transplantation
Authors Jinsung Yoon, Ahmed M. Alaa, Martin Cadeiras, Mihaela van der Schaar
Abstract Organ transplants can improve the life expectancy and quality of life of the recipient but carry the risk of serious post-operative complications, such as septic shock and organ rejection. The probability of a successful transplant depends in a very subtle fashion on compatibility between the donor and the recipient, but current medical practice is short of domain knowledge regarding the complex nature of recipient-donor compatibility. Hence, a data-driven approach to learning compatibility has the potential for significant improvements in match quality. This paper proposes a novel system (ConfidentMatch) that is trained using data from electronic health records. ConfidentMatch predicts the success of an organ transplant (in terms of the 3-year survival rate) on the basis of clinical and demographic traits of the donor and recipient. ConfidentMatch captures the heterogeneity of the donor and recipient traits by optimally dividing the feature space into clusters and constructing a different optimal predictive model for each cluster. The system controls the complexity of the learned predictive model in a way that allows for more granular and confident predictions for a larger number of potential recipient-donor pairs, thereby ensuring that predictions are "personalized" and tailored to individual characteristics to the finest possible granularity. Experiments conducted on the UNOS heart transplant dataset show the superiority of the prognostic value of ConfidentMatch over competing benchmarks; ConfidentMatch can provide predictions of success with 95% confidence for 5,489 patients of a total population of 9,620 patients, which corresponds to 410 more patients than the most competitive benchmark algorithm (DeepBoost).
Tasks
Published 2016-11-12
URL http://arxiv.org/abs/1611.03934v1
PDF http://arxiv.org/pdf/1611.03934v1.pdf
PWC https://paperswithcode.com/paper/personalized-donor-recipient-matching-for
Repo
Framework
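The core modelling idea, partitioning the feature space and fitting a separate predictor per cluster, can be illustrated on synthetic data where a single global linear model is mis-specified but per-cluster models fit well. The data, the split point, and the linear models below are invented stand-ins for the system's learned clusters and predictors.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def mse_linear(X, y, coef):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ coef - y) ** 2))

# Heterogeneous toy cohort: the feature-outcome relationship flips sign at 0,
# so one global linear predictor is badly mis-specified.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=(200, 1))
y = np.abs(x[:, 0])

global_mse = mse_linear(x, y, fit_linear(x, y))

# Partition the feature space (a fixed split standing in for learned clusters)
# and fit one model per cluster.
left = x[:, 0] < 0
parts = [(x[left], y[left]), (x[~left], y[~left])]
cluster_mse = float(np.mean([mse_linear(Xc, yc, fit_linear(Xc, yc))
                             for Xc, yc in parts]))
```

The per-cluster models fit each regime almost exactly, while the global model cannot, which is the heterogeneity argument behind ConfidentMatch's design.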

Control of Memory, Active Perception, and Action in Minecraft

Title Control of Memory, Active Perception, and Action in Minecraft
Authors Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, Honglak Lee
Abstract In this paper, we introduce a new set of reinforcement learning (RL) tasks in Minecraft (a flexible 3D world). We then use these tasks to systematically compare and contrast existing deep reinforcement learning (DRL) architectures with our new memory-based DRL architectures. These tasks are designed to emphasize, in a controllable manner, issues that pose challenges for RL methods including partial observability (due to first-person visual observations), delayed rewards, high-dimensional visual observations, and the need to use active perception in a correct manner so as to perform well in the tasks. While these tasks are conceptually simple to describe, by virtue of having all of these challenges simultaneously they are difficult for current DRL architectures. Additionally, we evaluate the generalization performance of the architectures on environments not used during training. The experimental results show that our new architectures generalize to unseen environments better than existing DRL architectures.
Tasks
Published 2016-05-30
URL http://arxiv.org/abs/1605.09128v1
PDF http://arxiv.org/pdf/1605.09128v1.pdf
PWC https://paperswithcode.com/paper/control-of-memory-active-perception-and
Repo
Framework

Fast Optical Flow using Dense Inverse Search

Title Fast Optical Flow using Dense Inverse Search
Authors Till Kroeger, Radu Timofte, Dengxin Dai, Luc Van Gool
Abstract Most recent works on optical flow extraction focus on accuracy and neglect time complexity. However, in real-life visual applications, such as tracking, activity detection and recognition, time complexity is critical. We propose a solution with very low time complexity and competitive accuracy for the computation of dense optical flow. It consists of three parts: 1) inverse search for patch correspondences; 2) dense displacement field creation through patch aggregation along multiple scales; 3) variational refinement. At the core of our Dense Inverse Search-based method (DIS) is the efficient search of correspondences inspired by the inverse compositional image alignment proposed by Baker and Matthews in 2001. DIS runs at 300Hz up to 600Hz on a single CPU core, reaching the temporal resolution of the human visual system, and is competitive on standard optical flow benchmarks with large displacements. It is orders of magnitude faster than state-of-the-art methods in the same range of accuracy, making DIS ideal for visual applications.
Tasks Action Detection, Activity Detection, Optical Flow Estimation
Published 2016-03-11
URL http://arxiv.org/abs/1603.03590v1
PDF http://arxiv.org/pdf/1603.03590v1.pdf
PWC https://paperswithcode.com/paper/fast-optical-flow-using-dense-inverse-search
Repo
Framework
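For intuition, the correspondence step can be contrasted with the brute-force alternative: the sketch below finds a patch's displacement by exhaustive SSD search, which is exactly the kind of loop that DIS's inverse-compositional search avoids. The patch size and search radius are arbitrary choices for the example.

```python
import numpy as np

def patch_ssd_search(I0, I1, y, x, size=8, radius=5):
    """Find the displacement (dy, dx) of the patch at (y, x) in I0 that best
    matches I1, by exhaustive sum-of-squared-differences search. DIS replaces
    this brute-force loop with fast inverse-compositional alignment."""
    ref = I0[y:y + size, x:x + size]
    best, best_d = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cand = I1[y + dy:y + dy + size, x + dx:x + dx + size]
            ssd = float(((cand - ref) ** 2).sum())
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d

# Synthetic pair: the second frame is the first shifted by (2, 3) pixels.
rng = np.random.default_rng(4)
I0 = rng.random((64, 64))
I1 = np.roll(np.roll(I0, 2, axis=0), 3, axis=1)
d = patch_ssd_search(I0, I1, y=20, x=20)
```

The brute-force search is O(radius^2) per patch; inverse search reaches a comparable correspondence with a handful of Gauss-Newton-style updates, which is where the 300-600Hz throughput comes from.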

Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies

Title Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies
Authors Kevin L. Keys, Gary K. Chen, Kenneth Lange
Abstract A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume that subjects are unrelated and collected at random and that trait values are normally distributed or transformed to normality. Over the past decade, researchers have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies presents unique computational challenges. Penalized regression with LASSO or MCP penalties is capable of selecting a handful of associated SNPs from millions of potential SNPs. Unfortunately, model selection can be corrupted by false positives and false negatives, obscuring the genetic underpinning of a trait. This paper introduces the iterative hard thresholding (IHT) algorithm to the GWAS analysis of continuous traits. Our parallel implementation of IHT accommodates SNP genotype compression and exploits multiple CPU cores and graphics processing units (GPUs). This allows statistical geneticists to leverage commodity desktop computers in GWAS analysis and to avoid supercomputing. We evaluate IHT performance on both simulated and real GWAS data and conclude that it reduces false positive and false negative rates while remaining competitive in computational time with penalized regression. Source code is freely available at https://github.com/klkeys/IHT.jl.
Tasks Model Selection
Published 2016-08-04
URL http://arxiv.org/abs/1608.01398v3
PDF http://arxiv.org/pdf/1608.01398v3.pdf
PWC https://paperswithcode.com/paper/iterative-hard-thresholding-for-model
Repo
Framework
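The core update here is standard IHT: a gradient step on the least-squares objective followed by projection onto s-sparse vectors. The sketch below recovers a sparse coefficient vector from random Gaussian measurements; the paper's actual contributions (genotype compression, multicore/GPU parallelism, the Julia implementation) are omitted, and the fixed step size is an illustrative choice.

```python
import numpy as np

def iht(A, y, s, step, iters=300):
    """Iterative hard thresholding: gradient step on ||y - Ax||^2, then keep
    only the s largest-magnitude coefficients (projection onto s-sparse set)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + step * (A.T @ (y - A @ x))    # gradient step
        keep = np.argsort(np.abs(x))[-s:]     # s largest-magnitude entries
        sparse = np.zeros_like(x)
        sparse[keep] = x[keep]                # hard threshold
        x = sparse
    return x

# Recover a 5-sparse signal from 100 random linear measurements (n = 200).
rng = np.random.default_rng(5)
m, n, s = 100, 200, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)      # column-normalized-ish design
x_true = np.zeros(n)
idx = rng.choice(n, size=s, replace=False)
x_true[idx] = rng.uniform(1.0, 2.0, size=s) * rng.choice([-1.0, 1.0], size=s)
y = A @ x_true
x_hat = iht(A, y, s, step=0.15)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

The hard-threshold projection is what gives IHT its exact sparsity level s, in contrast to LASSO/MCP, where sparsity is controlled indirectly by the penalty weight.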

Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments

Title Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments
Authors Geetanjali Rakshit, Sagar Sontakke, Pushpak Bhattacharyya, Gholamreza Haffari
Abstract In this paper, we attempt to solve the problem of Prepositional Phrase (PP) attachments in English. The motivation for the work comes from NLP applications like Machine Translation, for which, getting the correct attachment of prepositions is very crucial. The idea is to correct the PP-attachments for a sentence with the help of alignments from parallel data in another language. The novelty of our work lies in the formulation of the problem into a dual decomposition based algorithm that enforces agreement between the parse trees from two languages as a constraint. Experiments were performed on the English-Hindi language pair and the performance improved by 10% over the baseline, where the baseline is the attachment predicted by the MSTParser model trained for English.
Tasks Machine Translation
Published 2016-03-29
URL http://arxiv.org/abs/1603.08594v1
PDF http://arxiv.org/pdf/1603.08594v1.pdf
PWC https://paperswithcode.com/paper/prepositional-attachment-disambiguation-using
Repo
Framework

Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework

Title Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework
Authors Chun-Guang Li, Chong You, René Vidal
Abstract Subspace clustering refers to the problem of segmenting data drawn from a union of subspaces. State-of-the-art approaches for solving this problem follow a two-stage approach. In the first step, an affinity matrix is learned from the data using sparse or low-rank minimization techniques. In the second step, the segmentation is found by applying spectral clustering to this affinity. While this approach has led to state-of-the-art results in many applications, it is sub-optimal because it does not exploit the fact that the affinity and the segmentation depend on each other. In this paper, we propose a joint optimization framework — Structured Sparse Subspace Clustering (S$^3$C) — for learning both the affinity and the segmentation. The proposed S$^3$C framework is based on expressing each data point as a structured sparse linear combination of all other data points, where the structure is induced by a norm that depends on the unknown segmentation. Moreover, we extend the proposed S$^3$C framework into Constrained Structured Sparse Subspace Clustering (CS$^3$C) in which available partial side-information is incorporated into the stage of learning the affinity. We show that both the structured sparse representation and the segmentation can be found via a combination of an alternating direction method of multipliers with spectral clustering. Experiments on a synthetic data set, the Extended Yale B data set, the Hopkins 155 motion segmentation database, and three cancer data sets demonstrate the effectiveness of our approach.
Tasks Motion Segmentation
Published 2016-10-17
URL http://arxiv.org/abs/1610.05211v2
PDF http://arxiv.org/pdf/1610.05211v2.pdf
PWC https://paperswithcode.com/paper/structured-sparse-subspace-clustering-a-joint
Repo
Framework
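The two-stage baseline that S$^3$C turns into a joint optimization can be sketched directly: write each point as a sparse combination of the other points (here a lasso solved by ISTA), symmetrize the coefficient magnitudes into an affinity, and segment. On the noiseless toy example below, connected components of the affinity suffice in place of spectral clustering; all parameters are illustrative, and the structured norm that couples affinity and segmentation in S$^3$C is not included.

```python
import numpy as np

def ista_lasso(D, x, lam=0.05, iters=500):
    """min_c 0.5*||x - D c||^2 + lam*||c||_1 via iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        g = c - step * (D.T @ (D @ c - x))                        # gradient step
        c = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrinkage
    return c

def ssc_affinity(X):
    """Express each column of X as a sparse combination of the other columns;
    symmetrize the coefficient magnitudes into an affinity matrix."""
    n = X.shape[1]
    C = np.zeros((n, n))
    for i in range(n):
        D = np.delete(X, i, axis=1)
        C[i, np.arange(n) != i] = ista_lasso(D, X[:, i])
    return np.abs(C) + np.abs(C).T

def connected_components(W, tol=1e-8):
    """Segment the affinity graph (toy stand-in for spectral clustering)."""
    n = W.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        while stack:
            j = stack.pop()
            if labels[j] < 0:
                labels[j] = cur
                stack.extend(np.nonzero(W[j] > tol)[0].tolist())
        cur += 1
    return labels

# Points on two orthogonal lines (two 1-D subspaces) in the plane.
t = np.array([1.0, -2.0, 3.0, 1.5])
X = np.hstack([np.vstack([t, np.zeros(4)]),   # four points on the x-axis
               np.vstack([np.zeros(4), t])])  # four points on the y-axis
labels = connected_components(ssc_affinity(X))
```

Because the two subspaces are orthogonal, the sparse self-expression never uses cross-subspace points, so the affinity is exactly block-diagonal and the segmentation is recovered; real data needs the spectral-clustering step and, in S$^3$C, the joint refinement.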