July 26, 2019

Paper Group ANR 767

HANDY: A Hybrid Association Rules Mining Approach for Network Layer Discovery of Services for Mobile Ad hoc Network

Title HANDY: A Hybrid Association Rules Mining Approach for Network Layer Discovery of Services for Mobile Ad hoc Network
Authors Noman Islam, Zubair A. Shaikh, Aqeel-ur-Rehman, Muhammad Shahab Siddiqui
Abstract Mobile Ad hoc Network (MANET) is an infrastructure-less network formed between a set of mobile nodes. The discovery of services in a MANET is a challenging task due to the unique properties of the network. In this paper, a novel service discovery framework called Hybrid Association Rules Based Network Layer Discovery of Services for Ad hoc Networks (HANDY) is proposed. HANDY provides three major research contributions. First, it adopts a cross-layer optimized design for the discovery of services, based on the simultaneous discovery of services and their corresponding routes. Second, it provides a multi-level ontology-based approach to describe services, which resolves the issue of semantic interoperability among service consumers in a scalable fashion. Finally, to further optimize the performance of the discovery process, HANDY exploits the inherent associations present among the services. These associations are used in two ways: periodic service advertisements are performed based on them, and when a response to a service discovery request is generated, correlated services are attached to the response. The proposed service discovery scheme has been implemented in the JiST/SWANS simulator. The results demonstrate that the proposed modifications improve both the hit ratio of service consumers and the latency of the discovery process.
Tasks
Published 2017-10-03
URL http://arxiv.org/abs/1710.02035v1
PDF http://arxiv.org/pdf/1710.02035v1.pdf
PWC https://paperswithcode.com/paper/handy-a-hybrid-association-rules-mining
Repo
Framework
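
To make the association-rules idea concrete, here is a hedged, pure-Python sketch of mining pairwise service associations from discovery history, which could then drive both the periodic bundled advertisements and the piggybacked responses the abstract describes. Function names, thresholds, and the session format are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import combinations

def mine_service_associations(sessions, min_support=0.2, min_confidence=0.6):
    """Mine pairwise rules (A -> B) from per-node discovery sessions.

    sessions: list of sets of services discovered together, e.g.
              [{"print", "scan"}, {"print", "scan", "gps"}, ...]
    Returns {(antecedent, consequent): confidence}. Thresholds are
    illustrative; the paper's hybrid scheme is more elaborate.
    """
    n = len(sessions)
    item_counts, pair_counts = Counter(), Counter()
    for s in sessions:
        item_counts.update(s)
        pair_counts.update(combinations(sorted(s), 2))

    rules = {}
    for (a, b), c in pair_counts.items():
        if c / n < min_support:       # prune infrequent pairs
            continue
        for ante, cons in ((a, b), (b, a)):
            conf = c / item_counts[ante]
            if conf >= min_confidence:
                rules[(ante, cons)] = conf
    return rules
```

A node answering a discovery request for "print" could then attach every service B with a rule ("print", B) to the same response packet, saving a later round trip.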

Deep Bilateral Learning for Real-Time Image Enhancement

Title Deep Bilateral Learning for Real-Time Image Enhancement
Authors Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Samuel W. Hasinoff, Frédo Durand
Abstract Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
Tasks Image Enhancement
Published 2017-07-10
URL http://arxiv.org/abs/1707.02880v2
PDF http://arxiv.org/pdf/1707.02880v2.pdf
PWC https://paperswithcode.com/paper/deep-bilateral-learning-for-real-time-image
Repo
Framework
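
The runtime path of the method is compact enough to sketch. The NumPy snippet below applies a grid of per-pixel affine color transforms to a full-resolution image; for brevity it slices the grid with nearest-neighbor lookup rather than the paper's edge-preserving trilinear slicing node, and the luma weights are a standard assumption rather than the paper's learned guidance map.

```python
import numpy as np

def slice_and_apply(image, grid):
    """Apply per-pixel affine color transforms stored in a bilateral grid.

    image: float32 HxWx3 in [0, 1].
    grid:  Gy x Gx x Gz x 3 x 4 affine coefficients (the network's output);
           the z axis is indexed by pixel luma, as in a bilateral grid.
    """
    h, w, _ = image.shape
    gy, gx, gz = grid.shape[:3]
    ys = np.arange(h) * gy // h                   # nearest grid cell per row
    xs = np.arange(w) * gx // w                   # nearest grid cell per col
    luma = image @ np.array([0.299, 0.587, 0.114])
    zs = np.clip((luma * gz).astype(int), 0, gz - 1)

    A = grid[ys[:, None], xs[None, :], zs]        # H x W x 3 x 4
    hom = np.concatenate([image, np.ones((h, w, 1))], axis=-1)
    out = np.einsum('hwij,hwj->hwi', A, hom)      # affine transform per pixel
    return np.clip(out, 0.0, 1.0)
```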

Extensions of Morse-Smale Regression with Application to Actuarial Science

Title Extensions of Morse-Smale Regression with Application to Actuarial Science
Authors Colleen M. Farrelly
Abstract The problem of subgroups is ubiquitous in scientific research (e.g., disease heterogeneity, spatial distributions in ecology), and piecewise regression is one way to deal with this phenomenon. Morse-Smale regression offers a way to partition the regression function based on level sets of a defined function and that function’s basins of attraction. This topologically-based piecewise regression algorithm has shown promise in its initial applications, but the current implementation in the literature has been limited to elastic net and generalized linear regression. Nonparametric methods, such as random forests or conditional inference trees, may provide better prediction and insight by modeling interaction terms and other nonlinear relationships between predictors and a given outcome. This study explores the use of several machine learning algorithms within a Morse-Smale piecewise regression framework, including boosted regression with linear base learners, homotopy-based LASSO, conditional inference trees, random forests, and a wide neural network framework called extreme learning machines. Simulations on Tweedie regression problems with varying Tweedie parameter and dispersion suggest that many machine learning approaches to Morse-Smale piecewise regression improve the original algorithm’s performance, particularly for outcomes with lower dispersion and linear or mixed linear/nonlinear predictor relationships. On a real actuarial problem, several of these new algorithms perform as well as or better than the original Morse-Smale regression algorithm, and most provide information on the nature of predictor relationships within each partition, offering insight into differences between dataset partitions.
Tasks
Published 2017-08-17
URL http://arxiv.org/abs/1708.05712v1
PDF http://arxiv.org/pdf/1708.05712v1.pdf
PWC https://paperswithcode.com/paper/extensions-of-morse-smale-regression-with
Repo
Framework
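
A hedged scikit-learn sketch of the piecewise idea: KMeans stands in for the Morse-Smale partition (the real algorithm partitions by basins of attraction of an estimated function), and a random forest, one of the learners the study evaluates, is fit per piece.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor

class PiecewiseForest:
    """Piecewise regression in the spirit of Morse-Smale regression.

    KMeans on the predictors is only a crude stand-in for the topological
    partition, used here so the per-piece modeling idea runs end to end.
    """
    def __init__(self, n_pieces=4):
        self.partition = KMeans(n_clusters=n_pieces, n_init=10)
        self.models = {}

    def fit(self, X, y):
        labels = self.partition.fit_predict(X)
        for k in np.unique(labels):
            m = RandomForestRegressor(n_estimators=200)
            m.fit(X[labels == k], y[labels == k])   # one learner per piece
            self.models[k] = m
        return self

    def predict(self, X):
        labels = self.partition.predict(X)
        out = np.empty(len(X))
        for k, m in self.models.items():
            mask = labels == k
            if mask.any():
                out[mask] = m.predict(X[mask])
        return out
```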

A Multi-view RGB-D Approach for Human Pose Estimation in Operating Rooms

Title A Multi-view RGB-D Approach for Human Pose Estimation in Operating Rooms
Authors Abdolrahim Kadkhodamohammadi, Afshin Gangi, Michel de Mathelin, Nicolas Padoy
Abstract Many approaches have been proposed for human pose estimation in single and multi-view RGB images. However, some environments, such as the operating room, are still very challenging for state-of-the-art RGB methods. In this paper, we propose an approach for multi-view 3D human pose estimation from RGB-D images and demonstrate the benefits of using the additional depth channel for pose refinement beyond its use for the generation of improved features. The proposed method permits the joint detection and estimation of the poses without knowing a priori the number of persons present in the scene. We evaluate this approach on a novel multi-view RGB-D dataset acquired during live surgeries and annotated with ground truth 3D poses.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2017-01-25
URL http://arxiv.org/abs/1701.07372v1
PDF http://arxiv.org/pdf/1701.07372v1.pdf
PWC https://paperswithcode.com/paper/a-multi-view-rgb-d-approach-for-human-pose
Repo
Framework
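
The single-view part of the pipeline, lifting 2D keypoints to 3D with the depth channel, is standard pinhole geometry. A minimal sketch, assuming a depth map registered to the color image; the multi-view fusion and pose refinement the paper contributes are not shown.

```python
import numpy as np

def backproject_keypoints(kps_2d, depth, K):
    """Lift 2D keypoints to camera-frame 3D points using the depth channel.

    kps_2d: N x 2 array of (u, v) pixel coordinates from a 2D pose detector.
    depth:  H x W depth map in meters, registered to the color image.
    K:      3 x 3 camera intrinsics matrix.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    u, v = kps_2d[:, 0], kps_2d[:, 1]
    z = depth[v.astype(int), u.astype(int)]   # depth at each keypoint
    x = (u - cx) * z / fx                     # inverse pinhole projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```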

Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models

Title Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models
Authors Moontae Lee, David Bindel, David Mimno
Abstract Spectral topic modeling algorithms operate on matrices/tensors of word co-occurrence statistics to learn topic-specific word distributions. This approach removes the dependence on the original documents and produces substantial gains in efficiency and provable topic inference, but at a cost: the model can no longer provide information about the topic composition of individual documents. Recently, the Thresholded Linear Inverse (TLI) method was proposed to map the observed words of each document back to its topic composition. However, its linear characteristics limit inference quality, as it does not take important prior information over topics into account. In this paper, we evaluate the Simple Probabilistic Inverse (SPI) method and the novel Prior-aware Dual Decomposition (PADD), which is capable of learning document-specific topic compositions in parallel. Experiments show that PADD successfully leverages topic correlations as a prior, notably outperforming TLI and learning topic compositions of quality comparable to Gibbs sampling on various datasets.
Tasks Topic Models
Published 2017-11-19
URL http://arxiv.org/abs/1711.07065v1
PDF http://arxiv.org/pdf/1711.07065v1.pdf
PWC https://paperswithcode.com/paper/prior-aware-dual-decomposition-document
Repo
Framework
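
For context, document-specific topic inference can be phrased as likelihood maximization over the topic simplex. The sketch below is a plain projected-gradient baseline, shown only to make the problem concrete; it is not PADD, which additionally applies dual decomposition and a learned topic-correlation prior. Step size and iteration count are illustrative.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u > css / np.arange(1, len(v) + 1))[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def infer_topics(word_counts, B, steps=200, lr=0.1):
    """Infer one document's topic composition given topic-word matrix B.

    word_counts: length-V vector of word counts for the document.
    B: K x V matrix of topic-word distributions (rows sum to 1).
    """
    K = B.shape[0]
    theta = np.full(K, 1.0 / K)
    for _ in range(steps):
        p = theta @ B + 1e-12            # predicted word distribution
        grad = B @ (word_counts / p)     # gradient of the log-likelihood
        theta = project_simplex(theta + lr * grad / word_counts.sum())
    return theta
```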

Geometric Cross-Modal Comparison of Heterogeneous Sensor Data

Title Geometric Cross-Modal Comparison of Heterogeneous Sensor Data
Authors Christopher J. Tralie, Abraham Smith, Nathan Borggren, Jay Hineman, Paul Bendich, Peter Zulch, John Harer
Abstract In this work, we address the problem of cross-modal comparison of aerial data streams. A variety of simulated automobile trajectories are sensed using two different modalities: full-motion video, and radio-frequency (RF) signals received by detectors at various locations. The information represented by the two modalities is compared using self-similarity matrices (SSMs) corresponding to time-ordered point clouds in feature spaces of each of these data sources; we note that these feature spaces can be of entirely different scale and dimensionality. Several metrics for comparing SSMs are explored, including a cutting-edge time-warping technique that can simultaneously handle local time warping and partial matches, while also controlling for the change in geometry between feature spaces of the two modalities. We note that this technique is quite general, and does not depend on the choice of modalities. In this particular setting, we demonstrate that the cross-modal distance between SSMs corresponding to the same trajectory type is smaller than the cross-modal distance between SSMs corresponding to distinct trajectory types, and we formalize this observation via precision-recall metrics in experiments. Finally, we comment on promising implications of these ideas for future integration into multiple-hypothesis tracking systems.
Tasks
Published 2017-11-23
URL http://arxiv.org/abs/1711.08569v1
PDF http://arxiv.org/pdf/1711.08569v1.pdf
PWC https://paperswithcode.com/paper/geometric-cross-modal-comparison-of
Repo
Framework
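
The SSM machinery is compact enough to sketch. Below, each modality's time-ordered features become a self-similarity matrix; after resampling to a common size and normalizing scale, the matrices are compared entrywise. Plain Frobenius distance stands in for the paper's partial time-warping metric, and the resampling scheme is an illustrative simplification.

```python
import numpy as np
from scipy.spatial.distance import cdist

def ssm(features):
    """Self-similarity matrix of a time-ordered point cloud (T x d)."""
    return cdist(features, features)

def resample_ssm(D, n):
    """Nearest-neighbor resample to n x n, so SSMs built from streams with
    different sampling rates become directly comparable."""
    idx = np.arange(n) * len(D) // n
    return D[np.ix_(idx, idx)]

def ssm_distance(feats_a, feats_b, n=128):
    """Cross-modal comparison: each SSM is blind to the scale/dimension of
    its own feature space, so normalize and compare entrywise."""
    A = resample_ssm(ssm(feats_a), n)
    B = resample_ssm(ssm(feats_b), n)
    return np.linalg.norm(A / A.max() - B / B.max())
```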

Weaving Multi-scale Context for Single Shot Detector

Title Weaving Multi-scale Context for Single Shot Detector
Authors Yunpeng Chen, Jianshu Li, Bin Zhou, Jiashi Feng, Shuicheng Yan
Abstract Aggregating context information from multiple scales has proven effective for improving the accuracy of Single Shot Detectors (SSDs) on object detection. However, existing multi-scale context fusion techniques are computationally expensive, which unfavorably diminishes the advantageous speed of SSD. In this work, we propose a novel network topology, called WeaveNet, that can efficiently fuse multi-scale information and boost detection accuracy with negligible extra cost. The proposed WeaveNet iteratively weaves context information from adjacent scales together to enable more sophisticated context reasoning while maintaining fast speed. Built by stacking lightweight blocks, WeaveNet is easy to train without requiring batch normalization and can be further accelerated by our proposed architecture simplification. Experimental results on the PASCAL VOC 2007 and PASCAL VOC 2012 benchmarks show a significant performance boost brought by WeaveNet. For a 320x320 input with batch size 8, WeaveNet reaches 79.5% mAP on the PASCAL VOC 2007 test set at 101 fps with only 4 fps extra cost, and further improves to 79.7% mAP with more iterations.
Tasks Object Detection
Published 2017-12-08
URL http://arxiv.org/abs/1712.03149v1
PDF http://arxiv.org/pdf/1712.03149v1.pdf
PWC https://paperswithcode.com/paper/weaving-multi-scale-context-for-single-shot
Repo
Framework
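
A hedged PyTorch sketch of one weaving step: each scale receives a thin slice of channels from its finer and coarser neighbors, resized to match, and fuses them with a 1x1 convolution. Channel counts and the fusion choice are illustrative guesses, not WeaveNet's exact topology.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeaveBlock(nn.Module):
    """One weaving step over a list of multi-scale feature maps."""

    def __init__(self, channels=64, passed=16):
        super().__init__()
        self.passed = passed  # channels passed between adjacent scales
        self.fuse = nn.Conv2d(channels + 2 * passed, channels, kernel_size=1)

    def forward(self, scales):
        out = []
        for i, x in enumerate(scales):
            h, w = x.shape[-2:]
            finer = scales[i - 1] if i > 0 else x
            coarser = scales[i + 1] if i < len(scales) - 1 else x
            # Resize neighbor slices to this scale's resolution.
            down = F.adaptive_avg_pool2d(finer[:, :self.passed], (h, w))
            up = F.interpolate(coarser[:, :self.passed], size=(h, w),
                               mode='bilinear', align_corners=False)
            out.append(F.relu(self.fuse(torch.cat([x, down, up], dim=1))))
        return out
```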

Level Playing Field for Million Scale Face Recognition

Title Level Playing Field for Million Scale Face Recognition
Authors Aaron Nech, Ira Kemelmacher-Shlizerman
Abstract Face recognition is perceived as a solved problem; however, when tested at the million scale, it exhibits dramatic variation in accuracy across different algorithms. Are the algorithms very different? Is access to good/big training data their secret weapon? Where should face recognition improve? To address these questions, we created a benchmark, MF2, that requires all algorithms to be trained on the same data and tested at the million scale. MF2 is a public large-scale set with 672K identities and 4.7M photos, created with the goal of leveling the playing field for large-scale face recognition. We contrast our results with findings from the other two large-scale benchmarks, MegaFace Challenge and MS-Celeb-1M, where groups were allowed to train on any private/public/big/small set. Some key discoveries: 1) algorithms trained on MF2 were able to achieve state of the art and results comparable to algorithms trained on massive private sets; 2) some outperformed themselves once trained on MF2; 3) invariance to aging suffers from low accuracy as in MegaFace, identifying the need for larger age variations, possibly within identities, or adjustment of algorithms in future testing.
Tasks Face Recognition
Published 2017-05-01
URL http://arxiv.org/abs/1705.00393v1
PDF http://arxiv.org/pdf/1705.00393v1.pdf
PWC https://paperswithcode.com/paper/level-playing-field-for-million-scale-face
Repo
Framework

Multi-Oriented Text Detection and Verification in Video Frames and Scene Images

Title Multi-Oriented Text Detection and Verification in Video Frames and Scene Images
Authors Aneeshan Sain, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal
Abstract In this paper, we bring forth a novel approach to video text detection using Fourier-Laplacian filtering in the frequency domain that includes a verification technique using a Hidden Markov Model (HMM). The proposed approach deals with text regions appearing not only in horizontal or vertical directions, but also in any other oblique or curved orientation in the image. Until now, only a few methods have looked into curved text detection in video frames, wherein lies our novelty. In our approach, we first apply the Fourier-Laplacian transform on the image, followed by ideal Laplacian-Gaussian filtering. Thereafter, K-means clustering is employed to obtain the asserted text areas depending on a maximum difference map. Next, the obtained connected components (CC) are skeletonized to distinguish various text strings. Complex components are disintegrated into simpler ones according to a junction removal algorithm, followed by a concatenation performed on possible combinations of the disjoint skeletons to obtain the corresponding text areas. Finally, these text hypotheses are verified using an HMM-based text/non-text classification system, eliminating false positives and giving robust text detection performance. We have tested our framework on multi-oriented text lines in four scripts, namely English, Chinese, Devanagari and Bengali, in video frames and scene texts. The results show that the proposed approach surpasses existing methods in text detection.
Tasks Curved Text Detection, Text Classification
Published 2017-07-22
URL http://arxiv.org/abs/1707.07150v2
PDF http://arxiv.org/pdf/1707.07150v2.pdf
PWC https://paperswithcode.com/paper/multi-oriented-text-detection-and
Repo
Framework
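
The first stage, frequency-domain Laplacian-Gaussian filtering, can be sketched directly with NumPy FFTs; `sigma` is an illustrative choice, and K-means clustering on the resulting response map would follow, as described above.

```python
import numpy as np

def fourier_laplacian_of_gaussian(gray, sigma=20.0):
    """Frequency-domain Laplacian-of-Gaussian filtering of a grayscale image.

    Text pixels respond strongly because of their high local contrast.
    """
    h, w = gray.shape
    fu = np.fft.fftfreq(h)[:, None]       # vertical frequencies (cycles/px)
    fv = np.fft.fftfreq(w)[None, :]       # horizontal frequencies
    r2 = fu ** 2 + fv ** 2
    # LoG transfer function: Laplacian (-4*pi^2*r^2) times Gaussian envelope.
    H = -4.0 * np.pi ** 2 * r2 * np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * r2)
    response = np.fft.ifft2(np.fft.fft2(gray) * H).real
    return np.abs(response)
```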

Image Processing Operations Identification via Convolutional Neural Network

Title Image Processing Operations Identification via Convolutional Neural Network
Authors Bolin Chen, Haodong Li, Weiqi Luo
Abstract In recent years, image forensics has attracted more and more attention, and many forensic methods have been proposed for identifying image processing operations. Up to now, most existing methods are based on hand-crafted features and consider just one specific operation. In many forensic scenarios, however, multi-class classification of various image processing operations is more practical. Besides, it is difficult to obtain effective features by hand for some image processing operations. In this paper, we therefore propose a new convolutional neural network (CNN) based method to adaptively learn discriminative features for identifying typical image processing operations. We carefully design the high-pass filter bank used to extract the image residuals of the input image, the channel expansion layer that mixes up the resulting residuals, the pooling layers, and the activation functions employed in our method. Extensive results show that the proposed method outperforms the currently best method based on hand-crafted features and three related CNN-based methods for image steganalysis and/or forensics, achieving state-of-the-art results. Furthermore, we provide supplementary results showing the rationality and robustness of the proposed model.
Tasks
Published 2017-09-09
URL http://arxiv.org/abs/1709.02908v1
PDF http://arxiv.org/pdf/1709.02908v1.pdf
PWC https://paperswithcode.com/paper/image-processing-operations-identification
Repo
Framework
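
To illustrate the residual idea: forensic and steganalysis CNNs typically start from high-pass residuals rather than raw pixels, so that operation fingerprints are not drowned out by image content. The paper designs its own filter bank; the fixed 5x5 "KV" kernel below is a classic stand-in from the steganalysis literature, not the paper's exact bank.

```python
import numpy as np
from scipy.signal import convolve2d

# The 5x5 "KV" high-pass kernel widely used in steganalysis/forensics.
KV = np.array([[-1,  2,  -2,  2, -1],
               [ 2, -6,   8, -6,  2],
               [-2,  8, -12,  8, -2],
               [ 2, -6,   8, -6,  2],
               [-1,  2,  -2,  2, -1]], dtype=np.float32) / 12.0

def residual(gray):
    """Image residual that would be fed to the CNN instead of raw pixels."""
    return convolve2d(gray.astype(np.float32), KV,
                      mode='same', boundary='symm')
```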

Learning Wasserstein Embeddings

Title Learning Wasserstein Embeddings
Authors Nicolas Courty, Rémi Flamary, Mélanie Ducoffe
Abstract The Wasserstein distance has recently received a lot of attention in the machine learning community, especially for its principled way of comparing distributions. It has found numerous applications in several hard problems, such as domain adaptation, dimensionality reduction, and generative models. However, its use is still limited by a heavy computational cost. Our goal is to alleviate this problem by providing an approximation mechanism that breaks its inherent complexity. It relies on the search for an embedding where the Euclidean distance mimics the Wasserstein distance. We show that such an embedding can be found with a siamese architecture paired with a decoder network that allows moving from the embedding space back to the original input space. Once this embedding has been found, optimization problems in the Wasserstein space (e.g., barycenters, principal directions, or even archetypes) can be solved extremely fast. Numerical experiments supporting this idea are conducted on image datasets and show the wide potential benefits of our method.
Tasks Dimensionality Reduction, Domain Adaptation
Published 2017-10-20
URL http://arxiv.org/abs/1710.07457v1
PDF http://arxiv.org/pdf/1710.07457v1.pdf
PWC https://paperswithcode.com/paper/learning-wasserstein-embeddings
Repo
Framework
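
The core training setup is a siamese regression onto precomputed Wasserstein distances. A hedged PyTorch sketch follows; layer sizes are illustrative, and the paper's decoder back to input space is omitted.

```python
import torch
import torch.nn as nn

class WassersteinEmbedder(nn.Module):
    """Siamese embedding whose Euclidean distance is trained to mimic a
    precomputed Wasserstein distance between input histograms."""

    def __init__(self, in_dim, emb_dim=50):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim))

    def forward(self, x1, x2):
        # Euclidean distance in embedding space approximates W(x1, x2).
        return torch.norm(self.phi(x1) - self.phi(x2), dim=1)

def train_step(model, opt, x1, x2, w_dist):
    """x1, x2: batches of histograms; w_dist: their true Wasserstein
    distances, computed once offline (e.g., with an OT solver)."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x1, x2), w_dist)
    loss.backward()
    opt.step()
    return loss.item()
```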

Efficient and Adaptive Linear Regression in Semi-Supervised Settings

Title Efficient and Adaptive Linear Regression in Semi-Supervised Settings
Authors Abhishek Chakrabortty, Tianxi Cai
Abstract We consider the linear regression problem under semi-supervised settings, wherein the available data typically consist of: (i) a small or moderately sized ‘labeled’ dataset, and (ii) a much larger ‘unlabeled’ dataset. Such data arise naturally from settings where the outcome, unlike the covariates, is expensive to obtain, a frequent scenario in modern studies involving large databases like electronic medical records (EMR). Supervised estimators like the ordinary least squares (OLS) estimator utilize only the labeled data. It is often of interest to investigate if and when the unlabeled data can be exploited to improve estimation of the regression parameter in the adopted linear model. In this paper, we propose a class of ‘Efficient and Adaptive Semi-Supervised Estimators’ (EASE) to improve estimation efficiency. The EASE are two-step estimators adaptive to model mis-specification, leading to improved (in some cases, optimal) efficiency under model mis-specification, and equal (optimal) efficiency under a linear model. This adaptive property, often unaddressed in the existing literature, is crucial for advocating ‘safe’ use of the unlabeled data. The construction of EASE primarily involves a flexible ‘semi-non-parametric’ imputation, including a smoothing step that works well even when the number of covariates is not small, and a follow-up ‘refitting’ step along with a cross-validation (CV) strategy, both of which have useful practical as well as theoretical implications for addressing two important issues: under-smoothing and over-fitting. We establish asymptotic results including consistency, asymptotic normality, and the adaptive properties of EASE. We also provide influence function expansions and a ‘double’ CV strategy for inference. The results are further validated through extensive simulations, followed by application to an EMR study on auto-immunity.
Tasks Imputation
Published 2017-01-17
URL http://arxiv.org/abs/1701.04889v2
PDF http://arxiv.org/pdf/1701.04889v2.pdf
PWC https://paperswithcode.com/paper/efficient-and-adaptive-linear-regression-in
Repo
Framework
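
A stripped-down scikit-learn sketch of the two-step recipe: a flexible imputation model fit on the labeled data, an OLS fit on imputed outcomes over all covariates, and a refit on labeled residuals that keeps the estimator honest when the imputation model is wrong. The kernel choice is an illustrative assumption, and the paper's cross-validated smoothing and double-CV inference are omitted.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression

def ease_sketch(X_lab, y_lab, X_unlab):
    """Two-step semi-supervised linear regression in the spirit of EASE."""
    # Step 1: flexible ("semi-non-parametric") imputation on labeled data.
    imputer = KernelRidge(kernel='rbf', alpha=1.0).fit(X_lab, y_lab)
    X_all = np.vstack([X_lab, X_unlab])
    step1 = LinearRegression().fit(X_all, imputer.predict(X_all))
    # Step 2: refit on labeled residuals to correct imputation bias.
    resid = y_lab - imputer.predict(X_lab)
    step2 = LinearRegression().fit(X_lab, resid)
    beta = step1.coef_ + step2.coef_
    intercept = step1.intercept_ + step2.intercept_
    return beta, intercept
```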

Sampling and Reconstruction of Graph Signals via Weak Submodularity and Semidefinite Relaxation

Title Sampling and Reconstruction of Graph Signals via Weak Submodularity and Semidefinite Relaxation
Authors Abolfazl Hashemi, Rasoul Shafipour, Haris Vikalo, Gonzalo Mateos
Abstract We study the problem of sampling a bandlimited graph signal in the presence of noise, where the objective is to select a node subset of prescribed cardinality that minimizes the signal reconstruction mean squared error (MSE). To that end, we formulate the task at hand as the minimization of MSE subject to binary constraints, and approximate the resulting NP-hard problem via semidefinite programming (SDP) relaxation. Moreover, we provide an alternative formulation based on maximizing a monotone weak submodular function and propose a randomized-greedy algorithm to find a sub-optimal subset. We then derive a worst-case performance guarantee on the MSE returned by the randomized greedy algorithm for general non-stationary graph signals. The efficacy of the proposed methods is illustrated through numerical simulations on synthetic and real-world graphs. Notably, the randomized greedy algorithm yields an order-of-magnitude speedup over state-of-the-art greedy sampling schemes, while incurring only a marginal MSE performance loss.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1711.00142v1
PDF http://arxiv.org/pdf/1711.00142v1.pdf
PWC https://paperswithcode.com/paper/sampling-and-reconstruction-of-graph-signals
Repo
Framework
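
The greedy selection is easy to sketch. Given a basis for the bandlimited space, each step adds the node that most reduces an A-optimality proxy for the reconstruction MSE; the paper's randomized-greedy variant scans a random candidate subset per step instead of all nodes, which is where the speedup comes from. The regularization constant is an illustrative choice.

```python
import numpy as np

def greedy_sample(V, m, eps=1e-6):
    """Greedy node selection for sampling a bandlimited graph signal.

    V: N x k matrix whose columns span the bandlimited space (e.g., the
       first k Laplacian eigenvectors). Returns m selected node indices.
    """
    n, k = V.shape
    S = []
    for _ in range(m):
        best, best_val = None, np.inf
        for i in set(range(n)) - set(S):
            Vs = V[S + [i]]
            # A-optimality proxy; eps regularizes early singular steps.
            val = np.trace(np.linalg.inv(Vs.T @ Vs + eps * np.eye(k)))
            if val < best_val:
                best, best_val = i, val
        S.append(best)
    return S
```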

Matrix completion with queries

Title Matrix completion with queries
Authors Natali Ruchansky, Mark Crovella, Evimaria Terzi
Abstract In many applications, e.g., recommender systems and traffic monitoring, the data come in the form of a matrix that is only partially observed and low rank. A fundamental data-analysis task for these datasets is matrix completion, where the goal is to accurately infer the entries missing from the matrix. Even when the data satisfy the low-rank assumption, classical matrix-completion methods may output completions with significant error, in that the reconstructed matrix differs significantly from the true underlying matrix. Often, this is because the information contained in the observed entries is insufficient. In this work, we address this problem by proposing an active version of matrix completion, where queries can be made to the true underlying matrix. Subsequently, we design Order&Extend, the first algorithm to unify a matrix-completion approach and a querying strategy into a single algorithm. Order&Extend is able to identify and alleviate insufficient information by judiciously querying a small number of additional entries. In an extensive experimental evaluation on real-world datasets, we demonstrate that our algorithm is efficient and is able to accurately reconstruct the true matrix while asking only a small number of queries.
Tasks Matrix Completion, Recommendation Systems
Published 2017-05-01
URL http://arxiv.org/abs/1705.00399v1
PDF http://arxiv.org/pdf/1705.00399v1.pdf
PWC https://paperswithcode.com/paper/matrix-completion-with-queries
Repo
Framework
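
A toy active-completion loop, to make the query idea concrete: complete the matrix, query the oracle at the entries whose completions look least stable, and repeat. The instability heuristic below (disagreement between two ranks) is a crude stand-in, not Order&Extend, which reasons explicitly about ill-conditioned local systems.

```python
import numpy as np

def soft_impute(M, mask, rank, iters=100):
    """Fill missing entries by iterated truncated SVD (SoftImpute-style)."""
    X = np.where(mask, M, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, low)   # keep observed entries fixed
    return X

def active_complete(M_true, mask, rank, budget):
    """M_true plays the queryable oracle; mask marks observed entries."""
    mask = mask.copy()
    for _ in range(budget):
        a = soft_impute(M_true, mask, rank)
        b = soft_impute(M_true, mask, rank + 1)
        gap = np.where(mask, -np.inf, np.abs(a - b))
        i, j = np.unravel_index(np.argmax(gap), gap.shape)
        mask[i, j] = True            # query the least stable entry
    return soft_impute(M_true, mask, rank)
```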

Hierarchical Label Inference for Video Classification

Title Hierarchical Label Inference for Video Classification
Authors Nelson Nauata, Jonathan Smith, Greg Mori
Abstract Videos are a rich source of high-dimensional structured data, with a wide range of interacting components at varying levels of granularity. In order to improve understanding of unconstrained internet videos, it is important to consider the role of labels at separate levels of abstraction. In this paper, we consider the use of the Bidirectional Inference Neural Network (BINN) for performing graph-based inference in label space for the task of video classification. We take advantage of the inherent hierarchy between labels at increasing granularity. The BINN is evaluated on the first and second release of the YouTube-8M large scale multilabel video dataset. Our results demonstrate the effectiveness of BINN, achieving significant improvements against baseline models.
Tasks Video Classification
Published 2017-06-15
URL http://arxiv.org/abs/1706.05028v2
PDF http://arxiv.org/pdf/1706.05028v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-label-inference-for-video
Repo
Framework
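
For intuition, one bidirectional pass over a two-level hierarchy can be written in a few lines: fine-label scores aggregate upward into coarse scores, which then flow back down as context for the fine labels. The BINN learns its propagation weights end to end; the fixed averaging below is only an illustration of inference in label space, with `alpha` as an assumed mixing weight.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bidirectional_inference(fine_logits, H, alpha=0.5):
    """One up-down pass over a two-level label hierarchy.

    fine_logits: raw per-video scores for fine-grained labels (length F).
    H: C x F binary matrix, H[c, f] = 1 if fine label f belongs to
       coarse label c.
    """
    fine = sigmoid(fine_logits)
    coarse = (H @ fine) / np.maximum(H.sum(axis=1), 1)  # upward: aggregate
    support = H.T @ coarse                              # downward: context
    refined = (1 - alpha) * fine + alpha * fine * support
    return refined, coarse
```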