January 26, 2020

3378 words 16 mins read

Paper Group ANR 1494

A Neural-based Program Decompiler

Title A Neural-based Program Decompiler
Authors Cheng Fu, Huili Chen, Haolan Liu, Xinyun Chen, Yuandong Tian, Farinaz Koushanfar, Jishen Zhao
Abstract Reverse engineering of binary executables is a critical problem in the computer security domain. On the one hand, malicious parties may recover interpretable source code from software products to gain commercial advantages. On the other hand, binary decompilation can be leveraged for code vulnerability analysis and malware detection. However, efficient binary decompilation is challenging. Conventional decompilers have the following major limitations: (i) they are only applicable to a specific source-target language pair, and hence incur undesired development cost for new language tasks; (ii) their output high-level code cannot effectively preserve the correct functionality of the input binary; (iii) their output program does not capture the semantics of the input, and the reversed program is hard to interpret. To address these problems, we propose Coda, the first end-to-end neural-based framework for code decompilation. Coda decomposes the decompilation task into two key phases: first, Coda employs an instruction type-aware encoder and a tree decoder to generate an abstract syntax tree (AST) with attention feeding during the code sketch generation stage; second, Coda updates the code sketch using an iterative error correction machine guided by an ensembled neural error predictor. By first finding a good approximate candidate and then fixing it toward a perfect reconstruction, Coda achieves superior performance compared to baseline approaches. We assess Coda’s performance with extensive experiments on various benchmarks. Evaluation results show that Coda achieves an average of 82% program recovery accuracy on unseen binary samples, where state-of-the-art decompilers yield 0% accuracy. Furthermore, Coda outperforms the sequence-to-sequence model with attention by a margin of 70% program accuracy. (An illustrative code sketch follows this entry.)
Tasks Malware Detection
Published 2019-06-28
URL https://arxiv.org/abs/1906.12029v1
PDF https://arxiv.org/pdf/1906.12029v1.pdf
PWC https://paperswithcode.com/paper/a-neural-based-program-decompiler
Repo
Framework
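
The abstract above hinges on an instruction type-aware encoder whose states are later attended to by a tree decoder and refined by an error-correction stage. Below is a minimal, hypothetical PyTorch sketch of the encoding idea only: opcodes and operands are embedded with separate tables and fed to a bidirectional LSTM. All class names, vocabulary sizes, and dimensions are illustrative assumptions, not the authors' implementation; the tree decoder and the ensembled error predictor are omitted.

```python
import torch
import torch.nn as nn

class InstructionTypeAwareEncoder(nn.Module):
    """Sketch: embed opcodes and operands with separate tables, then encode
    the instruction sequence with a bidirectional LSTM. (Coda additionally
    uses a tree decoder with attention and an error-correction machine.)"""
    def __init__(self, n_opcodes=256, n_operands=1024, d_emb=64, d_hid=128):
        super().__init__()
        self.opcode_emb = nn.Embedding(n_opcodes, d_emb)
        self.operand_emb = nn.Embedding(n_operands, d_emb)
        self.rnn = nn.LSTM(2 * d_emb, d_hid, batch_first=True, bidirectional=True)

    def forward(self, opcodes, operands):
        # opcodes, operands: (batch, seq_len) integer tensors
        x = torch.cat([self.opcode_emb(opcodes), self.operand_emb(operands)], dim=-1)
        memory, _ = self.rnn(x)   # (batch, seq_len, 2 * d_hid), attended to downstream
        return memory

# Toy usage: a batch of 2 "programs", each with 5 instructions.
enc = InstructionTypeAwareEncoder()
ops = torch.randint(0, 256, (2, 5))
args = torch.randint(0, 1024, (2, 5))
print(enc(ops, args).shape)  # torch.Size([2, 5, 256])
```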

The Curious Case of Machine Learning In Malware Detection

Title The Curious Case of Machine Learning In Malware Detection
Authors Sherif Saad, William Briguglio, Haytham Elmiligi
Abstract In this paper, we argue that machine learning techniques are not ready for malware detection in the wild. Given the current trend in malware development and the increase of unconventional malware attacks, we expect that dynamic malware analysis is the future for anti-malware detection and prevention systems. A comprehensive review of machine learning for malware detection is presented. Then, we discuss how malware detection in the wild presents unique challenges for current state-of-the-art machine learning techniques. We define three critical problems that limit the success of malware detectors powered by machine learning in the wild. Next, we discuss possible solutions to these challenges and present the requirements of next-generation malware detection. Finally, we outline potential research directions in machine learning for malware detection.
Tasks Malware Detection
Published 2019-05-18
URL https://arxiv.org/abs/1905.07573v1
PDF https://arxiv.org/pdf/1905.07573v1.pdf
PWC https://paperswithcode.com/paper/the-curious-case-of-machine-learning-in
Repo
Framework

Compositional Hierarchical Tensor Factorization: Representing Hierarchical Intrinsic and Extrinsic Causal Factors

Title Compositional Hierarchical Tensor Factorization: Representing Hierarchical Intrinsic and Extrinsic Causal Factors
Authors M. Alex O. Vasilescu, Eric Kim
Abstract Visual objects are composed of a recursive hierarchy of perceptual wholes and parts, whose properties, such as shape, reflectance, and color, constitute a hierarchy of intrinsic causal factors of object appearance. However, object appearance is the compositional consequence of both an object’s intrinsic and extrinsic causal factors, where the extrinsic causal factors are related to illumination and imaging conditions. Therefore, this paper proposes a unified tensor model of wholes and parts, and introduces a compositional hierarchical tensor factorization that disentangles the hierarchical causal structure of object image formation and subsumes multilinear block tensor decomposition as a special case. The resulting object representation is an interpretable combinatorial choice of wholes’ and parts’ representations that renders object recognition robust to occlusion and reduces training data requirements. We demonstrate our approach in the context of face recognition by training on an extremely reduced dataset of synthetic images, and report encouraging face verification results on two datasets - the Freiburg dataset and the Labeled Faces in the Wild (LFW) dataset, which consists of real-world images - thus substantiating the suitability of our approach for data-starved domains. (An illustrative code sketch follows this entry.)
Tasks Face Recognition, Object Recognition
Published 2019-11-11
URL https://arxiv.org/abs/1911.04180v2
PDF https://arxiv.org/pdf/1911.04180v2.pdf
PWC https://paperswithcode.com/paper/compositional-hierarchical-tensor
Repo
Framework
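
For background only: the paper's factorization subsumes multilinear block tensor decomposition, whose basic building block is the higher-order SVD (Tucker/HOSVD). The NumPy sketch below implements a plain truncated HOSVD on a synthetic tensor; it is not the compositional hierarchical factorization itself, and the ranks and tensor shape are arbitrary assumptions.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: one factor matrix per mode from the SVD of
    that mode's unfolding, then a core tensor by projecting T onto them."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Synthetic "people x views x pixels" tensor.
T = np.random.rand(10, 5, 64)
core, factors = hosvd(T, ranks=(4, 3, 16))
print(core.shape)  # (4, 3, 16)
```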

FrameRank: A Text Processing Approach to Video Summarization

Title FrameRank: A Text Processing Approach to Video Summarization
Authors Zhuo Lei, Chao Zhang, Qian Zhang, Guoping Qiu
Abstract Video summarization has been extensively studied in the past decades. However, user-generated video summarization is much less explored, since there is a lack of large-scale video datasets in which human-generated video summaries are unambiguously defined and annotated. Toward this end, we propose a user-generated video summarization dataset - UGSum52 - that consists of 52 videos (207 minutes). In constructing the dataset, because of the subjectivity of user-generated video summarization, we manually annotate 25 summaries for each video, for a total of 1,300 summaries. To the best of our knowledge, it is currently the largest dataset for user-generated video summarization. Based on this dataset, we present FrameRank, an unsupervised video summarization method that employs a frame-to-frame affinity graph to identify coherent and informative frames to summarize a video. We use a Kullback-Leibler (KL) divergence-based graph to rank temporal segments according to the amount of semantic information contained in their frames. We illustrate the effectiveness of our method by applying it to three datasets (SumMe, TVSum and UGSum52) and show that it achieves state-of-the-art results. (An illustrative code sketch follows this entry.)
Tasks Unsupervised Video Summarization, Video Summarization
Published 2019-04-11
URL http://arxiv.org/abs/1904.05544v2
PDF http://arxiv.org/pdf/1904.05544v2.pdf
PWC https://paperswithcode.com/paper/framerank-a-text-processing-approach-to-video
Repo
Framework
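
The sketch below is a simplified, hypothetical take on the ranking idea described above: frames are represented by (synthetic) feature histograms, a symmetric KL-divergence-based affinity graph is built, and frames are ranked with a PageRank-style random walk. The feature extraction is stubbed out with random histograms, and all parameters are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = p + eps, q + eps
    return np.sum(p * np.log(p / q))

def frame_affinity(hists):
    """Symmetric KL-based affinity between frame histograms (higher = more similar)."""
    n = len(hists)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i, j] = np.exp(-0.5 * (kl(hists[i], hists[j]) + kl(hists[j], hists[i])))
    return W

def rank_frames(W, d=0.85, iters=100):
    """PageRank-style ranking over the affinity graph."""
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)     # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (P.T @ r)
    return r

# Toy example: 20 frames, 32-bin normalized histograms standing in for frame features.
hists = np.random.dirichlet(np.ones(32), size=20)
scores = rank_frames(frame_affinity(hists))
summary = np.argsort(scores)[::-1][:5]       # keep the 5 highest-ranked frames
print(summary)
```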

EV-IMO: Motion Segmentation Dataset and Learning Pipeline for Event Cameras

Title EV-IMO: Motion Segmentation Dataset and Learning Pipeline for Event Cameras
Authors Anton Mitrokhin, Chengxi Ye, Cornelia Fermuller, Yiannis Aloimonos, Tobi Delbruck
Abstract We present the first event-based learning approach for motion segmentation in indoor scenes and the first event-based dataset - EV-IMO - which includes accurate pixel-wise motion masks, egomotion and ground-truth depth. Our approach is based on an efficient implementation of the SfM learning pipeline using a low-parameter neural network architecture on event data. In addition to camera egomotion and a dense depth map, the network estimates pixel-wise independently moving object segmentation and computes per-object 3D translational velocities for moving objects. We also train a shallow network with just 40k parameters, which is able to compute depth and egomotion. Our EV-IMO dataset features 32 minutes of indoor recording with up to 3 fast-moving objects simultaneously in the camera field of view. The objects and the camera are tracked by a VICON motion capture system. By 3D scanning the room and the objects, accurate depth-map ground truth and pixel-wise object masks are obtained, which are reliable even in poor lighting conditions and during fast motion. We then train and evaluate our learning pipeline on EV-IMO and demonstrate that our approach far surpasses its rivals and is well suited for scene-constrained robotics applications.
Tasks Motion Capture, Motion Segmentation, Semantic Segmentation
Published 2019-03-18
URL https://arxiv.org/abs/1903.07520v2
PDF https://arxiv.org/pdf/1903.07520v2.pdf
PWC https://paperswithcode.com/paper/ev-imo-motion-segmentation-dataset-and
Repo
Framework

Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction

Title Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction
Authors Nikolay Savinov, Lubor Ladicky, Christian Haene, Marc Pollefeys
Abstract Dense semantic 3D reconstruction is typically formulated as a discrete or continuous problem over label assignments in a voxel grid, combining semantic and depth likelihoods in a Markov Random Field framework. The depth and semantic information is incorporated as a unary potential, smoothed by a pairwise regularizer. However, modelling likelihoods as a unary potential does not capture the problem correctly, leading to various undesirable visibility artifacts. We propose to formulate an optimization problem that directly minimizes the reprojection error of the 3D model with respect to the image estimates, which corresponds to an optimization over rays, where the cost function depends on the semantic class and depth of the first occupied voxel along the ray. The 2-label formulation is made feasible by transforming it into a graph-representable form under a QPBO relaxation, solvable using graph cuts. The multi-label problem is solved by applying alpha-expansion using the same relaxation in each expansion move. Our method is shown to be feasible in practice, running comparably fast to competing methods while not suffering from ray-potential approximation artifacts.
Tasks 3D Reconstruction
Published 2019-06-25
URL https://arxiv.org/abs/1906.10491v1
PDF https://arxiv.org/pdf/1906.10491v1.pdf
PWC https://paperswithcode.com/paper/discrete-optimization-of-ray-potentials-for-1
Repo
Framework

Rate of convergence for geometric inference based on the empirical Christoffel function

Title Rate of convergence for geometric inference based on the empirical Christoffel function
Authors Mai Trang Vu, François Bachoc, Edouard Pauwels
Abstract We consider the problem of estimating the support of a measure from a finite, independent sample. The estimators considered are constructed from the empirical Christoffel function. Such estimators have been proposed for the problem of set estimation with heuristic justifications. We carry out a detailed finite-sample analysis that allows us to select the threshold and degree parameters as a function of the sample size. We provide a convergence rate analysis of the resulting support estimation procedure. Our analysis establishes that we may obtain finite-sample bounds which are close to the minimax optimal rates. Our results rely on concentration inequalities for the empirical Christoffel function and on estimates of the supremum of the Christoffel-Darboux kernel on sets with smooth boundaries, which can be considered of independent interest. (An illustrative code sketch follows this entry.)
Tasks
Published 2019-10-31
URL https://arxiv.org/abs/1910.14458v1
PDF https://arxiv.org/pdf/1910.14458v1.pdf
PWC https://paperswithcode.com/paper/rate-of-convergence-for-geometric-inference
Repo
Framework
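
For readers unfamiliar with the empirical Christoffel function, here is a minimal 2-D sketch of the underlying estimator: build the empirical moment matrix of monomials up to degree d, invert it, and score points by the Christoffel function Lambda(x) = 1 / (v_d(x)^T M^{-1} v_d(x)); large values indicate the support. The degree, regularization, and sampling below are arbitrary illustrative choices; the paper's contribution is precisely how to set the threshold and degree as a function of the sample size.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomials(X, d):
    """Evaluate all 2-D monomials of total degree <= d at the rows of X."""
    feats = [np.ones(len(X))]
    for deg in range(1, d + 1):
        for combo in combinations_with_replacement(range(2), deg):
            feats.append(np.prod(X[:, combo], axis=1))
    return np.stack(feats, axis=1)               # (n, s_d)

def christoffel_support(X, d=4, reg=1e-8):
    """Return a score function x -> Lambda_d(x); large values indicate support."""
    V = monomials(X, d)
    M = V.T @ V / len(X)                          # empirical moment matrix
    Minv = np.linalg.inv(M + reg * np.eye(M.shape[0]))
    def score(x):
        v = monomials(np.atleast_2d(x), d)[0]
        return 1.0 / (v @ Minv @ v)
    return score

# Sample uniformly from the unit disk and score an inside vs. an outside point.
rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 2))
pts = pts / np.linalg.norm(pts, axis=1, keepdims=True) * np.sqrt(rng.uniform(size=(2000, 1)))
score = christoffel_support(pts)
print(score([0.1, 0.1]), score([2.0, 2.0]))       # inside score >> outside score
```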

Semi-interactive Attention Network for Answer Understanding in Reverse-QA

Title Semi-interactive Attention Network for Answer Understanding in Reverse-QA
Authors Qing Yin, Guan Luo, Xiaodong Zhu, Qinghua Hu, Ou Wu
Abstract Question answering (QA) is an important natural language processing (NLP) task and has received much attention in academic research and industry communities. Existing QA studies assume that questions are raised by humans and answers are generated by machines. Nevertheless, in many real applications, machines are also required to determine human needs or perceive human states. In such scenarios, machines may proactively raise questions and humans supply the answers. Subsequently, machines should attempt to understand the true meaning of these answers. This new QA approach is called reverse-QA (rQA) throughout this paper. In this work, the human answer understanding problem is investigated and solved by classifying the answers into predefined answer-label categories (e.g., True, False, Uncertain). To explore the relationships between questions and answers, we use the interactive attention network (IAN) model and propose an improved structure called the semi-interactive attention network (Semi-IAN). Two Chinese datasets for rQA are compiled. We evaluate several conventional text classification models for comparison, and experimental results indicate the promising performance of our proposed models. (An illustrative code sketch follows this entry.)
Tasks Question Answering
Published 2019-01-12
URL http://arxiv.org/abs/1901.03788v1
PDF http://arxiv.org/pdf/1901.03788v1.pdf
PWC https://paperswithcode.com/paper/semi-interactive-attention-network-for-answer
Repo
Framework
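
The entry builds on the interactive attention network (IAN), in which the question and the answer each attend to a pooled representation of the other. Below is a compact, hypothetical PyTorch sketch of that interactive-attention core, not the Semi-IAN architecture itself; the vocabulary size, hidden size, and three-way answer-label head are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractiveAttention(nn.Module):
    """Question tokens attend to the pooled answer, answer tokens attend to the
    pooled question; the two attended vectors are concatenated and classified
    into answer labels (e.g., True / False / Uncertain)."""
    def __init__(self, vocab=5000, d=128, n_labels=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.q_rnn = nn.GRU(d, d, batch_first=True)
        self.a_rnn = nn.GRU(d, d, batch_first=True)
        self.cls = nn.Linear(2 * d, n_labels)

    @staticmethod
    def attend(seq, query):
        # seq: (B, T, d), query: (B, d) -> attention-weighted sum of seq, (B, d)
        w = F.softmax(torch.bmm(seq, query.unsqueeze(-1)).squeeze(-1), dim=-1)
        return torch.bmm(w.unsqueeze(1), seq).squeeze(1)

    def forward(self, q_ids, a_ids):
        q, _ = self.q_rnn(self.emb(q_ids))        # (B, Tq, d)
        a, _ = self.a_rnn(self.emb(a_ids))        # (B, Ta, d)
        q_vec = self.attend(q, a.mean(dim=1))     # question attends to pooled answer
        a_vec = self.attend(a, q.mean(dim=1))     # answer attends to pooled question
        return self.cls(torch.cat([q_vec, a_vec], dim=-1))

model = InteractiveAttention()
logits = model(torch.randint(0, 5000, (4, 12)), torch.randint(0, 5000, (4, 6)))
print(logits.shape)                               # torch.Size([4, 3])
```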

Simple and Lightweight Human Pose Estimation

Title Simple and Lightweight Human Pose Estimation
Authors Zhe Zhang, Jie Tang, Gangshan Wu
Abstract Recent research on human pose estimation has achieved significant improvements. However, most existing methods tend to pursue higher scores using complex architectures or computationally expensive models on benchmark datasets, ignoring the deployment costs in practice. In this paper, we investigate the problem of simple and lightweight human pose estimation. We first redesign a lightweight bottleneck block with two non-novel concepts: depthwise convolution and an attention mechanism. Then, based on this lightweight block, we present a Lightweight Pose Network (LPN) following the architecture design principles of SimpleBaseline. The model size (#Params) of our small network LPN-50 is only 9% of SimpleBaseline (ResNet-50), and the computational complexity (FLOPs) is only 11%. To fully exploit the potential of our LPN and obtain more accurate predictions, we also propose an iterative training strategy and a model-agnostic post-processing function, Beta-Soft-Argmax. We empirically demonstrate the effectiveness and efficiency of our methods on the COCO keypoint detection benchmark. Besides, we show the speed advantage of our lightweight network at inference time on a non-GPU platform. Specifically, our LPN-50 achieves 68.7 AP on the COCO test-dev set with only 2.7M parameters and 1.0 GFLOPs, while the inference speed is 17 FPS on an Intel i7-8700K CPU. (An illustrative code sketch follows this entry.)
Tasks Keypoint Detection, Pose Estimation
Published 2019-11-23
URL https://arxiv.org/abs/1911.10346v2
PDF https://arxiv.org/pdf/1911.10346v2.pdf
PWC https://paperswithcode.com/paper/simple-and-lightweight-human-pose-estimation
Repo
Framework
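
The abstract's lightweight bottleneck combines depthwise convolution with an attention mechanism. The block below is a hedged sketch of that combination (a depthwise-separable bottleneck followed by squeeze-and-excitation-style channel attention and a residual connection), not the exact LPN block; channel counts and the reduction ratio are placeholders.

```python
import torch
import torch.nn as nn

class LightweightBottleneck(nn.Module):
    """Sketch: 1x1 reduce -> 3x3 depthwise -> 1x1 expand, then SE-style
    channel attention and a residual connection."""
    def __init__(self, channels=64, reduction=4):
        super().__init__()
        mid = channels // reduction
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),   # depthwise conv
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False), nn.BatchNorm2d(channels),
        )
        self.attn = nn.Sequential(                  # channel attention (squeeze-and-excite)
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1), nn.Sigmoid(),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.body(x)
        out = out * self.attn(out)                  # reweight channels
        return self.relu(out + x)                   # residual connection

block = LightweightBottleneck()
print(block(torch.randn(1, 64, 32, 32)).shape)      # torch.Size([1, 64, 32, 32])
```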

A Fast Matrix-Completion-Based Approach for Recommendation Systems

Title A Fast Matrix-Completion-Based Approach for Recommendation Systems
Authors Meng Qiao, Zheng Shan, Fudong Liu, Wenjie Sun
Abstract Matrix completion is widely used in machine learning, engineering control, image processing, and recommendation systems. A popular algorithm for matrix completion is Singular Value Thresholding (SVT), in which the singular value threshold must be set in advance. However, in a recommendation system the dimension of the preference matrix keeps changing, so it is difficult to apply SVT directly. In addition, what the users of a recommendation system need is a sequence of personalized recommended results rather than estimates of their scores. Motivated by these observations, this paper proposes a novel approach named the probability completion model (PCM). By combining dimensionality reduction, the transitivity of the similarity matrix, and singular value decomposition, this approach quickly obtains a completion matrix with the same probability distribution as the original matrix. The approach greatly reduces computation time at the cost of a small sacrifice in accuracy, and quickly obtains a low-rank similarity matrix that approximates the trends in the data. The experimental results show that PCM can quickly generate a completed matrix with data trends similar to those of the original matrix, and that both the LCS score and the efficiency of PCM are higher than those of SVT. (An illustrative code sketch follows this entry.)
Tasks Matrix Completion, Recommendation Systems
Published 2019-12-02
URL https://arxiv.org/abs/1912.00600v2
PDF https://arxiv.org/pdf/1912.00600v2.pdf
PWC https://paperswithcode.com/paper/a-fast-matrix-completion-based-approach-for
Repo
Framework
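
For context on the baseline the paper compares against, the NumPy sketch below implements the standard Singular Value Thresholding (SVT) iteration on a synthetic low-rank matrix with half of its entries observed. It illustrates the method PCM aims to speed up, not the proposed PCM approach; the threshold, step size, and iteration count are common rule-of-thumb choices, not values from the paper.

```python
import numpy as np

def svt_complete(M, mask, tau=None, delta=1.2, iters=300):
    """Standard SVT: soft-threshold the singular values of Y, then take a
    step on the observed entries only."""
    if tau is None:
        tau = 5 * np.mean(M.shape)                  # rule-of-thumb threshold
    Y = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0)) @ Vt       # singular value shrinkage
        Y = Y + delta * mask * (M - X)              # only observed entries drive the update
    return X

# Synthetic rank-3 "preference" matrix with 50% of entries observed.
rng = np.random.default_rng(1)
M = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80))
mask = rng.random(M.shape) < 0.5
X = svt_complete(M * mask, mask)
print(np.linalg.norm((X - M)[~mask]) / np.linalg.norm(M[~mask]))  # small relative error
```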

Gradient Coding with Clustering and Multi-message Communication

Title Gradient Coding with Clustering and Multi-message Communication
Authors Emre Ozfatura, Deniz Gunduz, Sennur Ulukus
Abstract Gradient descent (GD) methods are commonly employed in machine learning problems to iteratively optimize the parameters of a model. For problems with massive datasets, computations are distributed to many parallel computing servers (i.e., workers) to speed up GD iterations. While distributed computing can increase the computation speed significantly, the per-iteration completion time is limited by the slowest straggling workers. Coded distributed computing can mitigate straggling workers by introducing redundant computations; however, existing coded computing schemes are mainly designed against persistent stragglers, and partial computations at straggling workers are discarded, leading to wasted computational capacity. In this paper, we propose a novel gradient coding (GC) scheme which allows multiple coded computations to be conveyed from each worker to the master per iteration. We numerically show that the proposed GC with multi-message communication (MMC), together with clustering, provides significant improvements in the average completion time of each iteration, with minimal or no increase in the communication load. (An illustrative code sketch follows this entry.)
Tasks
Published 2019-03-05
URL http://arxiv.org/abs/1903.01974v1
PDF http://arxiv.org/pdf/1903.01974v1.pdf
PWC https://paperswithcode.com/paper/gradient-coding-with-clustering-and-multi
Repo
Framework
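
As a toy illustration of the redundancy idea only (not the paper's coding scheme), the simulation below uses a fractional-repetition assignment: workers are grouped into clusters that share the same data partition, and the master completes an iteration once every partition's gradient has arrived from at least one non-straggling worker. Multi-message communication would additionally let each worker report partial gradients as they finish; that refinement is omitted. Worker counts, learning rate, and the straggler model are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_clusters = 6, 3               # 2 workers per cluster -> each partition replicated twice
n_samples, dim = 600, 5

X = rng.normal(size=(n_samples, dim))
w_true = rng.normal(size=dim)
y = X @ w_true
partitions = np.array_split(np.arange(n_samples), n_clusters)

def partial_gradient(w, part):
    Xi, yi = X[part], y[part]
    return Xi.T @ (Xi @ w - yi) / len(part)   # least-squares gradient on one partition

w = np.zeros(dim)
for it in range(50):
    stragglers = set(rng.choice(n_workers, size=2, replace=False))  # 2 slow workers this round
    grads = {}
    for worker in range(n_workers):
        cluster = worker % n_clusters          # fractional repetition: worker's assigned partition
        if worker not in stragglers and cluster not in grads:
            grads[cluster] = partial_gradient(w, partitions[cluster])
    if len(grads) == n_clusters:               # every partition covered by some fast worker
        w -= 0.1 * np.mean(list(grads.values()), axis=0)

print(np.linalg.norm(w - w_true))              # shrinks toward 0 despite stragglers
```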

Estimation of Pelvic Sagittal Inclination from Anteroposterior Radiograph Using Convolutional Neural Networks: Proof-of-Concept Study

Title Estimation of Pelvic Sagittal Inclination from Anteroposterior Radiograph Using Convolutional Neural Networks: Proof-of-Concept Study
Authors Ata Jodeiri, Yoshito Otake, Reza A. Zoroofi, Yuta Hiasa, Masaki Takao, Keisuke Uemura, Nobuhiko Sugano, Yoshinobu Sato
Abstract Alignment of the bones in the standing position provides useful information for surgical planning. In total hip arthroplasty (THA), the pelvic sagittal inclination (PSI) angle in the standing position is an important factor in planning cup alignment and has been estimated mainly from radiographs. Previous methods for PSI estimation used a patient-specific CT to create digitally reconstructed radiographs (DRRs) and compared them with the radiograph to estimate the relative position between the pelvis and the x-ray detector. In this study, we developed a method that estimates the PSI angle from a single anteroposterior radiograph using two convolutional neural networks (CNNs) without requiring a patient-specific CT, which reduces the patient's radiation exposure and opens up the possibility of application in a larger number of hospitals where CT is not acquired as part of the routine protocol.
Tasks
Published 2019-10-26
URL https://arxiv.org/abs/1910.12122v1
PDF https://arxiv.org/pdf/1910.12122v1.pdf
PWC https://paperswithcode.com/paper/estimation-of-pelvic-sagittal-inclination
Repo
Framework

kPAM-SC: Generalizable Manipulation Planning using KeyPoint Affordance and Shape Completion

Title kPAM-SC: Generalizable Manipulation Planning using KeyPoint Affordance and Shape Completion
Authors Wei Gao, Russ Tedrake
Abstract Manipulation planning is the task of computing robot trajectories that move a set of objects to their target configuration while satisfying physical feasibility. In contrast to existing works that assume known object templates, we are interested in manipulation planning for a category of objects with potentially unknown instances and large intra-category shape variation. To achieve this, we need an object representation with which the manipulation planner can reason about both physical feasibility and the desired object configuration, while generalizing to novel instances. The widely used pose representation is not suitable, as representing an object with a parameterized transformation from a fixed template cannot capture large intra-category shape variation. Hence, we propose a new hybrid object representation consisting of semantic keypoints and dense geometry (a point cloud or mesh) as the interface between the perception module and the motion planner. Leveraging advances in learning-based keypoint detection and shape completion, both dense geometry and keypoints can be perceived from raw sensor input. Using the proposed hybrid object representation, we formulate the manipulation task as a motion planning problem which encodes both the object target configuration and physical feasibility for a category of objects. In this way, many existing manipulation planners can be generalized to categories of objects, and the resulting perception-to-action manipulation pipeline is robust to large intra-category shape variation. Extensive hardware experiments demonstrate that our pipeline can produce robot trajectories that accomplish tasks with never-before-seen objects.
Tasks Keypoint Detection, Motion Planning
Published 2019-09-16
URL https://arxiv.org/abs/1909.06980v1
PDF https://arxiv.org/pdf/1909.06980v1.pdf
PWC https://paperswithcode.com/paper/kpam-sc-generalizable-manipulation-planning
Repo
Framework

DNANet: De-Normalized Attention Based Multi-Resolution Network for Human Pose Estimation

Title DNANet: De-Normalized Attention Based Multi-Resolution Network for Human Pose Estimation
Authors Kun Zhang, Peng He, Ping Yao, Ge Chen, Chuanguang Yang, Huimin Li, Li Fu, Tianyao Zheng
Abstract Recently, multi-resolution networks (such as Hourglass, CPN, HRNet, etc.) have achieved significant performance on the task of human pose estimation by combining features from various resolutions. In this paper, we propose a novel type of attention module, namely De-Normalized Attention (DNA), to deal with the feature attenuation of conventional attention modules. Our method extends the original HRNet with spatial, channel-wise and resolution-wise DNAs, which aims at evaluating the importance of features from different locations, channels and resolutions to enhance the network's capability for feature representation. We also propose adding fine-to-coarse connections across high-to-low resolutions inside each layer of HRNet to increase the maximum depth of the network topology. In addition, we propose modifying the keypoint regressor at the end of HRNet for accurate keypoint heatmap prediction. The effectiveness of our proposed network is demonstrated on the COCO keypoint detection dataset, achieving state-of-the-art performance of 77.9 AP on the COCO val2017 set and 77.0 AP on the test-dev2017 set without using extra keypoint training data. Our paper will be accompanied by publicly available code on GitHub. (An illustrative code sketch follows this entry.)
Tasks Keypoint Detection, Pose Estimation
Published 2019-09-11
URL https://arxiv.org/abs/1909.05090v3
PDF https://arxiv.org/pdf/1909.05090v3.pdf
PWC https://paperswithcode.com/paper/dnanet-de-normalized-attention-based-multi
Repo
Framework
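
De-Normalized Attention is the paper's own module; as generic background only, the sketch below shows the kind of channel-wise and spatial reweighting that such attention modules build on, applied to a single feature map. It is not the DNA module and should not be read as the authors' design; channel counts, kernel sizes, and the reduction ratio are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Generic attention sketch: a channel gate from global average pooling and
    a spatial gate from a 7x7 convolution over pooled channel statistics."""
    def __init__(self, channels=32, reduction=4):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                          # reweight channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        return x * self.spatial_gate(pooled)                  # reweight spatial locations

attn = ChannelSpatialAttention()
print(attn(torch.randn(1, 32, 64, 48)).shape)                 # torch.Size([1, 32, 64, 48])
```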

In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors

Title In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors
Authors Jeffrey Negrea, Gintare Karolina Dziugaite, Daniel M. Roy
Abstract We propose to study the generalization error of a learned predictor $\hat h$ in terms of that of a surrogate (potentially randomized) predictor that is coupled to $\hat h$ and designed to trade empirical risk for control of generalization error. In the case where $\hat h$ interpolates the data, it is interesting to consider theoretical surrogate classifiers that are partially derandomized or rerandomized, e.g., fit to the training data but with modified label noise. We also show that replacing $\hat h$ by its conditional distribution with respect to an arbitrary $\sigma$-field is a convenient way to derandomize. We study two examples, inspired by the work of Nagarajan and Kolter (2019) and Bartlett et al. (2019), where the learned classifier $\hat h$ interpolates the training data with high probability, has small risk, and, yet, does not belong to a nonrandom class with a tight uniform bound on two-sided generalization error. At the same time, we bound the risk of $\hat h$ in terms of surrogates constructed by conditioning and denoising, respectively, and shown to belong to nonrandom classes with uniformly small generalization error.
Tasks Denoising
Published 2019-12-09
URL https://arxiv.org/abs/1912.04265v2
PDF https://arxiv.org/pdf/1912.04265v2.pdf
PWC https://paperswithcode.com/paper/in-defense-of-uniform-convergence
Repo
Framework