October 21, 2019

3203 words 16 mins read

Paper Group AWR 153

Deep Recurrent Survival Analysis. SqueezeNext: Hardware-Aware Neural Network Design. Taking a Deeper Look at the Inverse Compositional Algorithm. Deep-Net: Deep Neural Network for Cyber Security Use Cases. PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud. Spatial-Temporal Person Re-identification. Learning-based Model Predict …

Deep Recurrent Survival Analysis

Title Deep Recurrent Survival Analysis
Authors Kan Ren, Jiarui Qin, Lei Zheng, Zhengyu Yang, Weinan Zhang, Lin Qiu, Yong Yu
Abstract Survival analysis is a hotspot in statistical research for modeling time-to-event information with data censorship handling, and has been widely used in applications such as clinical research, information systems and other fields with survivorship bias. Many works have been proposed for survival analysis, ranging from traditional statistical methods to machine learning models. However, the existing methodologies either rely on counting-based statistics over segmented data, or pre-assume the event probability distribution w.r.t. time. Moreover, few works consider sequential patterns within the feature space. In this paper, we propose a Deep Recurrent Survival Analysis model that combines deep learning for conditional probability prediction at a fine-grained level of the data with survival analysis for tackling censorship. By capturing the time dependency through modeling the conditional probability of the event for each sample, our method predicts the likelihood of the true event occurrence and estimates the survival rate over time, i.e., the probability of the non-occurrence of the event, for the censored data. Meanwhile, without assuming any specific form of the event probability distribution, our model shows great advantages over previous works in fitting various sophisticated data distributions. In experiments on three real-world tasks from different fields, our model significantly outperforms state-of-the-art solutions under various metrics.
Tasks Survival Analysis
Published 2018-09-07
URL http://arxiv.org/abs/1809.02403v2
PDF http://arxiv.org/pdf/1809.02403v2.pdf
PWC https://paperswithcode.com/paper/deep-recurrent-survival-analysis
Repo https://github.com/rk2900/drsa
Framework tf
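
The repo is TensorFlow, but the core mechanism is framework-agnostic. Below is a minimal PyTorch sketch of the idea as described in the abstract, not the authors' implementation: a recurrent network emits a conditional hazard h_t = P(event at t | no event before t), and the survival curve is the running product of (1 - h_t). Layer sizes and tensor shapes are assumptions.

```python
# Hypothetical sketch of the DRSA idea: an RNN predicts per-step hazards,
# from which survival and event probabilities follow without any
# parametric assumption on the event-time distribution.
import torch
import torch.nn as nn

class DRSASketch(nn.Module):
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.hazard = nn.Linear(hidden, 1)

    def forward(self, x):                                   # x: (batch, T, feat_dim)
        out, _ = self.rnn(x)
        h = torch.sigmoid(self.hazard(out)).squeeze(-1)     # (batch, T) hazards
        survival = torch.cumprod(1.0 - h, dim=1)            # S(t) = prod_{i<=t}(1 - h_i)
        prev_surv = torch.cat(
            [torch.ones_like(survival[:, :1]), survival[:, :-1]], dim=1)
        event_prob = h * prev_surv                          # P(event exactly at t)
        return h, survival, event_prob

model = DRSASketch(feat_dim=8)
h, S, p = model(torch.randn(4, 10, 8))
```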

SqueezeNext: Hardware-Aware Neural Network Design

Title SqueezeNext: Hardware-Aware Neural Network Design
Authors Amir Gholami, Kiseok Kwon, Bichen Wu, Zizheng Tai, Xiangyu Yue, Peter Jin, Sicheng Zhao, Kurt Keutzer
Abstract One of the main barriers to deploying neural networks on embedded systems has been the large memory and power consumption of existing networks. In this work, we introduce SqueezeNext, a new family of neural network architectures whose design was guided both by previous architectures such as SqueezeNet and by simulation results on a neural network accelerator. This new network matches AlexNet’s accuracy on the ImageNet benchmark with 112× fewer parameters, and one of its deeper variants achieves VGG-19 accuracy with only 4.4 million parameters (31× smaller than VGG-19). SqueezeNext also achieves better top-5 classification accuracy with 1.3× fewer parameters compared to MobileNet, while avoiding the depthwise-separable convolutions that are inefficient on some mobile processor platforms. This range of accuracy lets the user make speed-accuracy tradeoffs depending on the resources available on the target hardware. Hardware simulation results for power and inference speed on an embedded system guided us to design variations of the baseline model that are 2.59×/8.26× faster and 2.25×/7.5× more energy efficient than SqueezeNet/AlexNet, without any accuracy degradation.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.10615v2
PDF http://arxiv.org/pdf/1803.10615v2.pdf
PWC https://paperswithcode.com/paper/squeezenext-hardware-aware-neural-network
Repo https://github.com/luuuyi/SqueezeNext.PyTorch
Framework pytorch
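
A rough PyTorch sketch of a SqueezeNext-style block, following the paper's design: a two-stage 1×1 squeeze, a 3×1 followed by a 1×3 convolution in place of a full 3×3, a 1×1 expansion, and a residual connection. The exact channel reduction ratios here are illustrative assumptions, not the published configuration.

```python
# Sketch of one SqueezeNext-style residual block with separable 3x1/1x3 convs.
import torch
import torch.nn as nn

class SqNxtBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch // 2, 1), nn.ReLU(inplace=True),          # squeeze 1
            nn.Conv2d(ch // 2, ch // 4, 1), nn.ReLU(inplace=True),     # squeeze 2
            nn.Conv2d(ch // 4, ch // 2, (3, 1), padding=(1, 0)),       # 3x1
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // 2, ch // 2, (1, 3), padding=(0, 1)),       # 1x3
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // 2, ch, 1),                                 # expand
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)   # residual connection

y = SqNxtBlock(64)(torch.randn(1, 64, 32, 32))
```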

Taking a Deeper Look at the Inverse Compositional Algorithm

Title Taking a Deeper Look at the Inverse Compositional Algorithm
Authors Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger
Abstract In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.
Tasks Motion Estimation
Published 2018-12-17
URL http://arxiv.org/abs/1812.06861v2
PDF http://arxiv.org/pdf/1812.06861v2.pdf
PWC https://paperswithcode.com/paper/taking-a-deeper-look-at-the-inverse
Repo https://github.com/lvzhaoyang/DeeperInverseCompositionalAlgorithm
Framework pytorch
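
To ground the discussion, here is a NumPy sketch of a single robust inverse compositional Gauss-Newton step; in the paper, components such as the robust weighting and the solver are replaced by learned modules, which this sketch only stands in for with fixed inputs.

```python
# One weighted Gauss-Newton step on the template side (inverse compositional):
# the Jacobian J is precomputed on the template, and the resulting increment
# is composed inversely with the current warp.
import numpy as np

def ic_step(J, residual, weights):
    """J: (N, P) template Jacobian w.r.t. warp parameters,
    residual: (N,) image minus warped template, weights: (N,) robust weights."""
    JW = J * weights[:, None]
    H = JW.T @ J                   # robust Gauss-Newton Hessian approximation
    g = JW.T @ residual
    dp = np.linalg.solve(H, g)     # parameter increment
    return dp

dp = ic_step(np.random.randn(100, 6), np.random.randn(100), np.ones(100))
```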

Deep-Net: Deep Neural Network for Cyber Security Use Cases

Title Deep-Net: Deep Neural Network for Cyber Security Use Cases
Authors Vinayakumar R, Barathi Ganesh HB, Prabaharan Poornachandran, Anand Kumar M, Soman KP
Abstract Deep neural networks (DNNs) have recently emerged as a powerful approach, solving long-standing artificial intelligence (AI) supervised and unsupervised tasks in natural language processing, speech processing, computer vision and other domains. In this paper, we apply DNNs to three different cyber security use cases: Android malware classification, incident detection and fraud detection. The data set for each use case contains samples of real known benign and malicious activities. An efficient DNN architecture is chosen by conducting various trials of experiments over network parameters and network structures. The experiments with the chosen DNN configurations are run for up to 1000 epochs with the learning rate set in the range [0.01-0.5]. The DNNs performed well in comparison to classical machine learning algorithms across all cyber security use cases, owing to the fact that DNNs implicitly extract and build better features, identifying the characteristics of the data that lead to better accuracy. The best accuracies obtained by the DNN and XGBoost are 0.940 and 0.741 on Android malware classification, 1.00 and 0.997 on incident detection, and 0.972 and 0.916 on fraud detection, respectively.
Tasks Fraud Detection, Malware Classification
Published 2018-12-09
URL http://arxiv.org/abs/1812.03519v1
PDF http://arxiv.org/pdf/1812.03519v1.pdf
PWC https://paperswithcode.com/paper/deep-net-deep-neural-network-for-cyber
Repo https://github.com/vinayakumarr/Deep-Net
Framework none
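
The abstract does not give exact layer sizes, so the following is a hypothetical PyTorch sketch of the kind of fully connected DNN being tuned: stacked Linear+ReLU+Dropout layers with a sigmoid output for the binary benign/malicious decision. All dimensions and the dropout rate are assumptions.

```python
# Hypothetical fully connected classifier of the type tuned in the paper.
import torch
import torch.nn as nn

def make_dnn(in_dim, hidden=(512, 256, 128), p_drop=0.1):
    layers, d = [], in_dim
    for h in hidden:
        layers += [nn.Linear(d, h), nn.ReLU(inplace=True), nn.Dropout(p_drop)]
        d = h
    layers += [nn.Linear(d, 1), nn.Sigmoid()]   # P(malicious)
    return nn.Sequential(*layers)

net = make_dnn(in_dim=100)
scores = net(torch.randn(32, 100))   # per-sample malicious probability
```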

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

Title PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud
Authors Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
Abstract In this paper, we propose PointRCNN for 3D object detection from raw point clouds. The framework is composed of two stages: stage-1 for bottom-up 3D proposal generation and stage-2 for refining proposals in canonical coordinates to obtain the final detection results. Instead of generating proposals from RGB images or projecting the point cloud to a bird’s-eye view or voxels as previous methods do, our stage-1 sub-network directly generates a small number of high-quality 3D proposals from the point cloud in a bottom-up manner, by segmenting the point cloud of the whole scene into foreground and background points. The stage-2 sub-network transforms the pooled points of each proposal to canonical coordinates to learn better local spatial features, which are combined with the global semantic features of each point learned in stage-1 for accurate box refinement and confidence prediction. Extensive experiments on the 3D detection benchmark of the KITTI dataset show that our proposed architecture outperforms state-of-the-art methods by remarkable margins while using only point clouds as input. The code is available at https://github.com/sshaoshuai/PointRCNN.
Tasks 3D Object Detection, Object Detection, Object Proposal Generation
Published 2018-12-11
URL https://arxiv.org/abs/1812.04244v2
PDF https://arxiv.org/pdf/1812.04244v2.pdf
PWC https://paperswithcode.com/paper/pointrcnn-3d-object-proposal-generation-and
Repo https://github.com/sshaoshuai/Pointnet2.PyTorch
Framework pytorch
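
A small sketch of stage-2's canonical transformation as described in the abstract: the pooled points of a proposal are translated to the box center and rotated by the box heading, so every proposal is seen in an axis-aligned local frame. Tensor shapes and the up-axis convention below are assumptions.

```python
# Canonical transform: translate points to the proposal center, then rotate
# by the negative heading so the box becomes axis-aligned at the origin.
import math
import torch

def to_canonical(points, center, heading):
    """points: (N, 3); center: (3,); heading: yaw angle (float, radians)."""
    p = points - center
    c, s = math.cos(-heading), math.sin(-heading)
    rot = torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return p @ rot.T

pts = to_canonical(torch.randn(128, 3), torch.tensor([5.0, 2.0, 0.0]), 0.4)
```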

Spatial-Temporal Person Re-identification

Title Spatial-Temporal Person Re-identification
Authors Guangcong Wang, Jianhuang Lai, Peigen Huang, Xiaohua Xie
Abstract Most current person re-identification (ReID) methods neglect the spatial-temporal constraint. Given a query image, conventional methods compute the feature distances between the query image and all the gallery images and return a similarity-ranked table. When the gallery database is very large in practice, these approaches fail to obtain good performance due to appearance ambiguity across different camera views. In this paper, we propose a novel two-stream spatial-temporal person ReID (st-ReID) framework that mines both visual semantic information and spatial-temporal information. To this end, a joint similarity metric with Logistic Smoothing (LS) is introduced to integrate the two kinds of heterogeneous information into a unified framework. To approximate a complex spatial-temporal probability distribution, we develop a fast Histogram-Parzen (HP) method. With the help of the spatial-temporal constraint, the st-ReID model eliminates many irrelevant images and thus narrows the gallery database. Without bells and whistles, our st-ReID method achieves rank-1 accuracy of 98.1% on Market-1501 and 94.4% on DukeMTMC-reID, improving on baselines of 91.2% and 83.8%, respectively, and outperforming all previous state-of-the-art methods by a large margin.
Tasks Person Re-Identification
Published 2018-12-08
URL http://arxiv.org/abs/1812.03282v1
PDF http://arxiv.org/pdf/1812.03282v1.pdf
PWC https://paperswithcode.com/paper/spatial-temporal-person-re-identification
Repo https://github.com/Wanggcong/Spatial-Temporal-Re-identification
Framework pytorch
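
A sketch of the joint-metric idea, assuming a logistic smoothing of the form 1/(1 + λ·exp(−γx)) applied to both the visual similarity and the spatial-temporal probability before multiplying, so neither term can veto the other outright. The λ and γ values below are placeholders, not the paper's settings.

```python
# Joint score = logistic(visual similarity) * logistic(spatial-temporal prob).
import numpy as np

def logistic_smooth(x, lam=1.0, gamma=5.0):
    return 1.0 / (1.0 + lam * np.exp(-gamma * x))

def joint_score(visual_sim, st_prob):
    return logistic_smooth(visual_sim) * logistic_smooth(st_prob)

print(joint_score(0.8, 0.05))   # strong appearance match, unlikely transition
```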

Learning-based Model Predictive Control for Safe Exploration

Title Learning-based Model Predictive Control for Safe Exploration
Authors Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Andreas Krause
Abstract Learning-based methods have been successful in solving complex control tasks without significant prior knowledge about the system. However, these methods typically do not provide any safety guarantees, which prevents their use in safety-critical, real-world applications. In this paper, we present a learning-based model predictive control scheme that provides provable high-probability safety guarantees. To this end, we exploit regularity assumptions on the dynamics, in the form of a Gaussian process prior, to construct provably accurate confidence intervals on predicted trajectories. Unlike previous approaches, we do not assume that model uncertainties are independent. Based on these predictions, we guarantee that trajectories satisfy safety constraints. Moreover, we use a terminal set constraint to recursively guarantee the existence of safe control actions at every iteration. In our experiments, we show that the resulting algorithm can be used to safely and efficiently explore and learn about dynamic systems.
Tasks Safe Exploration
Published 2018-03-22
URL http://arxiv.org/abs/1803.08287v3
PDF http://arxiv.org/pdf/1803.08287v3.pdf
PWC https://paperswithcode.com/paper/learning-based-model-predictive-control-for
Repo https://github.com/befelix/safe-exploration
Framework pytorch
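
A toy illustration of the building block the scheme rests on: a GP model of the dynamics yields a mean and standard deviation for the next state, from which high-probability confidence intervals can be formed and propagated. This uses scikit-learn on a made-up 1-D system; it is not the paper's interval construction.

```python
# GP regression on (state, action) -> next state, with a ~95% interval.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))          # (state, action) pairs
y = 0.9 * X[:, 0] + 0.3 * np.sin(X[:, 1])     # unknown dynamics (toy)

gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-3).fit(X, y)
mu, std = gp.predict([[0.2, 0.1]], return_std=True)
lo, hi = mu - 2 * std, mu + 2 * std           # interval on the next state
print(lo, hi)
```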

Pyramid Stereo Matching Network

Title Pyramid Stereo Matching Network
Authors Jia-Ren Chang, Yong-Sheng Chen
Abstract Recent work has shown that depth estimation from a stereo pair of images can be formulated as a supervised learning task solved with convolutional neural networks (CNNs). However, current architectures rely on patch-based Siamese networks, lacking the means to exploit context information for finding correspondences in ill-posed regions. To tackle this problem, we propose PSMNet, a pyramid stereo matching network consisting of two main modules: spatial pyramid pooling and a 3D CNN. The spatial pyramid pooling module exploits global context information by aggregating context at different scales and locations to form a cost volume. The 3D CNN learns to regularize the cost volume using multiple stacked hourglass networks in conjunction with intermediate supervision. The proposed approach was evaluated on several benchmark datasets. Our method ranked first on the KITTI 2012 and 2015 leaderboards before March 18, 2018. The code for PSMNet is available at: https://github.com/JiaRenChang/PSMNet.
Tasks Depth Estimation, Stereo Matching, Stereo Matching Hand
Published 2018-03-23
URL http://arxiv.org/abs/1803.08669v1
PDF http://arxiv.org/pdf/1803.08669v1.pdf
PWC https://paperswithcode.com/paper/pyramid-stereo-matching-network
Repo https://github.com/JiaRenChang/PSMNet
Framework pytorch
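
A sketch of the concatenation-based cost volume described in the abstract: left features are stacked with right features shifted by each candidate disparity, yielding a 5-D volume for the 3D CNN to regularize. Feature dimensions below are illustrative.

```python
# Build a (B, 2C, max_disp, H, W) cost volume by disparity-shifted concat.
import torch

def build_cost_volume(feat_l, feat_r, max_disp):
    """feat_l, feat_r: (B, C, H, W) feature maps from a shared extractor."""
    B, C, H, W = feat_l.shape
    vol = feat_l.new_zeros(B, 2 * C, max_disp, H, W)
    for d in range(max_disp):
        vol[:, :C, d, :, d:] = feat_l[:, :, :, d:]          # left features
        vol[:, C:, d, :, d:] = feat_r[:, :, :, : W - d]     # right, shifted by d
    return vol

vol = build_cost_volume(torch.randn(1, 8, 16, 32), torch.randn(1, 8, 16, 32), 4)
```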

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

Title Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control
Authors Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine
Abstract Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks and objects. In these settings, ground truth reward signals are typically unavailable, and we therefore propose a self-supervised model-based approach, where a predictive model learns to directly predict the future from raw sensory readings, such as camera images. At test time, we explore three distinct goal specification methods: designated pixels, where a user specifies desired object manipulation tasks by selecting particular pixels in an image and corresponding goal positions; goal images, where the desired goal state is specified with an image; and image classifiers, which define spaces of goal states. Our deep predictive models are trained using data collected autonomously and continuously by a robot interacting with hundreds of objects, without human supervision. We demonstrate that visual MPC can generalize to never-before-seen objects—both rigid and deformable—and solve a range of user-defined object manipulation tasks using the same model.
Tasks
Published 2018-12-03
URL http://arxiv.org/abs/1812.00568v1
PDF http://arxiv.org/pdf/1812.00568v1.pdf
PWC https://paperswithcode.com/paper/visual-foresight-model-based-deep
Repo https://github.com/SudeepDasari/visual_foresight
Framework none
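
A toy sketch of the planning loop around such a predictive model: sample action sequences, score each against the goal with a (here, placeholder) model, refit a Gaussian to the best samples, and execute the first action of the refined plan. The cross-entropy-method structure below is an assumption standing in for the paper's sampling-based planner.

```python
# Cross-entropy-method planner with a placeholder cost in place of the
# learned video-prediction model.
import numpy as np

def plan_cem(predict_cost, horizon=5, act_dim=2, iters=3, pop=64, elite=8):
    mu, std = np.zeros((horizon, act_dim)), np.ones((horizon, act_dim))
    for _ in range(iters):
        acts = mu + std * np.random.randn(pop, horizon, act_dim)
        costs = np.array([predict_cost(a) for a in acts])
        best = acts[np.argsort(costs)[:elite]]     # lowest-cost elites
        mu, std = best.mean(axis=0), best.std(axis=0) + 1e-6
    return mu[0]                                   # first action of the plan

action = plan_cem(lambda a: float(np.sum(a ** 2)))   # toy cost function
```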

Boltzmann Encoded Adversarial Machines

Title Boltzmann Encoded Adversarial Machines
Authors Charles K. Fisher, Aaron M. Smith, Jonathan R. Walsh
Abstract Restricted Boltzmann Machines (RBMs) are a class of generative neural networks that are typically trained to maximize a log-likelihood objective. We argue that likelihood-based training strategies may fail because the objective does not sufficiently penalize models that place high probability in regions where the training data distribution has low probability. To overcome this problem, we introduce Boltzmann Encoded Adversarial Machines (BEAMs). A BEAM is an RBM trained against an adversary that uses the hidden layer activations of the RBM to discriminate between the training data and the probability distribution generated by the model. We present experiments demonstrating that BEAMs outperform RBMs and GANs on multiple benchmarks.
Tasks
Published 2018-04-23
URL http://arxiv.org/abs/1804.08682v1
PDF http://arxiv.org/pdf/1804.08682v1.pdf
PWC https://paperswithcode.com/paper/boltzmann-encoded-adversarial-machines
Repo https://github.com/annachen/beam
Framework none
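
A minimal stand-in for the adversarial signal described in the abstract: compute RBM hidden-unit activations for data and for model samples, and fit a classifier to tell them apart; its output would supply the extra training gradient. Everything here, including the random "model samples", is a toy.

```python
# Adversary on RBM hidden activations: real data vs. model samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(20, 10))      # visible x hidden weights (toy)

def hidden_probs(v):
    return 1.0 / (1.0 + np.exp(-(v @ W)))     # sigmoid hidden activations

v_data = rng.integers(0, 2, size=(64, 20))    # training batch
v_model = rng.integers(0, 2, size=(64, 20))   # stand-in for Gibbs samples

h = np.vstack([hidden_probs(v_data), hidden_probs(v_model)])
labels = np.array([1] * 64 + [0] * 64)        # data vs. model
adv = LogisticRegression().fit(h, labels)     # adversary; would drive training
print(adv.score(h, labels))
```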

Detecting Synapse Location and Connectivity by Signed Proximity Estimation and Pruning with Deep Nets

Title Detecting Synapse Location and Connectivity by Signed Proximity Estimation and Pruning with Deep Nets
Authors Toufiq Parag, Daniel Berger, Lee Kamentsky, Benedikt Staffler, Donglai Wei, Moritz Helmstaedter, Jeff W. Lichtman, Hanspeter Pfister
Abstract Synaptic connectivity detection is a critical task for neural reconstruction from Electron Microscopy (EM) data. Most existing algorithms for synapse detection do not identify the cleft location and the direction of connectivity simultaneously. The few methods that compute direction along with contact location have only been demonstrated to work on either dyadic synapses (most common in the vertebrate brain) or polyadic synapses (found in the fruit fly brain), but not on both types. In this paper, we present an algorithm that automatically predicts the location as well as the direction of both dyadic and polyadic synapses. The proposed algorithm first generates candidate synaptic connections from voxelwise predictions of signed proximity produced by a 3D U-net. A second 3D CNN then prunes the set of candidates to produce the final detection of cleft and connectivity orientation. Experimental results demonstrate that the proposed method outperforms existing methods for detecting synapses in both rodent and fruit fly brains.
Tasks
Published 2018-07-08
URL http://arxiv.org/abs/1807.02739v2
PDF http://arxiv.org/pdf/1807.02739v2.pdf
PWC https://paperswithcode.com/paper/detecting-synapse-location-and-connectivity
Repo https://github.com/paragt/EMSynConn
Framework none
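
A small, heavily hedged sketch of how a signed-proximity volume might be turned into candidates: positive values mark one side of a cleft and negative values the other, and thresholded regions become pre/post candidate pairs for the pruning CNN. The threshold and sign convention are assumptions, not the paper's settings.

```python
# Threshold a signed-proximity volume into pre/post candidate masks.
import numpy as np

prox = np.random.randn(64, 64, 64)   # stand-in for the 3D U-net output
pre_mask = prox > 0.5                # candidate presynaptic voxels
post_mask = prox < -0.5              # candidate postsynaptic voxels
# adjacent pre/post components would be paired and passed to the pruning CNN
print(pre_mask.sum(), post_mask.sum())
```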

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

Title ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Authors Han Cai, Ligeng Zhu, Song Han
Abstract Neural architecture search (NAS) has had a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. 10^4 GPU hours) makes it difficult to directly search for architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the GPU-hour cost via a continuous representation of the network architecture, but suffers from high GPU memory consumption (which grows linearly w.r.t. the candidate set size). As a result, such methods need to utilize proxy tasks: training on a smaller dataset, learning with only a few blocks, or training for just a few epochs. Architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present ProxylessNAS, which can directly learn architectures for large-scale target tasks and target hardware platforms. We address the high memory consumption of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level as regular training, while still allowing a large candidate set. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of directness and specialization. On CIFAR-10, our model achieves 2.08% test error with only 5.7M parameters, better than the previous state-of-the-art architecture AmoebaNet-B while using 6× fewer parameters. On ImageNet, our model achieves 3.1% better top-1 accuracy than MobileNetV2 while being 1.2× faster in measured GPU latency. We also apply ProxylessNAS to specialize neural architectures for hardware using direct hardware metrics (e.g. latency) and provide insights for efficient CNN architecture design.
Tasks Image Classification, Neural Architecture Search
Published 2018-12-02
URL http://arxiv.org/abs/1812.00332v2
PDF http://arxiv.org/pdf/1812.00332v2.pdf
PWC https://paperswithcode.com/paper/proxylessnas-direct-neural-architecture
Repo https://github.com/mit-han-lab/once-for-all
Framework pytorch
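
A sketch of the memory-saving path sampling: architecture parameters score N candidate ops on an edge, but only one sampled path is materialized per forward pass, so memory stays at the single-model level. This toy omits the paper's gradient estimator for the architecture parameters, and the candidate ops are illustrative.

```python
# One mixed edge with sampled (binarized) path execution.
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.Conv2d(ch, ch, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # arch params

    def forward(self, x):
        idx = torch.multinomial(torch.softmax(self.alpha, 0), 1).item()
        return self.ops[idx](x)   # only the sampled path is materialized

y = MixedOp(16)(torch.randn(1, 16, 8, 8))
```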

Coherence Models for Dialogue

Title Coherence Models for Dialogue
Authors Alessandra Cervone, Evgeny Stepanov, Giuseppe Riccardi
Abstract Coherence across multiple turns is a major challenge for state-of-the-art dialogue models. Arguably the most successful approach to automatically learning text coherence is the entity grid, which relies on modelling patterns of entity distribution across multiple sentences of a text. Originally applied to the evaluation of automatic summaries and the news genre, this model, among its many extensions, has also been successfully used to assess dialogue coherence. Nevertheless, neither the original grid nor its extensions model intents, a crucial aspect that has been studied widely in the literature in connection with dialogue structure. We propose to augment the original grid document representation for dialogue with the intentional structure of the conversation. Our models outperform the original grid representation on both text discrimination and insertion, the two main standard tasks for coherence assessment, across three different dialogue datasets, confirming that intents play a key role in modelling dialogue coherence.
Tasks
Published 2018-06-21
URL http://arxiv.org/abs/1806.08044v1
PDF http://arxiv.org/pdf/1806.08044v1.pdf
PWC https://paperswithcode.com/paper/coherence-models-for-dialogue
Repo https://github.com/alecervi/Coherence-models-for-dialogue
Framework none
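
A toy illustration of an entity grid extended with dialogue intents: each cell records an entity's grammatical role (S/O/-) in a turn, and each turn carries an intent label. The tiny dialogue and the label set are invented for illustration.

```python
# Build a miniature intent-augmented entity grid: rows are entities,
# columns are turns, cells are grammatical roles.
turns = [
    {"intent": "request", "entities": {"ticket": "O"}},
    {"intent": "inform",  "entities": {"ticket": "S", "price": "O"}},
    {"intent": "confirm", "entities": {"price": "S"}},
]
entities = sorted({e for t in turns for e in t["entities"]})
grid = [[t["entities"].get(e, "-") for t in turns] for e in entities]
print("intents:", [t["intent"] for t in turns])
for e, row in zip(entities, grid):
    print(f"{e:8s}", row)   # role-transition patterns feed the coherence model
```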

Monte Carlo Q-learning for General Game Playing

Title Monte Carlo Q-learning for General Game Playing
Authors Hui Wang, Michael Emmerich, Aske Plaat
Abstract After the recent groundbreaking results of AlphaGo, we have seen a strong interest in reinforcement learning for game playing. General Game Playing (GGP) provides a good testbed for reinforcement learning: in GGP, a specification of the game rules is given, and GGP problems can be solved by reinforcement learning. Q-learning is one of the canonical reinforcement learning methods, and was used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex) to allow comparison to Banerjee et al. As expected, Q-learning converges, although much more slowly than MCTS. Borrowing an idea from MCTS, we enhance Q-learning with Monte Carlo Search, giving QM-learning. This enhancement improves the performance of pure Q-learning. We believe that QM-learning can also be used to further improve the performance of reinforcement learning for larger games, something we will test in future work.
Tasks Board Games, Q-Learning
Published 2018-02-16
URL http://arxiv.org/abs/1802.05944v2
PDF http://arxiv.org/pdf/1802.05944v2.pdf
PWC https://paperswithcode.com/paper/monte-carlo-q-learning-for-general-game
Repo https://github.com/FrankPortman/stannis
Framework none
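
For reference, a minimal tabular Q-learning skeleton of the kind the paper starts from; in the QM variant, the epsilon-greedy action choice would be informed by a small Monte Carlo search over candidate moves. The GGP environment itself is omitted here.

```python
# Tabular Q-learning: epsilon-greedy action selection plus the TD update.
import random
from collections import defaultdict

Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.99, 0.1

def choose(state, actions):
    if random.random() < eps:
        return random.choice(actions)          # explore
    return max(actions, key=lambda a: Q[(state, a)])  # exploit

def update(s, a, r, s_next, next_actions):
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```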

Efficient Proximal Mapping Computation for Unitarily Invariant Low-Rank Inducing Norms

Title Efficient Proximal Mapping Computation for Unitarily Invariant Low-Rank Inducing Norms
Authors Christian Grussler, Pontus Giselsson
Abstract Low-rank inducing unitarily invariant norms have been introduced to convexify problems with low-rank/sparsity constraints. They are the convex envelope of a unitarily invariant norm and the indicator function of an upper-bounding rank constraint. The most well-known member of this family is the so-called nuclear norm. To solve optimization problems involving such norms with proximal splitting methods, efficient ways of evaluating the proximal mapping of the low-rank inducing norms are needed. This is known for the nuclear norm, but not for most other members of the low-rank inducing family. This work supplies a framework that reduces the proximal mapping evaluation to a nested binary search, in which each iteration requires the solution of a much simpler problem. This simpler problem can often be solved analytically, as demonstrated for the so-called low-rank inducing Frobenius and spectral norms. Moreover, the framework makes it possible to compute the proximal mapping of compositions of these norms with increasing convex functions, and the projections onto their epigraphs. This has the additional advantage that we can also handle compositions of increasing convex functions and low-rank inducing norms in proximal splitting methods.
Tasks
Published 2018-10-17
URL http://arxiv.org/abs/1810.07570v1
PDF http://arxiv.org/pdf/1810.07570v1.pdf
PWC https://paperswithcode.com/paper/efficient-proximal-mapping-computation-for
Repo https://github.com/LowRankOpt/LRINorm
Framework none
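
For context, the one member of the family with a classical closed-form prox is the nuclear norm, whose proximal mapping is singular-value soft-thresholding; the paper's framework generalizes past this case via a nested binary search. A NumPy sketch of the classical case:

```python
# Proximal mapping of t * nuclear norm: soft-threshold the singular values.
import numpy as np

def prox_nuclear(X, t):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

X = np.random.randn(5, 4)
print(np.linalg.matrix_rank(prox_nuclear(X, 1.0)))   # thresholding lowers rank
```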