October 19, 2019

3346 words 16 mins read

Paper Group ANR 347

Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation. FaceShop: Deep Sketch-based Face Image Editing. Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference. Auto-tuning TensorFlow Threading Model for CPU Backend. L1-(2D)2PCANet: A Deep Learning Network for Face Recognition …

Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation

Title Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation
Authors Peter Karkus, David Hsu, Wee Sun Lee
Abstract We propose a novel approach to robot system design in which each building block of a larger system is represented as a differentiable program, i.e. a deep neural network. This representation allows algorithmic planning and deep learning to be integrated in a principled manner, thus combining the benefits of model-free and model-based methods. We apply the proposed approach to a challenging partially observable robot navigation task. The robot must navigate to a goal in a previously unseen 3-D environment without knowing its initial location, relying instead on a 2-D floor map and visual observations from an onboard camera. We introduce Navigation Networks (NavNets), which encode state estimation, planning and acting in a single, end-to-end trainable recurrent neural network. In preliminary simulation experiments we successfully trained navigation networks to solve the challenging partially observable navigation task.
Tasks Robot Navigation
Published 2018-07-17
URL http://arxiv.org/abs/1807.06696v1
PDF http://arxiv.org/pdf/1807.06696v1.pdf
PWC https://paperswithcode.com/paper/integrating-algorithmic-planning-and-deep
Repo
Framework
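
NavNets encode state estimation, planning and acting in one trainable recurrent network. As a rough, classical (non-learned) analogue of that pipeline, here is a minimal NumPy sketch that pairs a histogram Bayes filter for localization with value iteration for planning on a toy 2-D floor map. The map, observation model and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Toy 2-D floor map: 0 = free cell, 1 = wall. The goal is the lower-right corner.
FLOOR = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
GOAL = (2, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def value_iteration(floor, goal, iters=50, gamma=0.95):
    """Planning module: propagate discounted value outward from the goal."""
    V = np.zeros(floor.shape)
    for _ in range(iters):
        V[goal] = 1.0
        for i in range(floor.shape[0]):
            for j in range(floor.shape[1]):
                if floor[i, j] == 1 or (i, j) == goal:
                    continue
                best = 0.0
                for di, dj in ACTIONS.values():
                    ni, nj = i + di, j + dj
                    if 0 <= ni < floor.shape[0] and 0 <= nj < floor.shape[1] \
                            and floor[ni, nj] == 0:
                        best = max(best, gamma * V[ni, nj])
                V[i, j] = best
    return V

def belief_update(belief, observation, floor):
    """State-estimation module: histogram-filter correction step. The toy
    observation is the number of walls adjacent to the robot's cell."""
    likelihood = np.zeros_like(belief)
    for i in range(floor.shape[0]):
        for j in range(floor.shape[1]):
            if floor[i, j] == 1:
                continue
            walls = sum(
                not (0 <= i + di < floor.shape[0] and 0 <= j + dj < floor.shape[1])
                or floor[i + di, j + dj] == 1
                for di, dj in ACTIONS.values())
            likelihood[i, j] = 1.0 if walls == observation else 0.1
    belief = belief * likelihood
    return belief / belief.sum()

# Acting module: move the most likely state greedily up the value map.
belief = np.where(FLOOR == 0, 1.0, 0.0)
belief /= belief.sum()
belief = belief_update(belief, observation=2, floor=FLOOR)
V = value_iteration(FLOOR, GOAL)
r, c = (int(x) for x in np.unravel_index(np.argmax(belief), belief.shape))

def action_value(a):
    ni, nj = r + ACTIONS[a][0], c + ACTIONS[a][1]
    inside = 0 <= ni < FLOOR.shape[0] and 0 <= nj < FLOOR.shape[1]
    return V[ni, nj] if inside and FLOOR[ni, nj] == 0 else -1.0

print("most likely cell:", (r, c), "chosen action:", max(ACTIONS, key=action_value))
```

In the paper all three modules are differentiable and trained jointly; the sketch only illustrates the division of labor.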

FaceShop: Deep Sketch-based Face Image Editing

Title FaceShop: Deep Sketch-based Face Image Editing
Authors Tiziano Portenier, Qiyang Hu, Attila Szabó, Siavash Arjomand Bigdeli, Paolo Favaro, Matthias Zwicker
Abstract We present a novel system for sketch-based face image editing, enabling users to edit images intuitively by sketching a few strokes on a region of interest. Our interface features tools to express a desired image manipulation by providing both geometry and color constraints as user-drawn strokes. As an alternative to the direct user input, our proposed system naturally supports a copy-paste mode, which allows users to edit a given image region by using parts of another exemplar image without the need of hand-drawn sketching at all. The proposed interface runs in real-time and facilitates an interactive and iterative workflow to quickly express the intended edits. Our system is based on a novel sketch domain and a convolutional neural network trained end-to-end to automatically learn to render image regions corresponding to the input strokes. To achieve high quality and semantically consistent results we train our neural network on two simultaneous tasks, namely image completion and image translation. To the best of our knowledge, we are the first to combine these two tasks in a unified framework for interactive image editing. Our results show that the proposed sketch domain, network architecture, and training procedure generalize well to real user input and enable high quality synthesis results without additional post-processing.
Tasks
Published 2018-04-24
URL http://arxiv.org/abs/1804.08972v2
PDF http://arxiv.org/pdf/1804.08972v2.pdf
PWC https://paperswithcode.com/paper/faceshop-deep-sketch-based-face-image-editing
Repo
Framework
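
The system feeds geometry (sketch) and color constraints, together with the masked image region, into a completion/translation network. The snippet below is only a sketch of how such a conditional input tensor might be assembled with NumPy; the channel layout, mask convention and the `edit_region` box are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def build_editing_input(image, sketch, color_strokes, edit_region):
    """Assemble a conditional input for an inpainting/translation network.

    image:         (H, W, 3) float RGB in [0, 1]
    sketch:        (H, W)    binary user-drawn stroke map (geometry constraint)
    color_strokes: (H, W, 3) sparse color strokes (color constraint)
    edit_region:   (top, left, height, width) rectangle to be re-synthesized
    Returns an (H, W, 8) tensor: masked RGB + mask + sketch + color strokes.
    """
    top, left, h, w = edit_region
    mask = np.zeros(image.shape[:2], dtype=np.float32)
    mask[top:top + h, left:left + w] = 1.0          # 1 = region to fill

    masked = image * (1.0 - mask)[..., None]        # erase the edit region
    net_input = np.concatenate(
        [masked,                                    # 3 channels
         mask[..., None],                           # 1 channel
         sketch[..., None],                         # 1 channel
         color_strokes],                            # 3 channels
        axis=-1)
    return net_input

# Toy usage with random data standing in for a real photo and user strokes.
H, W = 64, 64
x = build_editing_input(np.random.rand(H, W, 3),
                        np.random.randint(0, 2, (H, W)).astype(np.float32),
                        np.zeros((H, W, 3), dtype=np.float32),
                        edit_region=(16, 16, 32, 32))
print(x.shape)  # (64, 64, 8)
```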

Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference

Title Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference
Authors Masashi Yoshikawa, Koji Mineshima, Hiroshi Noji, Daisuke Bekki
Abstract In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data. However, there is a tradeoff between adding more knowledge data for improved RTE performance and maintaining an efficient RTE system, as such a large database is problematic in terms of memory usage and computational complexity. In this work, we show that the processing time of a state-of-the-art logic-based RTE system can be significantly reduced by replacing its search-based axiom injection (abduction) mechanism with one based on Knowledge Base Completion (KBC). We integrate this mechanism in a Coq plugin that provides a proof automation tactic for natural language inference. Additionally, we show empirically that adding new knowledge data contributes to better RTE performance without harming processing speed in this framework.
Tasks Knowledge Base Completion, Natural Language Inference
Published 2018-11-15
URL http://arxiv.org/abs/1811.06203v1
PDF http://arxiv.org/pdf/1811.06203v1.pdf
PWC https://paperswithcode.com/paper/combining-axiom-injection-and-knowledge-base
Repo
Framework
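
The abstract replaces search-based abduction with axiom injection driven by Knowledge Base Completion inside a Coq tactic. The sketch below only illustrates the general KBC idea with a TransE-style score over toy embeddings, proposing axioms whose score clears a threshold; the entity and relation names, embeddings and threshold are made-up assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Toy embeddings for a few lexical entities and one relation ("hypernym").
entities = {w: rng.normal(size=DIM) for w in ["dog", "animal", "car", "vehicle"]}
relations = {"hypernym": rng.normal(size=DIM)}

def transe_score(head, rel, tail):
    """TransE plausibility: higher (less negative) means h + r is closer to t."""
    return -np.linalg.norm(entities[head] + relations[rel] - entities[tail])

def propose_axioms(candidate_pairs, rel="hypernym", threshold=-3.0):
    """Inject an axiom hypernym(h, t) whenever the KBC score clears the threshold."""
    axioms = []
    for h, t in candidate_pairs:
        if transe_score(h, rel, t) > threshold:
            axioms.append(f"forall x, {h}(x) -> {t}(x)")   # schematic Coq-like axiom
    return axioms

pairs = [("dog", "animal"), ("car", "vehicle"), ("dog", "vehicle")]
print(propose_axioms(pairs))
```

A trained KBC model would make the scores meaningful; with the random embeddings above the output is only a placeholder.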

Auto-tuning TensorFlow Threading Model for CPU Backend

Title Auto-tuning TensorFlow Threading Model for CPU Backend
Authors Niranjan Hasabnis
Abstract TensorFlow is a popular deep learning framework used by data scientists to solve a wide range of machine learning and deep learning problems, such as image classification and speech recognition. It also operates at a large scale and in heterogeneous environments: it allows users to train neural network models or deploy them for inference using GPUs, CPUs and deep learning specific custom-designed hardware such as TPUs. Even though TensorFlow supports a variety of optimized backends, realizing the best performance from a backend may require additional effort. For instance, getting the best performance from a CPU backend requires careful tuning of its threading model. Unfortunately, the best tuning approach used today is manual, tedious, time-consuming, and, more importantly, may not guarantee the best performance. In this paper, we develop an automatic approach, called TensorTuner, to search for optimal parameter settings of TensorFlow’s threading model for CPU backends. We evaluate TensorTuner on both Eigen and Intel’s MKL CPU backends using a set of neural networks from TensorFlow’s benchmarking suite. Our evaluation results demonstrate that the parameter settings found by TensorTuner produce 2% to 123% performance improvement for the Eigen CPU backend and 1.5% to 28% performance improvement for the MKL CPU backend over the performance obtained using their best-known parameter settings. This highlights the fact that the default parameter settings for the Eigen CPU backend are not ideal, and that even for a carefully hand-tuned MKL backend the settings may be sub-optimal. Our evaluations also revealed that TensorTuner is efficient at finding the optimal settings: it converges quickly by pruning more than 90% of the parameter search space.
Tasks Image Classification, Speech Recognition
Published 2018-12-04
URL http://arxiv.org/abs/1812.01665v1
PDF http://arxiv.org/pdf/1812.01665v1.pdf
PWC https://paperswithcode.com/paper/auto-tuning-tensorflow-threading-model-for
Repo
Framework
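
TensorTuner searches TensorFlow's threading parameters (e.g. intra-op and inter-op thread pools) instead of tuning them by hand. The sketch below shows a generic black-box tuner of that flavor: a coarse grid search over two thread-count knobs against a stand-in benchmark function. The `run_benchmark` stub and the parameter ranges are assumptions; the real tool uses a more efficient search and TensorFlow's actual benchmarks.

```python
import itertools
import time

def run_benchmark(intra_op_threads, inter_op_threads):
    """Stand-in for running a TensorFlow benchmark with the given threading
    settings and reporting throughput (e.g. images/sec). Here we just fake a
    unimodal response surface peaking at (8, 2)."""
    return 100.0 - (intra_op_threads - 8) ** 2 - 5.0 * (inter_op_threads - 2) ** 2

def tune(intra_range, inter_range):
    best_setting, best_score = None, float("-inf")
    for intra, inter in itertools.product(intra_range, inter_range):
        start = time.time()
        score = run_benchmark(intra, inter)
        elapsed = time.time() - start
        if score > best_score:
            best_setting, best_score = (intra, inter), score
        print(f"intra={intra:2d} inter={inter:2d} -> {score:7.2f} ({elapsed:.3f}s)")
    return best_setting, best_score

if __name__ == "__main__":
    setting, score = tune(intra_range=range(2, 17, 2), inter_range=range(1, 5))
    print("best threading setting:", setting, "score:", score)
```

In an actual TensorFlow setup, the stub would instead set the threading knobs (for example via `tf.config.threading.set_intra_op_parallelism_threads` and `set_inter_op_parallelism_threads` in TF 2.x) and time a real model.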

L1-(2D)2PCANet: A Deep Learning Network for Face Recognition

Title L1-(2D)2PCANet: A Deep Learning Network for Face Recognition
Authors YunKun Li, XiaoJun Wu, Josef Kittler
Abstract In this paper, we propose a novel deep learning network, L1-(2D)2PCANet, for face recognition, which is based on L1-norm-based two-directional two-dimensional principal component analysis (L1-(2D)2PCA). In our network, the role of L1-(2D)2PCA is to learn the filters of multiple convolution layers. After the convolution layers, we deploy binary hashing and block-wise histograms for pooling. We test our network on the benchmark face datasets YALE, AR, Extended Yale B, LFW-a and FERET, with CNN, PCANet, 2DPCANet and L1-PCANet as baselines for comparison. The results show that the recognition performance of L1-(2D)2PCANet is better than that of the baseline networks in all tests, especially when there are outliers in the test data. Owing to the L1-norm, L1-(2D)2PCANet is robust to outliers and changes in the training images.
Tasks Face Recognition
Published 2018-05-26
URL http://arxiv.org/abs/1805.10476v1
PDF http://arxiv.org/pdf/1805.10476v1.pdf
PWC https://paperswithcode.com/paper/l1-2d2pcanet-a-deep-learning-network-for-face
Repo
Framework
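
The network learns convolution filters with L1-(2D)2PCA and pools with binary hashing plus block-wise histograms. The sketch below implements a simplified, L2-norm (standard 2DPCA) version of one filter-learning stage and the hashing/histogram pooling in NumPy, as a rough illustration only; the L1-norm optimization, multi-stage stacking and dataset details of the paper are omitted.

```python
import numpy as np

def twod_pca_filters(images, k=2, fh=5, fw=5):
    """Learn k separable filters from row- and column-direction 2DPCA.
    A simplified L2 stand-in for the L1-(2D)2PCA filter learning in the paper."""
    X = images - images.mean(axis=0, keepdims=True)
    G_col = np.mean([x.T @ x for x in X], axis=0)        # W x W scatter
    G_row = np.mean([x @ x.T for x in X], axis=0)        # H x H scatter
    _, V = np.linalg.eigh(G_col)                         # ascending eigenvalues
    _, U = np.linalg.eigh(G_row)
    # Build k filters as outer products of leading eigenvectors, cropped to fh x fw.
    return [np.outer(U[:, -(i + 1)][:fh], V[:, -(i + 1)][:fw]) for i in range(k)]

def conv2d_valid(img, filt):
    """Plain 'valid' filter response (no padding)."""
    fh, fw = filt.shape
    H, W = img.shape
    out = np.empty((H - fh + 1, W - fw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + fh, c:c + fw] * filt)
    return out

def hash_and_histogram(img, filters, block=8):
    """Binary hashing of filter responses followed by block-wise histograms."""
    responses = [conv2d_valid(img, f) > 0 for f in filters]
    code = np.zeros(responses[0].shape, dtype=int)
    for i, b in enumerate(responses):
        code += (1 << i) * b                             # integer code in [0, 2^k)
    n_codes = 1 << len(filters)
    feats = []
    for r in range(0, code.shape[0] - block + 1, block):
        for c in range(0, code.shape[1] - block + 1, block):
            hist, _ = np.histogram(code[r:r + block, c:c + block],
                                   bins=range(n_codes + 1))
            feats.append(hist)
    return np.concatenate(feats)

faces = np.random.rand(10, 32, 32)                       # toy stand-in for face images
filters = twod_pca_filters(faces, k=2)
print(hash_and_histogram(faces[0], filters).shape)
```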

Automated Game Design via Conceptual Expansion

Title Automated Game Design via Conceptual Expansion
Authors Matthew Guzdial, Mark Riedl
Abstract Automated game design has remained a key challenge within the field of Game AI. In this paper, we introduce a method for recombining existing games to create new games through a process called conceptual expansion. Prior automated game design approaches have relied on hand-authored or crowd-sourced knowledge, which limits the scope and applications of such systems. Our approach instead relies on machine learning to learn approximate representations of games, and recombines knowledge from these learned representations to create new games via conceptual expansion. We evaluate this approach by demonstrating the system's ability to recreate existing games. To the best of our knowledge, this represents the first machine learning-based automated game design system.
Tasks
Published 2018-09-06
URL http://arxiv.org/abs/1809.02232v1
PDF http://arxiv.org/pdf/1809.02232v1.pdf
PWC https://paperswithcode.com/paper/automated-game-design-via-conceptual
Repo
Framework
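
Conceptual expansion recombines learned representations of existing games. As a purely illustrative sketch (not the authors' system), the snippet below forms a new "game" vector as a filtered, weighted combination of existing games' learned feature vectors, which is the general shape of a conceptual expansion; the vectors, weights and masks are made up.

```python
import numpy as np

def conceptual_expansion(features, alphas, filters):
    """Combine existing concepts into a new one.

    features: (n, d) learned feature vectors, one per existing game
    alphas:   (n,)   how much each game contributes
    filters:  (n, d) 0/1 masks selecting which dimensions each game contributes
    Returns a (d,) vector representing the expanded (new) concept.
    """
    return np.sum(alphas[:, None] * filters * features, axis=0)

# Toy learned representations of three games (e.g. level/rule embeddings).
games = np.array([[1.0, 0.0, 0.5, 0.2],
                  [0.0, 1.0, 0.1, 0.9],
                  [0.3, 0.3, 1.0, 0.0]])
alphas = np.array([0.6, 0.4, 0.5])
masks = (np.random.default_rng(1).random(games.shape) > 0.3).astype(float)

new_game = conceptual_expansion(games, alphas, masks)
print(new_game)
```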

Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images

Title Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images
Authors Feng Gu, Nikolay Burlutskiy, Mats Andersson, Lena Kajland Wilen
Abstract Digital pathology provides an excellent opportunity for applying fully convolutional networks (FCNs) to tasks, such as semantic segmentation of whole slide images (WSIs). However, standard FCNs face challenges with respect to multi-resolution, inherited from the pyramid arrangement of WSIs. As a result, networks specifically designed to learn and aggregate information at different levels are desired. In this paper, we propose two novel multi-resolution networks based on the popular ‘U-Net’ architecture, which are evaluated on a benchmark dataset for binary semantic segmentation in WSIs. The proposed methods outperform the U-Net, demonstrating superior learning and generalization capabilities.
Tasks Semantic Segmentation
Published 2018-07-25
URL http://arxiv.org/abs/1807.09607v1
PDF http://arxiv.org/pdf/1807.09607v1.pdf
PWC https://paperswithcode.com/paper/multi-resolution-networks-for-semantic
Repo
Framework
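
Multi-resolution networks consume patches from several levels of the WSI pyramid centered on the same tissue location. A minimal NumPy sketch of that patch extraction is below; the in-memory `pyramid` list (each level downsampled 2x) stands in for a real WSI reader such as OpenSlide, and the sizes are illustrative.

```python
import numpy as np

def multires_patches(pyramid, center, patch=128, levels=(0, 1, 2)):
    """Extract co-centered patches from several pyramid levels.

    pyramid: list of (H_l, W_l, 3) arrays, level l downsampled by 2**l
    center:  (row, col) location at level 0
    Returns one patch per requested level, all patch x patch pixels, so coarser
    levels cover a wider field of view around the same tissue location.
    """
    patches = []
    for lvl in levels:
        r = center[0] // (2 ** lvl)
        c = center[1] // (2 ** lvl)
        half = patch // 2
        img = pyramid[lvl]
        r0, c0 = max(r - half, 0), max(c - half, 0)
        patches.append(img[r0:r0 + patch, c0:c0 + patch])
    return patches

# Toy 3-level pyramid standing in for a whole slide image.
level0 = np.random.rand(1024, 1024, 3)
pyramid = [level0, level0[::2, ::2], level0[::4, ::4]]
for p in multires_patches(pyramid, center=(512, 512)):
    print(p.shape)   # each (128, 128, 3), covering 128/256/512 px at level 0
```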

AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction

Title AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction
Authors Francisco J. Pulgar, Francisco Charte, Antonio J. Rivera, María J. del Jesus
Abstract High dimensionality, i.e. data having a large number of variables, tends to be a challenge for most machine learning tasks, including classification. A classifier usually builds a model representing how a set of inputs explains the outputs; the larger the set of inputs and/or outputs, the more complex that model becomes. There is a family of classification algorithms, known as lazy learning methods, which do not build a model. One of the best-known members of this family is the kNN algorithm. Its strategy relies on searching for a set of nearest neighbors, using the input variables as position vectors and computing distances among them. These distances lose significance in high-dimensional spaces, so kNN, like many other classifiers, tends to perform worse as the number of input variables grows. In this work AEkNN, a new kNN-based algorithm with built-in dimensionality reduction, is presented. Aiming to obtain a new representation of the data with lower dimensionality but more informative features, AEkNN internally uses autoencoders. Distances computed from these new feature vectors should be more significant, providing a way to choose better neighbors. An experimental evaluation of the new proposal is conducted, analyzing several configurations and comparing them against the classical kNN algorithm. The conclusions show that AEkNN offers better predictive and runtime performance.
Tasks Dimensionality Reduction
Published 2018-02-23
URL http://arxiv.org/abs/1802.08465v2
PDF http://arxiv.org/pdf/1802.08465v2.pdf
PWC https://paperswithcode.com/paper/aeknn-an-autoencoder-knn-based-classifier
Repo
Framework
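
AEkNN replaces the raw feature space with the hidden layer of an autoencoder before running kNN. The sketch below trains a tiny one-hidden-layer autoencoder with plain NumPy gradient descent and then classifies in the encoded space with scikit-learn's kNN; the architecture, toy data and hyperparameters are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, hidden=8, lr=0.1, epochs=500, seed=0):
    """One-hidden-layer autoencoder trained with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=(hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # encoder
        X_hat = H @ W2 + b2                 # linear decoder
        dX_hat = (X_hat - X) * (2.0 / n)    # d(MSE)/dX_hat
        dW2, db2 = H.T @ dX_hat, dX_hat.sum(axis=0)
        dH = dX_hat @ W2.T
        dZ1 = dH * H * (1.0 - H)
        dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return lambda Z: sigmoid(Z @ W1 + b1)   # the learned encoder

# Toy high-dimensional data: two noisy classes in 50 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 50)), rng.normal(1.5, 1.0, (100, 50))])
y = np.array([0] * 100 + [1] * 100)

encode = train_autoencoder(X)
knn = KNeighborsClassifier(n_neighbors=5).fit(encode(X), y)
print("training accuracy in encoded space:", knn.score(encode(X), y))
```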

Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks

Title Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
Authors Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran
Abstract Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, $l_1$-norm, average percentage of zeros, etc) and retain only the top ranked filters. Once the low scoring filters are pruned away the remainder of the network is fine tuned and is shown to give performance comparable to the original unpruned network. In this work, we report experiments which suggest that the comparable performance of the pruned network is not due to the specific criterion chosen but due to the inherent plasticity of deep neural networks which allows them to recover from the loss of pruned filters once the rest of the filters are fine-tuned. Specifically, we show counter-intuitive results wherein by randomly pruning 25-50% filters from deep CNNs we are able to obtain the same performance as obtained by using state of the art pruning methods. We empirically validate our claims by doing an exhaustive evaluation with VGG-16 and ResNet-50. Further, we also evaluate a real world scenario where a CNN trained on all 1000 ImageNet classes needs to be tested on only a small set of classes at test time (say, only animals). We create a new benchmark dataset from ImageNet to evaluate such class specific pruning and show that even here a random pruning strategy gives close to state of the art performance. Lastly, unlike existing approaches which mainly focus on the task of image classification, in this work we also report results on object detection. We show that using a simple random pruning strategy we can achieve significant speed up in object detection (74% improvement in fps) while retaining the same accuracy as that of the original Faster RCNN model.
Tasks Image Classification, Object Detection
Published 2018-01-31
URL http://arxiv.org/abs/1801.10447v1
PDF http://arxiv.org/pdf/1801.10447v1.pdf
PWC https://paperswithcode.com/paper/recovering-from-random-pruning-on-the
Repo
Framework
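
The paper's point is that randomly pruning filters, followed by fine-tuning, can match criterion-based pruning. The snippet below shows only the random-selection step on a single convolution layer's weights in NumPy; in practice the next layer's input channels and BatchNorm statistics also have to be sliced, and the network must then be fine-tuned, which is omitted here.

```python
import numpy as np

def random_prune_filters(conv_weight, prune_frac=0.5, seed=0):
    """Randomly drop a fraction of output filters from a conv layer.

    conv_weight: (out_channels, in_channels, kH, kW) array
    Returns the pruned weight tensor and the indices of the kept filters,
    which are also needed to slice the following layer's input channels.
    """
    rng = np.random.default_rng(seed)
    out_channels = conv_weight.shape[0]
    n_keep = max(1, int(round(out_channels * (1.0 - prune_frac))))
    keep = np.sort(rng.choice(out_channels, size=n_keep, replace=False))
    return conv_weight[keep], keep

# Toy layer: 64 filters of shape 3x3 over 32 input channels, prune 50% at random.
W = np.random.randn(64, 32, 3, 3)
W_pruned, kept = random_prune_filters(W, prune_frac=0.5)
print(W.shape, "->", W_pruned.shape)   # (64, 32, 3, 3) -> (32, 32, 3, 3)
```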

Evaluation of Machine Learning Frameworks on Finis Terrae II

Title Evaluation of Machine Learning Frameworks on Finis Terrae II
Authors Andres Gomez Tato
Abstract Machine Learning (ML) and Deep Learning (DL) are two technologies used to extract representations of data for a specific purpose. ML algorithms take a set of data as input to generate one or several predictions. To define the final version of a model, there is usually an initial training step devoted to finding the right final values of the model's parameters. There are several techniques, from supervised learning to reinforcement learning, which have different requirements. On the market, there are several frameworks and APIs that reduce the effort of designing a new ML model. In this report, using the DLBENCH benchmark, we analyse the performance and the execution modes of some well-known ML frameworks on the Finis Terrae II supercomputer when supervised learning is used. The report shows that the placement of data and the allocated hardware can have a large influence on the final time-to-solution.
Tasks
Published 2018-01-14
URL http://arxiv.org/abs/1801.04546v1
PDF http://arxiv.org/pdf/1801.04546v1.pdf
PWC https://paperswithcode.com/paper/evaluation-of-machine-learning-fameworks-on
Repo
Framework

Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction

Title Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction
Authors Xiaoxiao Du, Ram Vasudevan, Matthew Johnson-Roberson
Abstract In applications such as autonomous driving, it is important to understand, infer, and anticipate the intention and future behavior of pedestrians. This ability allows vehicles to avoid collisions and improve ride safety and quality. This paper proposes a biomechanically inspired recurrent neural network (Bio-LSTM) that can predict the location and 3D articulated body pose of pedestrians in a global coordinate frame, given 3D poses and locations estimated, with some inaccuracy, in prior frames. The proposed network is able to predict poses and global locations for multiple pedestrians simultaneously, for pedestrians up to 45 meters from the cameras (urban intersection scale). The outputs of the proposed network are full-body 3D meshes represented in Skinned Multi-Person Linear (SMPL) model parameters. The proposed approach relies on a novel objective function that incorporates the periodicity of human walking (gait), the mirror symmetry of the human body, and the change of ground reaction forces in a human gait cycle. This paper presents prediction results on the PedX dataset, a large-scale, in-the-wild dataset collected at real urban intersections with heavy pedestrian traffic. Results show that the proposed network can successfully learn the characteristics of pedestrian gait and produce accurate and consistent 3D pose predictions.
Tasks Autonomous Driving
Published 2018-09-11
URL https://arxiv.org/abs/1809.03705v3
PDF https://arxiv.org/pdf/1809.03705v3.pdf
PWC https://paperswithcode.com/paper/bio-lstm-a-biomechanically-inspired-recurrent
Repo
Framework
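
The paper's objective combines a reconstruction term with gait periodicity and left/right mirror-symmetry terms. The sketch below is a generic composite loss illustrating how such terms can enter an objective; the exact terms, the SMPL parameterization, the ground-reaction-force term and the weights of the paper are not reproduced, and the joint indices are toy assumptions.

```python
import numpy as np

def symmetry_loss(pose, left_idx, right_idx):
    """Penalize asymmetry between mirrored left/right joint pairs.
    pose: (J, 3) joint positions in a body-centered frame (x is the lateral axis)."""
    left, right = pose[left_idx].copy(), pose[right_idx].copy()
    right[:, 0] *= -1.0                      # mirror across the sagittal plane
    return np.mean(np.sum((left - right) ** 2, axis=1))

def periodicity_loss(pose_seq, period):
    """Penalize deviation from the pose one gait period earlier.
    pose_seq: (T, J, 3) sequence of predicted poses."""
    if pose_seq.shape[0] <= period:
        return 0.0
    diff = pose_seq[period:] - pose_seq[:-period]
    return np.mean(np.sum(diff ** 2, axis=2))

def total_loss(pred_seq, gt_seq, left_idx, right_idx, period, w_sym=0.1, w_per=0.1):
    recon = np.mean(np.sum((pred_seq - gt_seq) ** 2, axis=2))
    sym = np.mean([symmetry_loss(p, left_idx, right_idx) for p in pred_seq])
    per = periodicity_loss(pred_seq, period)
    return recon + w_sym * sym + w_per * per

# Toy usage: 30 frames, 12 joints, with joints 0-2 as "left" and 3-5 as "right".
T, J = 30, 12
pred = np.random.randn(T, J, 3) * 0.1
gt = np.zeros((T, J, 3))
print(total_loss(pred, gt, left_idx=[0, 1, 2], right_idx=[3, 4, 5], period=15))
```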

CaricatureShop: Personalized and Photorealistic Caricature Sketching

Title CaricatureShop: Personalized and Photorealistic Caricature Sketching
Authors Xiaoguang Han, Kangcheng Hou, Dong Du, Yuda Qiu, Yizhou Yu, Kun Zhou, Shuguang Cui
Abstract In this paper, we propose the first sketching system for interactive, personalized and photorealistic face caricaturing. Given an image of a human face, users can create caricature photos by manipulating its facial feature curves. Our system first performs exaggeration on the recovered 3D face model according to the edited sketches, which is done by assigning a scaling factor to the Laplacian of each vertex. To construct the mapping between 2D sketches and a vertex-wise scaling field, a novel deep learning architecture is developed. With the obtained 3D caricature model, two images are generated: one by applying 2D warping guided by the underlying 3D mesh deformation, and the other by re-rendering the deformed 3D textured model. These two images are then seamlessly integrated to produce the final output. Because the meshes are severely stretched, the rendered texture has a blurry appearance; a deep learning approach is exploited to infer the missing details and enhance these blurry regions. Moreover, a relighting operation is introduced to further improve the photorealism of the result. Both quantitative and qualitative experimental results validate the efficiency of our sketching system and the superiority of the proposed techniques over existing methods.
Tasks Caricature
Published 2018-07-24
URL http://arxiv.org/abs/1807.09064v1
PDF http://arxiv.org/pdf/1807.09064v1.pdf
PWC https://paperswithcode.com/paper/caricatureshop-personalized-and
Repo
Framework
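
The exaggeration step assigns a scaling factor to the Laplacian (differential) coordinate of each vertex and reconstructs the mesh. The sketch below does this on a toy 2-D polyline "mesh" with a uniform graph Laplacian and a least-squares solve with a few softly anchored vertices; the paper's sketch-to-scaling-field network and 3-D face model are not reproduced.

```python
import numpy as np

def uniform_laplacian(n_vertices, edges):
    """L = I - D^{-1} A for an undirected graph (uniform weights)."""
    A = np.zeros((n_vertices, n_vertices))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    D_inv = np.diag(1.0 / A.sum(axis=1))
    return np.eye(n_vertices) - D_inv @ A

def exaggerate(V, edges, scale, anchors, anchor_weight=10.0):
    """Scale each vertex's Laplacian coordinate and solve for new positions.

    V:       (n, dim) vertex positions
    scale:   (n,) per-vertex scaling factors for the differential coordinates
    anchors: indices of vertices softly pinned to their original positions
    """
    n = V.shape[0]
    L = uniform_laplacian(n, edges)
    delta = L @ V                            # differential (Laplacian) coordinates
    target = scale[:, None] * delta          # exaggerated differential coordinates
    # Soft positional constraints keep the system well-posed (L alone is singular).
    C = np.zeros((len(anchors), n))
    for row, idx in enumerate(anchors):
        C[row, idx] = anchor_weight
    A_ls = np.vstack([L, C])
    b_ls = np.vstack([target, anchor_weight * V[anchors]])
    V_new, *_ = np.linalg.lstsq(A_ls, b_ls, rcond=None)
    return V_new

# Toy "profile curve": a half circle of 2-D vertices connected as a chain.
t = np.linspace(0, np.pi, 20)
V = np.stack([np.cos(t), np.sin(t)], axis=1)
edges = [(i, i + 1) for i in range(len(V) - 1)]
scale = np.full(len(V), 2.0)                 # exaggerate curvature everywhere
V_exaggerated = exaggerate(V, edges, scale, anchors=[0, len(V) - 1])
print(V_exaggerated.shape)
```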

Randomized Iterative Algorithms for Fisher Discriminant Analysis

Title Randomized Iterative Algorithms for Fisher Discriminant Analysis
Authors Agniva Chowdhury, Jiasen Yang, Petros Drineas
Abstract Fisher discriminant analysis (FDA) is a widely used method for classification and dimensionality reduction. When the number of predictor variables greatly exceeds the number of observations, one of the alternatives for conventional FDA is regularized Fisher discriminant analysis (RFDA). In this paper, we present a simple, iterative, sketching-based algorithm for RFDA that comes with provable accuracy guarantees when compared to the conventional approach. Our analysis builds upon two simple structural results that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized linear algebra. We analyze the behavior of RFDA when the ridge leverage and the standard leverage scores are used to select predictor variables and we prove that accurate approximations can be achieved by a sample whose size depends on the effective degrees of freedom of the RFDA problem. Our results yield significant improvements over existing approaches and our empirical evaluations support our theoretical analyses.
Tasks Dimensionality Reduction
Published 2018-09-09
URL http://arxiv.org/abs/1809.03045v2
PDF http://arxiv.org/pdf/1809.03045v2.pdf
PWC https://paperswithcode.com/paper/randomized-iterative-algorithms-for-fisher
Repo
Framework
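
The algorithm builds on randomized matrix multiplication: approximate A·B by sampling a few columns of A (and the matching rows of B) with probabilities proportional to their norms, then rescaling. Here is a minimal NumPy sketch of that primitive; the full sketched, iterative RFDA solver of the paper is not reproduced, and the matrix sizes are arbitrary.

```python
import numpy as np

def randomized_matmul(A, B, s, seed=0):
    """Approximate A @ B by sampling s column/row pairs.

    Columns of A (and the matching rows of B) are sampled with probability
    proportional to ||A[:, k]|| * ||B[k, :]|| and rescaled so the estimate
    is unbiased: E[A_s @ B_s] = A @ B.
    """
    rng = np.random.default_rng(seed)
    probs = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = probs / probs.sum()
    idx = rng.choice(A.shape[1], size=s, replace=True, p=probs)
    scale = 1.0 / np.sqrt(s * probs[idx])
    A_s = A[:, idx] * scale            # broadcast over rows
    B_s = B[idx, :] * scale[:, None]
    return A_s @ B_s

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 2000))
B = rng.normal(size=(2000, 50))
exact = A @ B
approx = randomized_matmul(A, B, s=400)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative Frobenius error with 400 of 2000 samples: {rel_err:.3f}")
```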

Wearable Affective Robot

Title Wearable Affective Robot
Authors Min Chen, Jun Zhou, Guangming Tao, Jun Yang, Long Hu
Abstract With the development of artificial intelligence (AI), AI applications have greatly influenced and changed people’s daily lives. Here, a wearable affective robot that integrates the affective robot, the social robot, the brain wearable, and Wearable 2.0 is proposed for the first time. The proposed wearable affective robot is intended for a wide population, and we believe that it can improve human health on the spirit level while meeting fashion requirements at the same time. In this paper, the architecture and design of an innovative wearable affective robot, dubbed Fitbot, are introduced from the hardware and algorithmic perspectives. In addition, the key functional component of the robot, the brain wearable device, is introduced in terms of hardware design, EEG data acquisition and analysis, user behavior perception, and algorithm deployment. On this basis, EEG-based cognition of the user’s behavior is realized. Through the continuous acquisition of in-depth and in-breadth data, Fitbot can gradually enrich the user’s life model, enabling the wearable robot to recognize the user’s intention and further understand the behavioral motivation behind the user’s emotion. The learning algorithm for life modeling embedded in Fitbot achieves a better user experience of affective social interaction. Finally, the application service scenarios and some challenging issues of a wearable affective robot are discussed.
Tasks EEG
Published 2018-10-25
URL http://arxiv.org/abs/1810.10743v1
PDF http://arxiv.org/pdf/1810.10743v1.pdf
PWC https://paperswithcode.com/paper/wearable-affective-robot
Repo
Framework

Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors

Title Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors
Authors Kévin Degraux, Valerio Cambareri, Bert Geelen, Laurent Jacques, Gauthier Lafruit
Abstract This paper introduces two acquisition device architectures for multispectral compressive imaging. Unlike most existing methods, the proposed computational imaging techniques do not include any dispersive element, as they use a dedicated sensor which integrates narrowband Fabry-Pérot spectral filters at the pixel level. The first scheme leverages joint inpainting and super-resolution to fill in those voxels that are missing due to the device’s limited pixel count. The second scheme, in link with compressed sensing, introduces spatial random convolutions, but is more complex and may be affected by diffraction. In both cases we solve the associated inverse problems by using the same signal prior. Specifically, we propose a redundant analysis signal prior in a convex formulation. Through numerical simulations, we explore different realistic setups. Our objective is also to highlight some practical guidelines and discuss their complexity trade-offs to integrate these schemes into actual computational imaging systems. Our conclusion is that the second technique performs best at high compression levels, in a properly sized and calibrated setup. Otherwise, the first, simpler technique should be favored.
Tasks Super-Resolution
Published 2018-02-06
URL http://arxiv.org/abs/1802.02040v1
PDF http://arxiv.org/pdf/1802.02040v1.pdf
PWC https://paperswithcode.com/paper/multispectral-compressive-imaging-strategies
Repo
Framework
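
In the first architecture, each pixel sees the scene through one narrowband Fabry-Pérot filter, so the sensor records a spectral mosaic and reconstruction amounts to joint inpainting/super-resolution of the missing voxels. The snippet below sketches only that forward model (mosaic sampling of a hyperspectral cube with a periodic filter pattern); the pattern, cube and sizes are illustrative assumptions, not the paper's hardware.

```python
import numpy as np

def mosaic_sample(cube, pattern):
    """Per-pixel spectral sampling with a tiled Fabry-Perot filter pattern.

    cube:    (H, W, L) hyperspectral scene with L spectral bands
    pattern: (p, p) integer array; pattern[i, j] is the band index seen by
             sensor pixel (i, j) within each p x p tile
    Returns the (H, W) mosaic measurement y[i, j] = cube[i, j, band(i, j)].
    """
    H, W, _ = cube.shape
    p = pattern.shape[0]
    rows, cols = np.indices((H, W))
    band = pattern[rows % p, cols % p]
    return cube[rows, cols, band], band

# Toy 16-band cube and a 4x4 filter layout covering all 16 bands once per tile.
cube = np.random.rand(64, 64, 16)
pattern = np.arange(16).reshape(4, 4)
y, band_map = mosaic_sample(cube, pattern)
print(y.shape, band_map.min(), band_map.max())   # (64, 64) 0 15
```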