October 19, 2019

Paper Group ANR 347


Title	Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation
Authors	Peter Karkus, David Hsu, Wee Sun Lee
Abstract	We propose to take a novel approach to robot system design where each building block of a larger system is represented as a differentiable program, i.e. a deep neural network. This representation allows for integrating algorithmic planning and deep learning in a principled manner, and thus combine the benefits of model-free and model-based methods. We apply the proposed approach to a challenging partially observable robot navigation task. The robot must navigate to a goal in a previously unseen 3-D environment without knowing its initial location, and instead relying on a 2-D floor map and visual observations from an onboard camera. We introduce the Navigation Networks (NavNets) that encode state estimation, planning and acting in a single, end-to-end trainable recurrent neural network. In preliminary simulation experiments we successfully trained navigation networks to solve the challenging partially observable navigation task.
Tasks	Robot Navigation
Published	2018-07-17
URL	http://arxiv.org/abs/1807.06696v1
PDF	http://arxiv.org/pdf/1807.06696v1.pdf
PWC	https://paperswithcode.com/paper/integrating-algorithmic-planning-and-deep
Repo
Framework

FaceShop: Deep Sketch-based Face Image Editing


Title	FaceShop: Deep Sketch-based Face Image Editing
Authors	Tiziano Portenier, Qiyang Hu, Attila Szabó, Siavash Arjomand Bigdeli, Paolo Favaro, Matthias Zwicker
Abstract	We present a novel system for sketch-based face image editing, enabling users to edit images intuitively by sketching a few strokes on a region of interest. Our interface features tools to express a desired image manipulation by providing both geometry and color constraints as user-drawn strokes. As an alternative to the direct user input, our proposed system naturally supports a copy-paste mode, which allows users to edit a given image region by using parts of another exemplar image without the need of hand-drawn sketching at all. The proposed interface runs in real-time and facilitates an interactive and iterative workflow to quickly express the intended edits. Our system is based on a novel sketch domain and a convolutional neural network trained end-to-end to automatically learn to render image regions corresponding to the input strokes. To achieve high quality and semantically consistent results we train our neural network on two simultaneous tasks, namely image completion and image translation. To the best of our knowledge, we are the first to combine these two tasks in a unified framework for interactive image editing. Our results show that the proposed sketch domain, network architecture, and training procedure generalize well to real user input and enable high quality synthesis results without additional post-processing.
Tasks
Published	2018-04-24
URL	http://arxiv.org/abs/1804.08972v2
PDF	http://arxiv.org/pdf/1804.08972v2.pdf
PWC	https://paperswithcode.com/paper/faceshop-deep-sketch-based-face-image-editing
Repo
Framework

Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference


Title	Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference
Authors	Masashi Yoshikawa, Koji Mineshima, Hiroshi Noji, Daisuke Bekki
Abstract	In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data. However, there is a tradeoff between adding more knowledge data for improved RTE performance and maintaining an efficient RTE system, as such a big database is problematic in terms of the memory usage and computational complexity. In this work, we show the processing time of a state-of-the-art logic-based RTE system can be significantly reduced by replacing its search-based axiom injection (abduction) mechanism by that based on Knowledge Base Completion (KBC). We integrate this mechanism in a Coq plugin that provides a proof automation tactic for natural language inference. Additionally, we show empirically that adding new knowledge data contributes to better RTE performance while not harming the processing speed in this framework.
Tasks	Knowledge Base Completion, Natural Language Inference
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06203v1
PDF	http://arxiv.org/pdf/1811.06203v1.pdf
PWC	https://paperswithcode.com/paper/combining-axiom-injection-and-knowledge-base
Repo
Framework

Auto-tuning TensorFlow Threading Model for CPU Backend


Title	Auto-tuning TensorFlow Threading Model for CPU Backend
Authors	Niranjan Hasabnis
Abstract	TensorFlow is a popular deep learning framework used by data scientists to solve a wide-range of machine learning and deep learning problems such as image classification and speech recognition. It also operates at a large scale and in heterogeneous environments — it allows users to train neural network models or deploy them for inference using GPUs, CPUs and deep learning specific custom-designed hardware such as TPUs. Even though TensorFlow supports a variety of optimized backends, realizing the best performance using a backend may require additional efforts. For instance, getting the best performance from a CPU backend requires careful tuning of its threading model. Unfortunately, the best tuning approach used today is manual, tedious, time-consuming, and, more importantly, may not guarantee the best performance. In this paper, we develop an automatic approach, called TensorTuner, to search for optimal parameter settings of TensorFlow’s threading model for CPU backends. We evaluate TensorTuner on both Eigen and Intel’s MKL CPU backends using a set of neural networks from TensorFlow’s benchmarking suite. Our evaluation results demonstrate that the parameter settings found by TensorTuner produce 2% to 123% performance improvement for the Eigen CPU backend and 1.5% to 28% performance improvement for the MKL CPU backend over the performance obtained using their best-known parameter settings. This highlights the fact that the default parameter settings in Eigen CPU backend are not the ideal settings; and even for a carefully hand-tuned MKL backend, the settings may be sub-optimal. Our evaluations also revealed that TensorTuner is efficient at finding the optimal settings — it is able to converge to the optimal settings quickly by pruning more than 90% of the parameter search space.
Tasks	Image Classification, Speech Recognition
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01665v1
PDF	http://arxiv.org/pdf/1812.01665v1.pdf
PWC	https://paperswithcode.com/paper/auto-tuning-tensorflow-threading-model-for
Repo
Framework

L1-(2D)2PCANet: A Deep Learning Network for Face Recognition


Title	L1-(2D)2PCANet: A Deep Learning Network for Face Recognition
Authors	YunKun Li, XiaoJun Wu, Josef Kittler
Abstract	In this paper, we propose a novel deep learning network L1-(2D)2PCANet for face recognition, which is based on L1-norm-based two-directional two-dimensional principal component analysis (L1-(2D)2PCA). In our network, the role of L1-(2D)2PCA is to learn the filters of multiple convolution layers. After the convolution layers, we deploy binary hashing and block-wise histogram for pooling. We test our network on some benchmark facial datasets YALE, AR, Extended Yale B, LFW-a and FERET with CNN, PCANet, 2DPCANet and L1-PCANet as comparison. The results show that the recognition performance of L1-(2D)2PCANet in all tests is better than baseline networks, especially when there are outliers in the test data. Owing to the L1-norm, L1-2D2PCANet is robust to outliers and changes of the training images.
Tasks	Face Recognition
Published	2018-05-26
URL	http://arxiv.org/abs/1805.10476v1
PDF	http://arxiv.org/pdf/1805.10476v1.pdf
PWC	https://paperswithcode.com/paper/l1-2d2pcanet-a-deep-learning-network-for-face
Repo
Framework

Automated Game Design via Conceptual Expansion


Title	Automated Game Design via Conceptual Expansion
Authors	Matthew Guzdial, Mark Riedl
Abstract	Automated game design has remained a key challenge within the field of Game AI. In this paper, we introduce a method for recombining existing games to create new games through a process called conceptual expansion. Prior automated game design approaches have relied on hand-authored or crowd-sourced knowledge, which limits the scope and applications of such systems. Our approach instead relies on machine learning to learn approximate representations of games. Our approach recombines knowledge from these learned representations to create new games via conceptual expansion. We evaluate this approach by demonstrating the ability for the system to recreate existing games. To the best of our knowledge, this represents the first machine learning-based automated game design system.
Tasks
Published	2018-09-06
URL	http://arxiv.org/abs/1809.02232v1
PDF	http://arxiv.org/pdf/1809.02232v1.pdf
PWC	https://paperswithcode.com/paper/automated-game-design-via-conceptual
Repo
Framework

Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images


Title	Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images
Authors	Feng Gu, Nikolay Burlutskiy, Mats Andersson, Lena Kajland Wilen
Abstract	Digital pathology provides an excellent opportunity for applying fully convolutional networks (FCNs) to tasks, such as semantic segmentation of whole slide images (WSIs). However, standard FCNs face challenges with respect to multi-resolution, inherited from the pyramid arrangement of WSIs. As a result, networks specifically designed to learn and aggregate information at different levels are desired. In this paper, we propose two novel multi-resolution networks based on the popular ‘U-Net’ architecture, which are evaluated on a benchmark dataset for binary semantic segmentation in WSIs. The proposed methods outperform the U-Net, demonstrating superior learning and generalization capabilities.
Tasks	Semantic Segmentation
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09607v1
PDF	http://arxiv.org/pdf/1807.09607v1.pdf
PWC	https://paperswithcode.com/paper/multi-resolution-networks-for-semantic
Repo
Framework

AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction


Title	AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction
Authors	Francisco J. Pulgar, Francisco Charte, Antonio J. Rivera, María J. del Jesus
Abstract	High dimensionality, i.e. data having a large number of variables, tends to be a challenge for most machine learning tasks, including classification. A classifier usually builds a model representing how a set of inputs explains the outputs. The larger the set of inputs and/or outputs, the more complex that model becomes. There is a family of classification algorithms, known as lazy learning methods, which does not build a model. One of the best known members of this family is the kNN algorithm. Its strategy relies on searching a set of nearest neighbors, using the input variables as position vectors and computing distances among them. These distances lose significance in high-dimensional spaces. Therefore kNN, like many other classifiers, tends to perform worse as the number of input variables grows. In this work AEkNN, a new kNN-based algorithm with built-in dimensionality reduction, is presented. Aiming to obtain a new representation of the data, having a lower dimensionality but with more informational features, AEkNN internally uses autoencoders. From these new feature vectors the computed distances should be more significant, thus providing a way to choose better neighbors. An experimental evaluation of the new proposal is conducted, analyzing several configurations and comparing them against the classical kNN algorithm. The obtained conclusions demonstrate that AEkNN offers better results in predictive and runtime performance.
Tasks	Dimensionality Reduction
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08465v2
PDF	http://arxiv.org/pdf/1802.08465v2.pdf
PWC	https://paperswithcode.com/paper/aeknn-an-autoencoder-knn-based-classifier
Repo
Framework
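
The idea described in the abstract above, computing kNN distances in a reduced representation instead of on the raw inputs, can be illustrated with a minimal sketch. This is not the authors' implementation: `knn_predict`, its `encode` parameter, the toy data, and `drop_noise` are all hypothetical names introduced here, and `drop_noise` is a hand-written projection standing in for a trained autoencoder's encoder.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3, encode=lambda v: v):
    """Classify `query` by majority vote among its k nearest training points.

    `encode` maps a raw feature vector to the representation in which
    distances are computed; the identity default gives classical kNN.
    """
    q = encode(query)
    # Sort training points by Euclidean distance to the query.
    dists = sorted(
        (math.dist(encode(x), q), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumption): the third feature is large-scale noise that is
# unrelated to the class, so it dominates raw Euclidean distances.
train_x = [(0, 0, -9), (0, 1, -9), (5, 5, 3), (5, 6, 3)]
train_y = ["a", "a", "b", "b"]
query = (0.0, 0.5, 3.0)


def drop_noise(v):
    """Hypothetical stand-in for a learned encoder: keep the first two features."""
    return v[:2]


raw_label = knn_predict(train_x, train_y, query)
encoded_label = knn_predict(train_x, train_y, query, encode=drop_noise)
```

On this toy data the raw-feature vote is pulled toward class "b" by the noisy third feature, while the vote in the reduced representation picks class "a". In a setup like the one the abstract describes, `encode` would be the encoder half of an autoencoder trained on the training set.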

Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks


Title	Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
Authors	Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran
Abstract	Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, $l_1$-norm, average percentage of zeros, etc.) and retain only the top ranked filters. Once the low scoring filters are pruned away the remainder of the network is fine tuned and is shown to give performance comparable to the original unpruned network. In this work, we report experiments which suggest that the comparable performance of the pruned network is not due to the specific criterion chosen but due to the inherent plasticity of deep neural networks which allows them to recover from the loss of pruned filters once the rest of the filters are fine-tuned. Specifically, we show counter-intuitive results wherein by randomly pruning 25-50% filters from deep CNNs we are able to obtain the same performance as obtained by using state of the art pruning methods. We empirically validate our claims by doing an exhaustive evaluation with VGG-16 and ResNet-50. Further, we also evaluate a real world scenario where a CNN trained on all 1000 ImageNet classes needs to be tested on only a small set of classes at test time (say, only animals). We create a new benchmark dataset from ImageNet to evaluate such class specific pruning and show that even here a random pruning strategy gives close to state of the art performance. Lastly, unlike existing approaches which mainly focus on the task of image classification, in this work we also report results on object detection. We show that using a simple random pruning strategy we can achieve significant speed up in object detection (74% improvement in fps) while retaining the same accuracy as that of the original Faster RCNN model.
Tasks	Image Classification, Object Detection
Published	2018-01-31
URL	http://arxiv.org/abs/1801.10447v1
PDF	http://arxiv.org/pdf/1801.10447v1.pdf
PWC	https://paperswithcode.com/paper/recovering-from-random-pruning-on-the
Repo
Framework
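
The two selection strategies contrasted in the abstract above, ranking filters by a criterion such as the $l_1$-norm versus choosing them at random, can be sketched as follows. This is an illustrative sketch only, not the paper's code: the function names, the `keep_ratio` parameter, and the representation of each filter as a flat list of weights are assumptions made here, and the fine-tuning step that follows pruning is omitted.

```python
import random


def l1_norm(filt):
    """Sum of absolute weights of one filter (given as a flat list)."""
    return sum(abs(w) for w in filt)


def n_to_keep(n_filters, keep_ratio):
    """Number of filters retained for a given ratio (at least one)."""
    return max(1, round(n_filters * keep_ratio))


def prune_by_l1(filters, keep_ratio):
    """Criterion-based pruning: keep the filters with the largest L1 norm."""
    ranked = sorted(
        range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True
    )
    return sorted(ranked[: n_to_keep(len(filters), keep_ratio)])


def prune_randomly(filters, keep_ratio, seed=0):
    """Random pruning: keep a uniformly random subset of the same size."""
    rng = random.Random(seed)
    n_keep = n_to_keep(len(filters), keep_ratio)
    return sorted(rng.sample(range(len(filters)), n_keep))


# Four toy filters; both functions return the indices of the filters to keep.
filters = [[0.1, -0.1], [2.0, 1.0], [0.5, 0.5], [-3.0, 0.0]]
kept_l1 = prune_by_l1(filters, keep_ratio=0.5)
kept_random = prune_randomly(filters, keep_ratio=0.5)
```

Both functions keep the same number of filters, so the two strategies can be compared at an equal pruning ratio, which is the kind of comparison the abstract reports after fine-tuning.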

Evaluation of Machine Learning Fameworks on Finis Terrae II


Title	Evaluation of Machine Learning Fameworks on Finis Terrae II
Authors	Andres Gomez Tato
Abstract	Machine Learning (ML) and Deep Learning (DL) are two technologies used to extract representations of the data for a specific purpose. ML algorithms take a set of data as input to generate one or several predictions. To define the final version of one model, usually there is an initial step devoted to training the algorithm (getting the right final values of the parameters of the model). There are several techniques, from supervised learning to reinforcement learning, which have different requirements. On the market, there are some frameworks or APIs that reduce the effort for designing a new ML model. In this report, using the benchmark DLBENCH, we will analyse the performance and the execution modes of some well-known ML frameworks on the Finis Terrae II supercomputer when supervised learning is used. The report will show that placement of data and allocated hardware can have a large influence on the final time-to-solution.
Tasks
Published	2018-01-14
URL	http://arxiv.org/abs/1801.04546v1
PDF	http://arxiv.org/pdf/1801.04546v1.pdf
PWC	https://paperswithcode.com/paper/evaluation-of-machine-learning-fameworks-on
Repo
Framework

Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction


Title	Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction
Authors	Xiaoxiao Du, Ram Vasudevan, Matthew Johnson-Roberson
Abstract	In applications such as autonomous driving, it is important to understand, infer, and anticipate the intention and future behavior of pedestrians. This ability allows vehicles to avoid collisions and improve ride safety and quality. This paper proposes a biomechanically inspired recurrent neural network (Bio-LSTM) that can predict the location and 3D articulated body pose of pedestrians in a global coordinate frame, given 3D poses and locations estimated in prior frames with inaccuracy. The proposed network is able to predict poses and global locations for multiple pedestrians simultaneously, for pedestrians up to 45 meters from the cameras (urban intersection scale). The outputs of the proposed network are full-body 3D meshes represented in Skinned Multi-Person Linear (SMPL) model parameters. The proposed approach relies on a novel objective function that incorporates the periodicity of human walking (gait), the mirror symmetry of the human body, and the change of ground reaction forces in a human gait cycle. This paper presents prediction results on the PedX dataset, a large-scale, in-the-wild data set collected at real urban intersections with heavy pedestrian traffic. Results show that the proposed network can successfully learn the characteristics of pedestrian gait and produce accurate and consistent 3D pose predictions.
Tasks	Autonomous Driving
Published	2018-09-11
URL	https://arxiv.org/abs/1809.03705v3
PDF	https://arxiv.org/pdf/1809.03705v3.pdf
PWC	https://paperswithcode.com/paper/bio-lstm-a-biomechanically-inspired-recurrent
Repo
Framework

CaricatureShop: Personalized and Photorealistic Caricature Sketching


Title	CaricatureShop: Personalized and Photorealistic Caricature Sketching
Authors	Xiaoguang Han, Kangcheng Hou, Dong Du, Yuda Qiu, Yizhou Yu, Kun Zhou, Shuguang Cui
Abstract	In this paper, we propose the first sketching system for interactively personalized and photorealistic face caricaturing. Given an image of a human face as input, users can create caricature photos by manipulating its facial feature curves. Our system first performs exaggeration on the recovered 3D face model according to the edited sketches, which is conducted by assigning the Laplacian of each vertex a scaling factor. To construct the mapping between 2D sketches and a vertex-wise scaling field, a novel deep learning architecture is developed. With the obtained 3D caricature model, two images are generated, one obtained by applying 2D warping guided by the underlying 3D mesh deformation and the other obtained by re-rendering the deformed 3D textured model. These two images are then seamlessly integrated to produce our final output. Because the meshes are severely stretched, the rendered texture appears blurry. A deep learning approach is exploited to infer the missing details for enhancing these blurry regions. Moreover, a relighting operation is invented to further improve the photorealism of the result. Both quantitative and qualitative experiment results validated the efficiency of our sketching system and the superiority of our proposed techniques against existing methods.
Tasks	Caricature
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09064v1
PDF	http://arxiv.org/pdf/1807.09064v1.pdf
PWC	https://paperswithcode.com/paper/caricatureshop-personalized-and
Repo
Framework

Randomized Iterative Algorithms for Fisher Discriminant Analysis


Title	Randomized Iterative Algorithms for Fisher Discriminant Analysis
Authors	Agniva Chowdhury, Jiasen Yang, Petros Drineas
Abstract	Fisher discriminant analysis (FDA) is a widely used method for classification and dimensionality reduction. When the number of predictor variables greatly exceeds the number of observations, one of the alternatives for conventional FDA is regularized Fisher discriminant analysis (RFDA). In this paper, we present a simple, iterative, sketching-based algorithm for RFDA that comes with provable accuracy guarantees when compared to the conventional approach. Our analysis builds upon two simple structural results that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized linear algebra. We analyze the behavior of RFDA when the ridge leverage and the standard leverage scores are used to select predictor variables and we prove that accurate approximations can be achieved by a sample whose size depends on the effective degrees of freedom of the RFDA problem. Our results yield significant improvements over existing approaches and our empirical evaluations support our theoretical analyses.
Tasks	Dimensionality Reduction
Published	2018-09-09
URL	http://arxiv.org/abs/1809.03045v2
PDF	http://arxiv.org/pdf/1809.03045v2.pdf
PWC	https://paperswithcode.com/paper/randomized-iterative-algorithms-for-fisher
Repo
Framework

Wearable Affective Robot


Title	Wearable Affective Robot
Authors	Min Chen, Jun Zhou, Guangming Tao, Jun Yang, Long Hu
Abstract	With the development of artificial intelligence (AI), AI applications have greatly influenced and changed people’s daily lives. Here, a wearable affective robot that integrates the affective robot, social robot, brain wearable, and wearable 2.0 is proposed for the first time. The proposed wearable affective robot is intended for a wide population, and we believe that it can improve human health on the spirit level while meeting fashion requirements. In this paper, the architecture and design of an innovative wearable affective robot, dubbed Fitbot, are introduced from the hardware and algorithm perspectives. In addition, the important functional component of the robot, the brain wearable device, is introduced from the aspects of hardware design, EEG data acquisition and analysis, user behavior perception, and algorithm deployment. Then, EEG-based cognition of the user’s behavior is realized. Through the continuous acquisition of in-depth, in-breadth data, the Fitbot we present can gradually enrich the user’s life modeling and enable the wearable robot to recognize the user’s intention and further understand the behavioral motivation behind the user’s emotion. The learning algorithm for life modeling embedded in Fitbot can achieve a better user experience of affective social interaction. Finally, the application service scenarios and some challenging issues of a wearable affective robot are discussed.
Tasks	EEG
Published	2018-10-25
URL	http://arxiv.org/abs/1810.10743v1
PDF	http://arxiv.org/pdf/1810.10743v1.pdf
PWC	https://paperswithcode.com/paper/wearable-affective-robot
Repo
Framework

Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors


Title	Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors
Authors	Kévin Degraux, Valerio Cambareri, Bert Geelen, Laurent Jacques, Gauthier Lafruit
Abstract	This paper introduces two acquisition device architectures for multispectral compressive imaging. Unlike most existing methods, the proposed computational imaging techniques do not include any dispersive element, as they use a dedicated sensor which integrates narrowband Fabry-Pérot spectral filters at the pixel level. The first scheme leverages joint inpainting and super-resolution to fill in those voxels that are missing due to the device’s limited pixel count. The second scheme, in link with compressed sensing, introduces spatial random convolutions, but is more complex and may be affected by diffraction. In both cases we solve the associated inverse problems by using the same signal prior. Specifically, we propose a redundant analysis signal prior in a convex formulation. Through numerical simulations, we explore different realistic setups. Our objective is also to highlight some practical guidelines and discuss their complexity trade-offs to integrate these schemes into actual computational imaging systems. Our conclusion is that the second technique performs best at high compression levels, in a properly sized and calibrated setup. Otherwise, the first, simpler technique should be favored.
Tasks	Super-Resolution
Published	2018-02-06
URL	http://arxiv.org/abs/1802.02040v1
PDF	http://arxiv.org/pdf/1802.02040v1.pdf
PWC	https://paperswithcode.com/paper/multispectral-compressive-imaging-strategies
Repo
Framework