Paper Group ANR 347
Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation. FaceShop: Deep Sketch-based Face Image Editing. Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference. Auto-tuning TensorFlow Threading Model for CPU Backend. L1-(2D)2PCANet: A Deep Learning Network for Face Recognition …
Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation
Title | Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation |
Authors | Peter Karkus, David Hsu, Wee Sun Lee |
Abstract | We propose a novel approach to robot system design in which each building block of a larger system is represented as a differentiable program, i.e. a deep neural network. This representation allows for integrating algorithmic planning and deep learning in a principled manner, and thus combines the benefits of model-free and model-based methods. We apply the proposed approach to a challenging partially observable robot navigation task. The robot must navigate to a goal in a previously unseen 3-D environment without knowing its initial location, relying instead on a 2-D floor map and visual observations from an onboard camera. We introduce Navigation Networks (NavNets), which encode state estimation, planning and acting in a single, end-to-end trainable recurrent neural network. In preliminary simulation experiments we successfully trained navigation networks to solve the challenging partially observable navigation task. |
Tasks | Robot Navigation |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06696v1 |
http://arxiv.org/pdf/1807.06696v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-algorithmic-planning-and-deep |
Repo | |
Framework | |
FaceShop: Deep Sketch-based Face Image Editing
Title | FaceShop: Deep Sketch-based Face Image Editing |
Authors | Tiziano Portenier, Qiyang Hu, Attila Szabó, Siavash Arjomand Bigdeli, Paolo Favaro, Matthias Zwicker |
Abstract | We present a novel system for sketch-based face image editing, enabling users to edit images intuitively by sketching a few strokes on a region of interest. Our interface features tools to express a desired image manipulation by providing both geometry and color constraints as user-drawn strokes. As an alternative to the direct user input, our proposed system naturally supports a copy-paste mode, which allows users to edit a given image region by using parts of another exemplar image without the need of hand-drawn sketching at all. The proposed interface runs in real-time and facilitates an interactive and iterative workflow to quickly express the intended edits. Our system is based on a novel sketch domain and a convolutional neural network trained end-to-end to automatically learn to render image regions corresponding to the input strokes. To achieve high quality and semantically consistent results we train our neural network on two simultaneous tasks, namely image completion and image translation. To the best of our knowledge, we are the first to combine these two tasks in a unified framework for interactive image editing. Our results show that the proposed sketch domain, network architecture, and training procedure generalize well to real user input and enable high quality synthesis results without additional post-processing. |
Tasks | |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.08972v2 |
http://arxiv.org/pdf/1804.08972v2.pdf | |
PWC | https://paperswithcode.com/paper/faceshop-deep-sketch-based-face-image-editing |
Repo | |
Framework | |
Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference
Title | Combining Axiom Injection and Knowledge Base Completion for Efficient Natural Language Inference |
Authors | Masashi Yoshikawa, Koji Mineshima, Hiroshi Noji, Daisuke Bekki |
Abstract | In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data. However, there is a tradeoff between adding more knowledge data for improved RTE performance and maintaining an efficient RTE system, as such a big database is problematic in terms of memory usage and computational complexity. In this work, we show that the processing time of a state-of-the-art logic-based RTE system can be significantly reduced by replacing its search-based axiom injection (abduction) mechanism with one based on Knowledge Base Completion (KBC). We integrate this mechanism in a Coq plugin that provides a proof automation tactic for natural language inference. Additionally, we show empirically that adding new knowledge data contributes to better RTE performance while not harming the processing speed in this framework. |
Tasks | Knowledge Base Completion, Natural Language Inference |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06203v1 |
http://arxiv.org/pdf/1811.06203v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-axiom-injection-and-knowledge-base |
Repo | |
Framework | |
Auto-tuning TensorFlow Threading Model for CPU Backend
Title | Auto-tuning TensorFlow Threading Model for CPU Backend |
Authors | Niranjan Hasabnis |
Abstract | TensorFlow is a popular deep learning framework used by data scientists to solve a wide range of machine learning and deep learning problems such as image classification and speech recognition. It also operates at a large scale and in heterogeneous environments: it allows users to train neural network models or deploy them for inference using GPUs, CPUs and deep learning specific custom-designed hardware such as TPUs. Even though TensorFlow supports a variety of optimized backends, realizing the best performance using a backend may require additional effort. For instance, getting the best performance from a CPU backend requires careful tuning of its threading model. Unfortunately, the best tuning approach used today is manual, tedious, time-consuming, and, more importantly, may not guarantee the best performance. In this paper, we develop an automatic approach, called TensorTuner, to search for optimal parameter settings of TensorFlow’s threading model for CPU backends. We evaluate TensorTuner on both Eigen and Intel’s MKL CPU backends using a set of neural networks from TensorFlow’s benchmarking suite. Our evaluation results demonstrate that the parameter settings found by TensorTuner produce 2% to 123% performance improvement for the Eigen CPU backend and 1.5% to 28% performance improvement for the MKL CPU backend over the performance obtained using their best-known parameter settings. This highlights the fact that the default parameter settings in the Eigen CPU backend are not the ideal settings, and even for a carefully hand-tuned MKL backend, the settings may be sub-optimal. Our evaluations also revealed that TensorTuner is efficient at finding the optimal settings: it is able to converge to the optimal settings quickly by pruning more than 90% of the parameter search space. |
Tasks | Image Classification, Speech Recognition |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01665v1 |
http://arxiv.org/pdf/1812.01665v1.pdf | |
PWC | https://paperswithcode.com/paper/auto-tuning-tensorflow-threading-model-for |
Repo | |
Framework | |
L1-(2D)2PCANet: A Deep Learning Network for Face Recognition
Title | L1-(2D)2PCANet: A Deep Learning Network for Face Recognition |
Authors | YunKun Li, XiaoJun Wu, Josef Kittler |
Abstract | In this paper, we propose a novel deep learning network L1-(2D)2PCANet for face recognition, which is based on L1-norm-based two-directional two-dimensional principal component analysis (L1-(2D)2PCA). In our network, the role of L1-(2D)2PCA is to learn the filters of multiple convolution layers. After the convolution layers, we deploy binary hashing and block-wise histograms for pooling. We test our network on several benchmark facial datasets (YALE, AR, Extended Yale B, LFW-a and FERET), with CNN, PCANet, 2DPCANet and L1-PCANet as comparisons. The results show that the recognition performance of L1-(2D)2PCANet in all tests is better than that of the baseline networks, especially when there are outliers in the test data. Owing to the L1-norm, L1-(2D)2PCANet is robust to outliers and changes of the training images. |
Tasks | Face Recognition |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.10476v1 |
http://arxiv.org/pdf/1805.10476v1.pdf | |
PWC | https://paperswithcode.com/paper/l1-2d2pcanet-a-deep-learning-network-for-face |
Repo | |
Framework | |
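The pooling stage mentioned in the abstract above (binary hashing followed by block-wise histograms) can be sketched roughly as follows. This is a minimal illustration in the style of PCANet-type output stages, not the authors' implementation: the function names `binary_hash` and `blockwise_histogram`, the zero threshold, and the use of non-overlapping blocks are all assumptions made for the example.

```python
import numpy as np

def binary_hash(responses):
    """Turn L filter response maps of shape (L, H, W) into one integer code map.

    Each response is binarized (positive -> 1, otherwise 0) and the L bits at
    every pixel are combined into an integer in [0, 2**L). Assumed convention:
    filter l contributes the bit with weight 2**l.
    """
    n_maps = responses.shape[0]
    bits = (responses > 0).astype(np.int64)
    weights = (2 ** np.arange(n_maps)).reshape(n_maps, 1, 1)
    return (bits * weights).sum(axis=0)

def blockwise_histogram(codes, block, n_bins):
    """Concatenate histograms of integer codes over non-overlapping blocks."""
    height, width = codes.shape
    features = []
    for r in range(0, height - block + 1, block):
        for c in range(0, width - block + 1, block):
            patch = codes[r:r + block, c:c + block]
            features.append(np.bincount(patch.ravel(), minlength=n_bins))
    return np.concatenate(features)
```

With L response maps, `n_bins` would be `2**L`, and the resulting vector has one histogram per block, so its length grows with image size and shrinks with block size.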
Automated Game Design via Conceptual Expansion
Title | Automated Game Design via Conceptual Expansion |
Authors | Matthew Guzdial, Mark Riedl |
Abstract | Automated game design has remained a key challenge within the field of Game AI. In this paper, we introduce a method for recombining existing games to create new games through a process called conceptual expansion. Prior automated game design approaches have relied on hand-authored or crowd-sourced knowledge, which limits the scope and applications of such systems. Our approach instead relies on machine learning to learn approximate representations of games. Our approach recombines knowledge from these learned representations to create new games via conceptual expansion. We evaluate this approach by demonstrating the ability for the system to recreate existing games. To the best of our knowledge, this represents the first machine learning-based automated game design system. |
Tasks | |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02232v1 |
http://arxiv.org/pdf/1809.02232v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-game-design-via-conceptual |
Repo | |
Framework | |
Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images
Title | Multi-Resolution Networks for Semantic Segmentation in Whole Slide Images |
Authors | Feng Gu, Nikolay Burlutskiy, Mats Andersson, Lena Kajland Wilen |
Abstract | Digital pathology provides an excellent opportunity for applying fully convolutional networks (FCNs) to tasks such as semantic segmentation of whole slide images (WSIs). However, standard FCNs face challenges with respect to multi-resolution, inherited from the pyramid arrangement of WSIs. As a result, networks specifically designed to learn and aggregate information at different levels are desired. In this paper, we propose two novel multi-resolution networks based on the popular ‘U-Net’ architecture, which are evaluated on a benchmark dataset for binary semantic segmentation in WSIs. The proposed methods outperform the U-Net, demonstrating superior learning and generalization capabilities. |
Tasks | Semantic Segmentation |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09607v1 |
http://arxiv.org/pdf/1807.09607v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-resolution-networks-for-semantic |
Repo | |
Framework | |
AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction
Title | AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction |
Authors | Francisco J. Pulgar, Francisco Charte, Antonio J. Rivera, María J. del Jesus |
Abstract | High dimensionality, i.e. data having a large number of variables, tends to be a challenge for most machine learning tasks, including classification. A classifier usually builds a model representing how a set of inputs explains the outputs. The larger the set of inputs and/or outputs, the more complex that model will be. There is a family of classification algorithms, known as lazy learning methods, which does not build a model. One of the best known members of this family is the kNN algorithm. Its strategy relies on searching a set of nearest neighbors, using the input variables as position vectors and computing distances among them. These distances lose significance in high-dimensional spaces. Therefore the performance of kNN, like that of many other classifiers, tends to worsen as the number of input variables grows. In this work AEkNN, a new kNN-based algorithm with built-in dimensionality reduction, is presented. Aiming to obtain a new representation of the data, having a lower dimensionality but with more informational features, AEkNN internally uses autoencoders. From these new feature vectors the computed distances should be more significant, thus providing a way to choose better neighbors. An experimental evaluation of the new proposal is conducted, analyzing several configurations and comparing them against the classical kNN algorithm. The obtained conclusions demonstrate that AEkNN offers better results in predictive and runtime performance. |
Tasks | Dimensionality Reduction |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08465v2 |
http://arxiv.org/pdf/1802.08465v2.pdf | |
PWC | https://paperswithcode.com/paper/aeknn-an-autoencoder-knn-based-classifier |
Repo | |
Framework | |
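The general "reduce dimensionality, then search neighbors" pipeline described in the AEkNN abstract above can be sketched as follows. Note the substitution: AEkNN uses autoencoders, whereas this sketch swaps in a linear PCA projection (computed via SVD) as a stand-in encoder to keep the example short and dependency-free. The names `fit_linear_encoder` and `knn_predict` are hypothetical, and class labels are assumed to be non-negative integers.

```python
import numpy as np

def fit_linear_encoder(X, n_components):
    """Fit a PCA projection on X and return a function mapping data to codes.

    Stand-in for a trained autoencoder's encoder half (an assumption of this
    sketch, not the method used in the paper).
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    return lambda Z: (Z - mean) @ components.T

def knn_predict(train_X, train_y, query_X, k):
    """Classify each query by majority vote among its k nearest training points."""
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def encode_then_knn(train_X, train_y, query_X, n_components, k):
    """Compute neighbor distances in the reduced space instead of the raw one."""
    encode = fit_linear_encoder(train_X, n_components)
    return knn_predict(encode(train_X), train_y, encode(query_X), k)
```

Because distances are computed on `n_components` features rather than on every input variable, the neighbor search is also cheaper per query, which is consistent with the runtime benefit the abstract reports for the autoencoder-based version.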
Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
Title | Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks |
Authors | Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran |
Abstract | Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, $l_1$-norm, average percentage of zeros, etc.) and retain only the top ranked filters. Once the low scoring filters are pruned away the remainder of the network is fine tuned and is shown to give performance comparable to the original unpruned network. In this work, we report experiments which suggest that the comparable performance of the pruned network is not due to the specific criterion chosen but due to the inherent plasticity of deep neural networks which allows them to recover from the loss of pruned filters once the rest of the filters are fine-tuned. Specifically, we show counter-intuitive results wherein by randomly pruning 25-50% of the filters from deep CNNs we are able to obtain the same performance as obtained by using state of the art pruning methods. We empirically validate our claims by doing an exhaustive evaluation with VGG-16 and ResNet-50. Further, we also evaluate a real world scenario where a CNN trained on all 1000 ImageNet classes needs to be tested on only a small set of classes at test time (say, only animals). We create a new benchmark dataset from ImageNet to evaluate such class specific pruning and show that even here a random pruning strategy gives close to state of the art performance. Lastly, unlike existing approaches which mainly focus on the task of image classification, in this work we also report results on object detection. We show that using a simple random pruning strategy we can achieve significant speed up in object detection (74% improvement in fps) while retaining the same accuracy as that of the original Faster RCNN model. |
Tasks | Image Classification, Object Detection |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10447v1 |
http://arxiv.org/pdf/1801.10447v1.pdf | |
PWC | https://paperswithcode.com/paper/recovering-from-random-pruning-on-the |
Repo | |
Framework | |
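The two filter-selection strategies contrasted in the pruning abstract above (ranking by a criterion such as the $l_1$-norm versus choosing filters at random) can be sketched as follows. This is a minimal illustration under assumed conventions, not the authors' code: weights are taken to have shape `(out_channels, in_channels, kh, kw)`, the names `select_filters` and `prune_layer` are hypothetical, and the fine-tuning step that the abstract identifies as essential for recovery is not shown.

```python
import numpy as np

def select_filters(weights, prune_fraction, criterion="random", seed=0):
    """Return the sorted indices of the filters to keep in one conv layer."""
    n_filters = weights.shape[0]
    n_keep = n_filters - int(round(prune_fraction * n_filters))
    if criterion == "l1":
        # Rank filters by the sum of absolute weights and keep the largest.
        scores = np.abs(weights).reshape(n_filters, -1).sum(axis=1)
        keep = np.argsort(scores)[::-1][:n_keep]
    elif criterion == "random":
        # Ignore the weights entirely and keep a uniformly random subset.
        rng = np.random.default_rng(seed)
        keep = rng.choice(n_filters, size=n_keep, replace=False)
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return np.sort(keep)

def prune_layer(weights, next_weights, keep):
    """Drop pruned filters and the matching input channels of the next layer."""
    return weights[keep], next_weights[:, keep]
```

Both strategies return a layer of the same size, so any speed-up is identical; only the choice of surviving filters differs, which is the variable the paper's experiments isolate.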
Evaluation of Machine Learning Fameworks on Finis Terrae II
Title | Evaluation of Machine Learning Fameworks on Finis Terrae II |
Authors | Andres Gomez Tato |
Abstract | Machine Learning (ML) and Deep Learning (DL) are two technologies used to extract representations of the data for a specific purpose. ML algorithms take a set of data as input to generate one or several predictions. To define the final version of a model, there is usually an initial step devoted to training the algorithm (getting the right final values of the parameters of the model). There are several techniques, from supervised learning to reinforcement learning, which have different requirements. On the market, there are some frameworks or APIs that reduce the effort of designing a new ML model. In this report, using the benchmark DLBENCH, we will analyse the performance and the execution modes of some well-known ML frameworks on the Finis Terrae II supercomputer when supervised learning is used. The report will show that placement of data and allocated hardware can have a large influence on the final time-to-solution. |
Tasks | |
Published | 2018-01-14 |
URL | http://arxiv.org/abs/1801.04546v1 |
http://arxiv.org/pdf/1801.04546v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-machine-learning-fameworks-on |
Repo | |
Framework | |
Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction
Title | Bio-LSTM: A Biomechanically Inspired Recurrent Neural Network for 3D Pedestrian Pose and Gait Prediction |
Authors | Xiaoxiao Du, Ram Vasudevan, Matthew Johnson-Roberson |
Abstract | In applications such as autonomous driving, it is important to understand, infer, and anticipate the intention and future behavior of pedestrians. This ability allows vehicles to avoid collisions and improve ride safety and quality. This paper proposes a biomechanically inspired recurrent neural network (Bio-LSTM) that can predict the location and 3D articulated body pose of pedestrians in a global coordinate frame, given 3D poses and locations estimated in prior frames with inaccuracy. The proposed network is able to predict poses and global locations for multiple pedestrians simultaneously, for pedestrians up to 45 meters from the cameras (urban intersection scale). The outputs of the proposed network are full-body 3D meshes represented in Skinned Multi-Person Linear (SMPL) model parameters. The proposed approach relies on a novel objective function that incorporates the periodicity of human walking (gait), the mirror symmetry of the human body, and the change of ground reaction forces in a human gait cycle. This paper presents prediction results on the PedX dataset, a large-scale, in-the-wild data set collected at real urban intersections with heavy pedestrian traffic. Results show that the proposed network can successfully learn the characteristics of pedestrian gait and produce accurate and consistent 3D pose predictions. |
Tasks | Autonomous Driving |
Published | 2018-09-11 |
URL | https://arxiv.org/abs/1809.03705v3 |
https://arxiv.org/pdf/1809.03705v3.pdf | |
PWC | https://paperswithcode.com/paper/bio-lstm-a-biomechanically-inspired-recurrent |
Repo | |
Framework | |
CaricatureShop: Personalized and Photorealistic Caricature Sketching
Title | CaricatureShop: Personalized and Photorealistic Caricature Sketching |
Authors | Xiaoguang Han, Kangcheng Hou, Dong Du, Yuda Qiu, Yizhou Yu, Kun Zhou, Shuguang Cui |
Abstract | In this paper, we propose the first sketching system for interactively personalized and photorealistic face caricaturing. Given an image of a human face, users can create caricature photos by manipulating its facial feature curves. Our system first performs exaggeration on the recovered 3D face model according to the edited sketches, which is conducted by assigning the Laplacian of each vertex a scaling factor. To construct the mapping between 2D sketches and a vertex-wise scaling field, a novel deep learning architecture is developed. With the obtained 3D caricature model, two images are generated, one obtained by applying 2D warping guided by the underlying 3D mesh deformation and the other obtained by re-rendering the deformed 3D textured model. These two images are then seamlessly integrated to produce our final output. Due to the severe stretching of meshes, the rendered texture appears blurry. A deep learning approach is exploited to infer the missing details for enhancing these blurry regions. Moreover, a relighting operation is introduced to further improve the photorealism of the result. Both quantitative and qualitative experimental results validate the efficiency of our sketching system and the superiority of our proposed techniques over existing methods. |
Tasks | Caricature |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.09064v1 |
http://arxiv.org/pdf/1807.09064v1.pdf | |
PWC | https://paperswithcode.com/paper/caricatureshop-personalized-and |
Repo | |
Framework | |
Randomized Iterative Algorithms for Fisher Discriminant Analysis
Title | Randomized Iterative Algorithms for Fisher Discriminant Analysis |
Authors | Agniva Chowdhury, Jiasen Yang, Petros Drineas |
Abstract | Fisher discriminant analysis (FDA) is a widely used method for classification and dimensionality reduction. When the number of predictor variables greatly exceeds the number of observations, one of the alternatives for conventional FDA is regularized Fisher discriminant analysis (RFDA). In this paper, we present a simple, iterative, sketching-based algorithm for RFDA that comes with provable accuracy guarantees when compared to the conventional approach. Our analysis builds upon two simple structural results that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized linear algebra. We analyze the behavior of RFDA when the ridge leverage and the standard leverage scores are used to select predictor variables and we prove that accurate approximations can be achieved by a sample whose size depends on the effective degrees of freedom of the RFDA problem. Our results yield significant improvements over existing approaches and our empirical evaluations support our theoretical analyses. |
Tasks | Dimensionality Reduction |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.03045v2 |
http://arxiv.org/pdf/1809.03045v2.pdf | |
PWC | https://paperswithcode.com/paper/randomized-iterative-algorithms-for-fisher |
Repo | |
Framework | |
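The abstract above names randomized matrix multiplication as the primitive its analysis reduces to. A minimal sketch of that primitive (not of the paper's RFDA algorithm) is given below: the product `A @ B` is approximated by sampling column/row pairs with probability proportional to the product of their norms and rescaling so that the estimate is unbiased. The function name `approx_matmul` and the specific sampling distribution are assumptions of this example; the paper itself analyzes leverage-score-based sampling.

```python
import numpy as np

def approx_matmul(A, B, n_samples, seed=0):
    """Approximate A @ B from n_samples sampled column/row outer products.

    Index k is drawn with probability proportional to
    ||A[:, k]|| * ||B[k, :]||, and each sampled outer product is scaled by
    1 / (n_samples * p_k) so that the expected value equals A @ B.
    """
    rng = np.random.default_rng(seed)
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    total = weights.sum()
    if total == 0:
        # Every outer product is zero, so the exact product is zero too.
        return np.zeros((A.shape[0], B.shape[1]))
    probs = weights / total
    idx = rng.choice(A.shape[1], size=n_samples, p=probs)
    scale = 1.0 / (n_samples * probs[idx])
    return (A[:, idx] * scale) @ B[idx, :]
```

The cost is governed by `n_samples` rather than by the shared inner dimension, which is why this kind of sketching pays off when the number of predictor variables is very large.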
Wearable Affective Robot
Title | Wearable Affective Robot |
Authors | Min Chen, Jun Zhou, Guangming Tao, Jun Yang, Long Hu |
Abstract | With the development of artificial intelligence (AI), AI applications have greatly influenced and changed people’s daily lives. Here, a wearable affective robot that integrates the affective robot, social robot, brain wearable, and wearable 2.0 is proposed for the first time. The proposed wearable affective robot is intended for a wide population, and we believe that it can improve human health at the mental level while meeting fashion requirements at the same time. In this paper, the architecture and design of an innovative wearable affective robot, dubbed Fitbot, are introduced from the perspectives of hardware and algorithms. In addition, the important functional component of the robot-brain wearable device is introduced in terms of hardware design, EEG data acquisition and analysis, user behavior perception, and algorithm deployment. Then, EEG-based cognition of the user’s behavior is realized. Through the continuous acquisition of in-depth, in-breadth data, the Fitbot we present can gradually enrich the user’s life modeling and enable the wearable robot to recognize the user’s intention and further understand the behavioral motivation behind the user’s emotion. The learning algorithm for life modeling embedded in Fitbot can achieve a better user experience of affective social interaction. Finally, the application service scenarios and some challenging issues of a wearable affective robot are discussed. |
Tasks | EEG |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10743v1 |
http://arxiv.org/pdf/1810.10743v1.pdf | |
PWC | https://paperswithcode.com/paper/wearable-affective-robot |
Repo | |
Framework | |
Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors
Title | Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors |
Authors | Kévin Degraux, Valerio Cambareri, Bert Geelen, Laurent Jacques, Gauthier Lafruit |
Abstract | This paper introduces two acquisition device architectures for multispectral compressive imaging. Unlike most existing methods, the proposed computational imaging techniques do not include any dispersive element, as they use a dedicated sensor which integrates narrowband Fabry-Pérot spectral filters at the pixel level. The first scheme leverages joint inpainting and super-resolution to fill in those voxels that are missing due to the device’s limited pixel count. The second scheme, linked to compressed sensing, introduces spatial random convolutions, but is more complex and may be affected by diffraction. In both cases we solve the associated inverse problems by using the same signal prior. Specifically, we propose a redundant analysis signal prior in a convex formulation. Through numerical simulations, we explore different realistic setups. Our objective is also to highlight some practical guidelines and discuss their complexity trade-offs to integrate these schemes into actual computational imaging systems. Our conclusion is that the second technique performs best at high compression levels, in a properly sized and calibrated setup. Otherwise, the first, simpler technique should be favored. |
Tasks | Super-Resolution |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.02040v1 |
http://arxiv.org/pdf/1802.02040v1.pdf | |
PWC | https://paperswithcode.com/paper/multispectral-compressive-imaging-strategies |
Repo | |
Framework | |