October 19, 2019

Paper Group ANR 357

TensOrMachine: Probabilistic Boolean Tensor Decomposition


Title	TensOrMachine: Probabilistic Boolean Tensor Decomposition
Authors	Tammo Rukat, Chris C. Holmes, Christopher Yau
Abstract	Boolean tensor decomposition approximates data of multi-way binary relationships as a product of interpretable low-rank binary factors, following the rules of Boolean algebra. Here, we present its first probabilistic treatment. We facilitate scalable sampling-based posterior inference by exploiting the combinatorial structure of the factor conditionals. Maximum a posteriori decompositions feature higher accuracies than existing techniques throughout a wide range of simulated conditions. Moreover, the probabilistic approach facilitates the treatment of missing data and enables model selection with much greater accuracy. We investigate three real-world datasets. First, temporal interaction networks in a hospital ward and behavioural data of university students demonstrate the inference of instructive latent patterns. Next, we decompose a tensor with more than 10 billion data points, indicating relations of gene expression in cancer patients. Not only does this demonstrate scalability, it also provides an entirely novel perspective on relational properties of continuous data and, in the present example, on the molecular heterogeneity of cancer. Our implementation is available on GitHub: https://github.com/TammoR/LogicalFactorisationMachines.
Tasks	Model Selection
Published	2018-05-11
URL	http://arxiv.org/abs/1805.04582v1
PDF	http://arxiv.org/pdf/1805.04582v1.pdf
PWC	https://paperswithcode.com/paper/tensormachine-probabilistic-boolean-tensor
Repo
Framework
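
To make the Boolean product in the abstract concrete, here is a minimal sketch of how a binary three-way tensor is rebuilt from binary factor matrices: an entry is 1 if at least one latent component is active in all three modes (an OR over components of a three-way AND). The function name and the toy factors are assumptions made for illustration; this is not the authors' sampler and not the API of the linked repository.

```python
from itertools import product


def boolean_tensor_product(A, B, C):
    """Rebuild a binary 3-way tensor from binary factor matrices.

    X[i][j][k] = OR over l of (A[i][l] AND B[j][l] AND C[k][l]).
    A, B and C are lists of rows and share the same number of columns (the rank).
    """
    rank = len(A[0])
    X = [[[0] * len(C) for _ in B] for _ in A]
    for i, j, k in product(range(len(A)), range(len(B)), range(len(C))):
        X[i][j][k] = int(any(A[i][l] and B[j][l] and C[k][l] for l in range(rank)))
    return X


# Hypothetical rank-2 factors, chosen only for illustration.
A = [[1, 0], [0, 1], [1, 1]]
B = [[1, 0], [0, 1]]
C = [[1, 1], [0, 1]]
X = boolean_tensor_product(A, B, C)
print(X)
```

The decomposition problem runs in the opposite direction: given an observed X, find low-rank binary factors whose Boolean product approximates it.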

Efficient algorithms for robust submodular maximization under matroid constraints


Title	Efficient algorithms for robust submodular maximization under matroid constraints
Authors	Sebastian Pokutta, Mohit Singh, Alfredo Torrico
Abstract	In this work, we consider robust submodular maximization with matroid constraints. We give an efficient bi-criteria approximation algorithm that outputs a small family of feasible sets whose union has (nearly) optimal objective value. This algorithm theoretically performs fewer function calls than previous works, at the cost of adding more elements to the final solution. We also provide significant implementation improvements showing that our algorithm outperforms the algorithms in the existing literature. We finally assess the performance of our contributions in three real-world applications.
Tasks
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09405v1
PDF	http://arxiv.org/pdf/1807.09405v1.pdf
PWC	https://paperswithcode.com/paper/efficient-algorithms-for-robust-submodular
Repo
Framework
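
The robust objective can be illustrated with a small sketch: several monotone submodular functions (here, set-coverage counts, one per scenario) are evaluated on the same selection, and the selection is scored by its worst scenario. The code below pairs that objective with a naive greedy heuristic under a cardinality constraint, which is the simplest matroid (a uniform matroid). It is not the bi-criteria algorithm from the paper; the minimum of submodular functions is in general not submodular, so this greedy carries no approximation guarantee. All names and data are hypothetical.

```python
def coverage(sets, chosen):
    """Monotone submodular coverage value: number of items covered by the chosen sets."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def robust_value(scenarios, chosen):
    """Robust objective: the worst coverage value across all scenarios."""
    return min(coverage(sets, chosen) for sets in scenarios)


def greedy_robust(scenarios, candidates, k):
    """Naive greedy heuristic under the constraint |S| <= k (a uniform matroid)."""
    chosen = []
    for _ in range(k):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        # Add the candidate that yields the best worst-case value; ties go to the first.
        best = max(remaining, key=lambda c: robust_value(scenarios, chosen + [c]))
        chosen.append(best)
    return chosen


# Two hypothetical scenarios, each mapping a candidate to the items it covers.
scenarios = [
    {"a": {1, 2}, "b": {3}, "c": {1, 3}},
    {"a": {1}, "b": {2, 3}, "c": {2}},
]
picked = greedy_robust(scenarios, ["a", "b", "c"], k=2)
value = robust_value(scenarios, picked)
print(picked, value)
```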

Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control


Title	Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
Authors	Kendall Lowrey, Aravind Rajeswaran, Sham Kakade, Emanuel Todorov, Igor Mordatch
Abstract	We propose a plan online and learn offline (POLO) framework for the setting where an agent, with an internal model, needs to continually act and learn in the world. Our work builds on the synergistic relationship between local model-based control, global value function learning, and exploration. We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning. Conversely, we also study how approximate value functions can help reduce the planning horizon and allow for better policies beyond local solutions. Finally, we also demonstrate how trajectory optimization can be used to perform temporally coordinated exploration in conjunction with estimating uncertainty in value function approximation. This exploration is critical for fast and stable learning of the value function. Combining these components enables solutions to complex simulated control tasks, like humanoid locomotion and dexterous in-hand manipulation, in the equivalent of a few minutes of experience in the real world.
Tasks
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01848v3
PDF	http://arxiv.org/pdf/1811.01848v3.pdf
PWC	https://paperswithcode.com/paper/plan-online-learn-offline-efficient-learning
Repo
Framework

Deep Video Portraits


Title	Deep Video Portraits
Authors	Hyeongwoo Kim, Pablo Garrido, Ayush Tewari, Weipeng Xu, Justus Thies, Matthias Nießner, Patrick Pérez, Christian Richardt, Michael Zollhöfer, Christian Theobalt
Abstract	We present a novel approach that enables photo-realistic re-animation of portrait videos using only an input video. In contrast to existing approaches that are restricted to manipulations of facial expressions only, we are the first to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor. The core of our approach is a generative neural network with a novel space-time architecture. The network takes as input synthetic renderings of a parametric face model, based on which it predicts photo-realistic video frames for a given target actor. The realism in this rendering-to-video transfer is achieved by careful adversarial training, and as a result, we can create modified target videos that mimic the behavior of the synthetically-created input. In order to enable source-to-target video re-animation, we render a synthetic target video with the reconstructed head animation parameters from a source video, and feed it into the trained network – thus taking full control of the target. With the ability to freely recombine source and target parameters, we are able to demonstrate a large variety of video rewrite applications without explicitly modeling hair, body or background. For instance, we can reenact the full head using interactive user-controlled editing, and realize high-fidelity visual dubbing. To demonstrate the high quality of our output, we conduct an extensive series of experiments and evaluations, where for instance a user study shows that our video edits are hard to detect.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11714v1
PDF	http://arxiv.org/pdf/1805.11714v1.pdf
PWC	https://paperswithcode.com/paper/deep-video-portraits
Repo
Framework

Near-Lossless Deep Feature Compression for Collaborative Intelligence


Title	Near-Lossless Deep Feature Compression for Collaborative Intelligence
Authors	Hyomin Choi, Ivan V. Bajic
Abstract	Collaborative intelligence is a new paradigm for efficient deployment of deep neural networks across the mobile-cloud infrastructure. By dividing the network between the mobile and the cloud, it is possible to distribute the computational workload such that the overall energy and/or latency of the system is minimized. However, this necessitates sending deep feature data from the mobile to the cloud in order to perform inference. In this work, we examine the differences between the deep feature data and natural image data, and propose a simple and effective near-lossless deep feature compressor. The proposed method achieves up to 5% bit rate reduction compared to HEVC-Intra and even more against other popular image codecs. Finally, we suggest an approach for reconstructing the input image from compressed deep features in the cloud, that could serve to supplement the inference performed by the deep model.
Tasks
Published	2018-04-26
URL	http://arxiv.org/abs/1804.09963v2
PDF	http://arxiv.org/pdf/1804.09963v2.pdf
PWC	https://paperswithcode.com/paper/near-lossless-deep-feature-compression-for
Repo
Framework

A Study of Car-to-Train Assignment Problem for Rail Express Cargos on Scheduled and Unscheduled Train Service Network


Title	A Study of Car-to-Train Assignment Problem for Rail Express Cargos on Scheduled and Unscheduled Train Service Network
Authors	Boliang Lin
Abstract	Freight train services in a railway network are generally divided into two categories: unscheduled trains, whose operating frequency fluctuates with origin-destination (OD) demands, and scheduled trains, which run on a regular timetable just like passenger trains. Once determined, the timetable is released to the public and is not influenced by OD demands. The total capacity of scheduled trains can usually satisfy the predicted demands of express cargos on average. However, the demands change in practice. Therefore, how to distribute the shipments between different stations to unscheduled and scheduled train services has become an important research topic in railway transportation. This paper focuses on the coordinated optimization of the rail express cargo distribution in the two service networks. On the premise of fully utilizing the capacity of the scheduled service network first, we establish a Car-to-Train (CTT) assignment model to assign rail express cargos to scheduled and unscheduled trains. The objective function is to maximize the net income of transporting the rail express cargos. The constraints include the capacity restriction on the service arcs, flow balance constraints, a logical relationship constraint between two groups of decision variables, and the due date constraint. The last constraint ensures that the total transportation time of a shipment is no longer than its predefined due date. Finally, we discuss linearization techniques that simplify the proposed model and make it possible to obtain a globally optimal solution using commercial software.
Tasks
Published	2018-03-14
URL	http://arxiv.org/abs/1803.05760v1
PDF	http://arxiv.org/pdf/1803.05760v1.pdf
PWC	https://paperswithcode.com/paper/a-study-of-car-to-train-assignment-problem
Repo
Framework

Seed-Point Detection of Clumped Convex Objects by Short-Range Attractive Long-Range Repulsive Particle Clustering


Title	Seed-Point Detection of Clumped Convex Objects by Short-Range Attractive Long-Range Repulsive Particle Clustering
Authors	James Kapaldo, Xu Han, Domingo Mery
Abstract	Locating the center of convex objects is important in both image processing and unsupervised machine learning/data clustering fields. The automated analysis of biological images uses both of these fields for locating cell nuclei and for discovering new biological effects or cell phenotypes. In this work, we develop a novel clustering method for locating the centers of overlapping convex objects by modeling particles that interact by a short-range attractive and long-range repulsive potential and are confined to a potential well created from the data. We apply this method to locating the centers of clumped nuclei in cultured cells, where we show that it results in a significant improvement over existing methods (8.2% in F$_1$ score); and we apply it to unsupervised learning on a difficult data set that has rare classes without local density maxima, and show that it is able to locate cluster centers well when other clustering techniques fail.
Tasks
Published	2018-04-11
URL	http://arxiv.org/abs/1804.04071v1
PDF	http://arxiv.org/pdf/1804.04071v1.pdf
PWC	https://paperswithcode.com/paper/seed-point-detection-of-clumped-convex
Repo
Framework

PVSNet: Palm Vein Authentication Siamese Network Trained using Triplet Loss and Adaptive Hard Mining by Learning Enforced Domain Specific Features


Title	PVSNet: Palm Vein Authentication Siamese Network Trained using Triplet Loss and Adaptive Hard Mining by Learning Enforced Domain Specific Features
Authors	Daksh Thapar, Gaurav Jaswal, Aditya Nigam, Vivek Kanhangad
Abstract	Designing an end-to-end deep learning network to match the biometric features with limited training samples is an extremely challenging task. To address this problem, we propose a new way to design an end-to-end deep CNN framework, i.e., PVSNet, that works in two major steps: first, an encoder-decoder network is used to learn generative domain-specific features, followed by a Siamese network in which convolutional layers are pre-trained in an unsupervised fashion as an autoencoder. The proposed model is trained via a triplet loss function that is adjusted for learning feature embeddings in a way that minimizes the distance between embedding-pairs from the same subject and maximizes the distance with those from different subjects, with a margin. In particular, a triplet Siamese matching network using adaptive-margin-based hard negative mining has been suggested. The hyper-parameters associated with the training strategy, like the adaptive margin, have been tuned to make the learning more effective on biometric datasets. In extensive experimentation, the proposed network outperforms most of the existing deep learning solutions on three types of typical vein datasets, which clearly demonstrates the effectiveness of our proposed method.
Tasks
Published	2018-12-15
URL	http://arxiv.org/abs/1812.06271v1
PDF	http://arxiv.org/pdf/1812.06271v1.pdf
PWC	https://paperswithcode.com/paper/pvsnet-palm-vein-authentication-siamese
Repo
Framework
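
As a rough illustration of the training objective described in the abstract, the sketch below computes a margin-based triplet loss on plain Python vectors and selects the hardest negative, taken here to be the negative embedding closest to the anchor. The abstract does not say how the adaptive margin is computed, so a fixed margin is assumed; Euclidean distance is also an assumption, and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """Hinge-style triplet loss: zero once the negative is farther than the positive by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def hardest_negative(anchor, negatives):
    """Hard negative mining: pick the negative embedding closest to the anchor."""
    return min(negatives, key=lambda n: euclidean(anchor, n))


# Toy 2-D embeddings (hypothetical values chosen for illustration).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]                  # distance 5 from the anchor
negatives = [[6.0, 8.0], [0.0, 1.0]]   # distances 10 and 1 from the anchor

hard = hardest_negative(anchor, negatives)
loss_hard = triplet_loss(anchor, positive, hard, margin=1.0)
loss_easy = triplet_loss(anchor, positive, negatives[0], margin=1.0)
print(hard, loss_hard, loss_easy)
```

The easy negative already satisfies the margin and contributes zero loss, which is why mining hard negatives tends to give a more informative training signal.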

Self-supervised CNN for Unconstrained 3D Facial Performance Capture from an RGB-D Camera


Title	Self-supervised CNN for Unconstrained 3D Facial Performance Capture from an RGB-D Camera
Authors	Yudong Guo, Juyong Zhang, Lin Cai, Jianfei Cai, Jianmin Zheng
Abstract	We present a novel method for real-time 3D facial performance capture with consumer-level RGB-D sensors. Our capturing system is targeted at robust and stable 3D face capturing in the wild, in which the RGB-D facial data contain noise, imperfection and occlusion, and often exhibit high variability in motion, pose, expression and lighting conditions, thus posing great challenges. The technical contribution is a self-supervised deep learning framework, which is trained directly from raw RGB-D data. The key novelties include: (1) learning both the core tensor and the parameters for refining our parametric face model; (2) using vertex displacement and UV map for learning surface detail; (3) designing the loss function by incorporating temporal coherence and same identity constraints based on pairs of RGB-D images and utilizing sparse norms, in addition to the conventional terms for photo-consistency, feature similarity, regularization as well as geometry consistency; and (4) augmenting the training data set in new ways. The method is demonstrated in a live setup that runs in real-time on a smartphone and an RGB-D sensor. Extensive experiments show that our method is robust to severe occlusion, fast motion, large rotation, exaggerated facial expressions and diverse lighting.
Tasks
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05323v2
PDF	http://arxiv.org/pdf/1808.05323v2.pdf
PWC	https://paperswithcode.com/paper/self-supervised-cnn-for-unconstrained-3d
Repo
Framework

Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections


Title	Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections
Authors	Xin Zhang, Armando Solar-Lezama, Rishabh Singh
Abstract	We present a new algorithm to generate minimal, stable, and symbolic corrections to an input that will cause a neural network with ReLU activations to change its output. We argue that such a correction is a useful way to provide feedback to a user when the network’s output is different from a desired output. Our algorithm generates such a correction by solving a series of linear constraint satisfaction problems. The technique is evaluated on three neural network models: one predicting whether an applicant will pay a mortgage, one predicting whether a first-order theorem can be proved efficiently by a solver using certain heuristics, and the final one judging whether a drawing is an accurate rendition of a canonical drawing of a cat.
Tasks
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07384v2
PDF	http://arxiv.org/pdf/1802.07384v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-neural-network-judgments-via
Repo
Framework

Integrating Human-Provided Information Into Belief State Representation Using Dynamic Factorization


Title	Integrating Human-Provided Information Into Belief State Representation Using Dynamic Factorization
Authors	Rohan Chitnis, Leslie Pack Kaelbling, Tomás Lozano-Pérez
Abstract	In partially observed environments, it can be useful for a human to provide the robot with declarative information that represents probabilistic relational constraints on properties of objects in the world, augmenting the robot’s sensory observations. For instance, a robot tasked with a search-and-rescue mission may be informed by the human that two victims are probably in the same room. An important question arises: how should we represent the robot’s internal knowledge so that this information is correctly processed and combined with raw sensory information? In this paper, we provide an efficient belief state representation that dynamically selects an appropriate factoring, combining aspects of the belief when they are correlated through information and separating them when they are not. This strategy works in open domains, in which the set of possible objects is not known in advance, and provides significant improvements in inference time over a static factoring, leading to more efficient planning for complex partially observed tasks. We validate our approach experimentally in two open-domain planning problems: a 2D discrete gridworld task and a 3D continuous cooking task. A supplementary video can be found at http://tinyurl.com/chitnis-iros-18.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00119v4
PDF	http://arxiv.org/pdf/1803.00119v4.pdf
PWC	https://paperswithcode.com/paper/integrating-human-provided-information-into
Repo
Framework

Deep Residual Networks with a Fully Connected Reconstruction Layer for Single Image Super-Resolution


Title	Deep Residual Networks with a Fully Connected Reconstruction Layer for Single Image Super-Resolution
Authors	Yongliang Tang, Jiashui Huang, Faen Zhang, Weiguo Gong
Abstract	Recently, deep neural networks have achieved impressive performance in terms of both reconstruction accuracy and efficiency for single image super-resolution (SISR). However, the network model of these methods is a fully convolutional neural network, which is limited in its ability to exploit the differentiated contextual information over the global region of the input image because of weight sharing across the height and width extent of the convolution. In this paper, we discuss a new SISR architecture where features are extracted in the low-resolution (LR) space, and then we use a fully connected layer which learns an array of differentiated upsampling weights to reconstruct the desired high-resolution (HR) image from the final LR features. By doing so, we effectively exploit the differentiated contextual information over the whole input image region, whilst maintaining the low computational complexity for the overall SR operations. In addition, we introduce an edge difference constraint into our loss function to preserve edges and texture structures. Extensive experiments validate that our SISR method outperforms the existing state-of-the-art methods.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-05-24
URL	https://arxiv.org/abs/1805.10143v2
PDF	https://arxiv.org/pdf/1805.10143v2.pdf
PWC	https://paperswithcode.com/paper/deep-residual-networks-with-a-fully-connected
Repo
Framework

3D Geometry-Aware Semantic Labeling of Outdoor Street Scenes


Title	3D Geometry-Aware Semantic Labeling of Outdoor Street Scenes
Authors	Yiran Zhong, Yuchao Dai, Hongdong Li
Abstract	This paper is concerned with the problem of how to better exploit 3D geometric information for dense semantic image labeling. Existing methods often treat the available 3D geometry information (e.g., 3D depth-map) simply as an additional image channel besides the R-G-B color channels, and apply the same technique for RGB image labeling. In this paper, we demonstrate that directly performing 3D convolution in the framework of a residual connected 3D voxel top-down modulation network can lead to superior results. Specifically, we propose a 3D semantic labeling method to label outdoor street scenes whenever a dense depth map is available. Experiments on the “Synthia” and “Cityscape” datasets show our method outperforms the state-of-the-art methods, suggesting such a simple 3D representation is effective in incorporating 3D geometric information.
Tasks
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04028v1
PDF	http://arxiv.org/pdf/1808.04028v1.pdf
PWC	https://paperswithcode.com/paper/3d-geometry-aware-semantic-labeling-of
Repo
Framework

A hybrid approach of interpolations and CNN to obtain super-resolution


Title	A hybrid approach of interpolations and CNN to obtain super-resolution
Authors	Ram Krishna Pandey, A G Ramakrishnan
Abstract	We propose a novel architecture that learns an end-to-end mapping function to improve the spatial resolution of the input natural images. The model is unique in forming a nonlinear combination of three traditional interpolation techniques using the convolutional neural network. Another proposed architecture uses a skip connection with nearest neighbor interpolation, achieving nearly the same results. The architectures have been carefully designed to ensure that the reconstructed images lie precisely in the manifold of high-resolution images, thereby preserving the high-frequency components with fine details. We have compared our methods with state-of-the-art and recent deep-learning-based natural image super-resolution techniques and found that they preserve the sharp details in the image while also obtaining comparable or better PSNR. Since our methods use only traditional interpolations and a shallow CNN with fewer, smaller filters, the computational cost is kept low. We have reported the results of two proposed architectures on five standard datasets, for an upscale factor of 2. Our methods generalize well in most cases, which is evident from the better results obtained with increasingly complex datasets. For 4-times upscaling, we have designed similar architectures for comparing with other methods.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-05-23
URL	http://arxiv.org/abs/1805.09400v1
PDF	http://arxiv.org/pdf/1805.09400v1.pdf
PWC	https://paperswithcode.com/paper/a-hybrid-approach-of-interpolations-and-cnn
Repo
Framework
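
Two building blocks named in the abstract can be sketched in a few lines: nearest neighbor interpolation for an upscale factor of 2, and the PSNR metric used for comparison. This is an illustrative sketch on small grayscale images stored as nested lists, assuming 8-bit pixel values (peak 255). It does not reproduce the proposed CNN, and the function names are hypothetical.

```python
import math


def upscale_nearest(image, factor=2):
    """Nearest neighbor upscaling: repeat every pixel `factor` times along both axes."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 low-resolution image.
low_res = [[10, 20], [30, 40]]
up = upscale_nearest(low_res, factor=2)
print(up)

# PSNR for a uniform error of one gray level (mean squared error of 1).
score = psnr([[0, 0]], [[1, 1]])
print(round(score, 2))
```

Higher PSNR means a smaller mean squared error relative to the peak value; identical images give an infinite PSNR.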

New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution


Title	New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution
Authors	Yijie Bei, Alex Damian, Shijia Hu, Sachit Menon, Nikhil Ravi, Cynthia Rudin
Abstract	This work identifies and addresses two important technical challenges in single-image super-resolution: (1) how to upsample an image without magnifying noise and (2) how to preserve large scale structure when upsampling. We summarize the techniques we developed for our second place entry in Track 1 (Bicubic Downsampling), seventh place entry in Track 2 (Realistic Adverse Conditions), and seventh place entry in Track 3 (Realistic difficult) in the 2018 NTIRE Super-Resolution Challenge. Furthermore, we present new neural network architectures that specifically address the two challenges listed above: denoising and preservation of large-scale structure.
Tasks	Denoising, Image Super-Resolution, Super-Resolution
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03383v2
PDF	http://arxiv.org/pdf/1805.03383v2.pdf
PWC	https://paperswithcode.com/paper/new-techniques-for-preserving-global
Repo
Framework