January 30, 2020

Paper Group ANR 244

Nonparametric Contextual Bandits in an Unknown Metric Space


Title	Nonparametric Contextual Bandits in an Unknown Metric Space
Authors	Nirandika Wanigasekara, Christina Lee Yu
Abstract	Consider a nonparametric contextual multi-arm bandit problem where each arm $a \in [K]$ is associated to a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g. finite types or smooth with respect to an unknown metric space. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm’s dependence on the local geometry of the reward functions.
Tasks	Multi-Armed Bandits
Published	2019-08-03
URL	https://arxiv.org/abs/1908.01228v1
PDF	https://arxiv.org/pdf/1908.01228v1.pdf
PWC	https://paperswithcode.com/paper/nonparametric-contextual-bandits-in-an
Repo
Framework

Which Ads to Show? Advertisement Image Assessment with Auxiliary Information via Multi-step Modality Fusion


Title	Which Ads to Show? Advertisement Image Assessment with Auxiliary Information via Multi-step Modality Fusion
Authors	Kyung-Wha Park, JungHoon Lee, Sunyoung Kwon, Jung-Woo Ha, Kyung-Min Kim, Byoung-Tak Zhang
Abstract	Assessing aesthetic preference is a fundamental task related to human cognition. It can also contribute to various practical applications such as image creation for online advertisements. Despite crucial influences of image quality, auxiliary information of ad images such as tags and target subjects can also determine image preference. Existing studies mainly focus on images and thus are less useful for advertisement scenarios where rich auxiliary data are available. Here we propose a modality fusion-based neural network that evaluates the aesthetic preference of images with auxiliary information. Our method fully utilizes auxiliary data by introducing multi-step modality fusion using both conditional batch normalization-based low-level and attention-based high-level fusion mechanisms, inspired by the findings from statistical analyses on real advertisement data. Our approach achieved state-of-the-art performance on the AVA dataset, a widely used dataset for aesthetic assessment. Besides, the proposed method is evaluated on large-scale real-world advertisement image data with rich auxiliary attributes, providing promising preference prediction results. Through extensive experiments, we investigate how image and auxiliary information together influence click-through rate.
Tasks
Published	2019-10-06
URL	https://arxiv.org/abs/1910.02358v1
PDF	https://arxiv.org/pdf/1910.02358v1.pdf
PWC	https://paperswithcode.com/paper/which-ads-to-show-advertisement-image
Repo
Framework

Making Meaning: Semiotics Within Predictive Knowledge Architectures


Title	Making Meaning: Semiotics Within Predictive Knowledge Architectures
Authors	Alex Kearney, Oliver Oxton
Abstract	Within Reinforcement Learning, there is a fledgling approach to conceptualizing the environment in terms of predictions. Central to this predictive approach is the assertion that it is possible to construct ontologies in terms of predictions about sensation, behaviour, and time: to categorize the world into entities which express all aspects of the world using only predictions. This construction of ontologies is integral to predictive approaches to machine knowledge where objects are described exclusively in terms of how they are perceived. In this paper, we ground the Peircean model of semiotics in terms of Reinforcement Learning Methods, describing Peirce’s Three Categories in the notation of General Value Functions. Using the Peircean model of semiotics, we demonstrate that predictions alone are insufficient to construct an ontology; however, we identify predictions as being integral to the meaning-making process. Moreover, we discuss how predictive knowledge provides a particularly stable foundation for semiosis (the process of making meaning) and suggest a possible avenue of research to design algorithmic methods which construct semantics and meaning using predictions.
Tasks
Published	2019-04-18
URL	http://arxiv.org/abs/1904.09023v1
PDF	http://arxiv.org/pdf/1904.09023v1.pdf
PWC	https://paperswithcode.com/paper/making-meaning-semiotics-within-predictive
Repo
Framework

SkrGAN: Sketching-rendering Unconditional Generative Adversarial Networks for Medical Image Synthesis


Title	SkrGAN: Sketching-rendering Unconditional Generative Adversarial Networks for Medical Image Synthesis
Authors	Tianyang Zhang, Huazhu Fu, Yitian Zhao, Jun Cheng, Mengjie Guo, Zaiwang Gu, Bing Yang, Yuting Xiao, Shenghua Gao, Jiang Liu
Abstract	Generative Adversarial Networks (GANs) can synthesize images and have been successfully applied to medical image synthesis tasks. However, most existing methods merely consider the global contextual information and ignore the fine foreground structures, e.g., vessels and skeletons, which may contain diagnostic indicators for medical image analysis. Inspired by the human painting procedure, which is composed of stroking and color rendering steps, we propose a Sketching-rendering Unconditional Generative Adversarial Network (SkrGAN) to introduce a sketch prior constraint to guide the medical image generation. In our SkrGAN, a sketch guidance module is utilized to generate a high quality structural sketch from random noise, then a color render mapping is used to embed the sketch-based representations and resemble the background appearances. Experimental results show that the proposed SkrGAN achieves state-of-the-art results in synthesizing images for various image modalities, including retinal color fundus, X-Ray, Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). In addition, we show that the performance of a medical image segmentation method improves when our synthesized images are used as data augmentation.
Tasks	Computed Tomography (CT), Data Augmentation, Image Generation, Medical Image Generation, Medical Image Segmentation, Semantic Segmentation
Published	2019-08-06
URL	https://arxiv.org/abs/1908.04346v1
PDF	https://arxiv.org/pdf/1908.04346v1.pdf
PWC	https://paperswithcode.com/paper/skrgan-sketching-rendering-unconditional
Repo
Framework

6-DOF GraspNet: Variational Grasp Generation for Object Manipulation


Title	6-DOF GraspNet: Variational Grasp Generation for Object Manipulation
Authors	Arsalan Mousavian, Clemens Eppner, Dieter Fox
Abstract	Generating grasp poses is a crucial component for any robot object manipulation task. In this work, we formulate the problem of grasp generation as sampling a set of grasps using a variational autoencoder and assess and refine the sampled grasps using a grasp evaluator model. Both Grasp Sampler and Grasp Refinement networks take 3D point clouds observed by a depth camera as input. We evaluate our approach in simulation and real-world robot experiments. Our approach achieves 88% success rate on various commonly used objects with diverse appearances, scales, and weights. Our model is trained purely in simulation and works in the real world without any extra steps. The video of our experiments can be found at: https://research.nvidia.com/publication/2019-10_6-DOF-GraspNet%3A-Variational
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10520v2
PDF	https://arxiv.org/pdf/1905.10520v2.pdf
PWC	https://paperswithcode.com/paper/6-dof-graspnet-variational-grasp-generation
Repo
Framework

Learning to Predict Novel Noun-Noun Compounds


Title	Learning to Predict Novel Noun-Noun Compounds
Authors	Prajit Dhar, Lonneke van der Plas
Abstract	We introduce temporally and contextually-aware models for the novel task of predicting unseen but plausible concepts, as conveyed by noun-noun compounds in a time-stamped corpus. We train compositional models on observed compounds, more specifically the composed distributed representations of their constituents across a time-stamped corpus, while giving them corrupted instances (where the head or modifier is replaced by a random constituent) as negative evidence. The model captures generalisations over this data and learns what combinations give rise to plausible compounds and which ones do not. After training, we query the model for the plausibility of automatically generated novel combinations and verify whether the classifications are accurate. For our best model, we find that in around 85% of the cases, the novel compounds generated are attested in previously unseen data. An additional estimated 5% are plausible despite not being attested in the recent corpus, based on judgments from independent human raters.
Tasks
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03634v2
PDF	https://arxiv.org/pdf/1906.03634v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-predict-novel-noun-noun-compounds
Repo
Framework

Dynamic Trip-Vehicle Dispatch with Scheduled and On-Demand Requests


Title	Dynamic Trip-Vehicle Dispatch with Scheduled and On-Demand Requests
Authors	Taoan Huang, Bohui Fang, Xiaohui Bei, Fei Fang
Abstract	Transportation service providers that dispatch drivers and vehicles to riders have started to support both on-demand ride requests posted in real time and rides scheduled in advance, leading to new challenges which, to the best of our knowledge, have not been addressed by existing works. To fill the gap, we design novel trip-vehicle dispatch algorithms to handle both types of requests while taking into account an estimated request distribution of on-demand requests. At the core of the algorithms is the newly proposed Constrained Spatio-Temporal value function (CST-function), which is polynomial-time computable and represents the expected value a vehicle could gain with the constraint that it needs to arrive at a specific location at a given time. Built upon the CST-function, we design a randomized best-fit algorithm for scheduled requests and an online planning algorithm for on-demand requests given the scheduled requests as constraints. We evaluate the algorithms through extensive experiments on a real-world dataset of an online ride-hailing platform.
Tasks
Published	2019-07-20
URL	https://arxiv.org/abs/1907.08739v1
PDF	https://arxiv.org/pdf/1907.08739v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-trip-vehicle-dispatch-with-scheduled
Repo
Framework

Comparing EM with GD in Mixture Models of Two Components


Title	Comparing EM with GD in Mixture Models of Two Components
Authors	Guojun Zhang, Pascal Poupart, George Trimponias
Abstract	The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges to. In this paper, we study the regions where one component is missing in two-component mixture models, which we call one-cluster regions. We analyze the propensity of such regions to trap EM and gradient descent (GD) for mixtures of two Gaussians and mixtures of two Bernoullis. In the case of Gaussian mixtures, EM escapes one-cluster regions exponentially fast, while GD escapes them linearly fast. In the case of mixtures of Bernoullis, we find that there exist one-cluster regions that are stable for GD and therefore trap GD, but those regions are unstable for EM, allowing EM to escape. Those regions are local minima that appear universally in experiments and can be arbitrarily bad. This work implies that EM is less likely than GD to converge to certain bad local optima in mixture models.
Tasks
Published	2019-07-08
URL	https://arxiv.org/abs/1907.03783v3
PDF	https://arxiv.org/pdf/1907.03783v3.pdf
PWC	https://paperswithcode.com/paper/comparing-em-with-gd-in-mixture-models-of-two
Repo
Framework
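
To make the EM side of the comparison in the abstract above concrete, here is a minimal sketch of EM on a two-component Gaussian mixture. It assumes 1-D data and unit variance for both components, a simplification chosen for this illustration and not necessarily the paper's setting. The names `em_step`, `neg_log_likelihood`, and the toy `data` list are hypothetical and do not come from the paper.

```python
import math


def neg_log_likelihood(xs, weight, mu1, mu2):
    """Average negative log likelihood of weight*N(mu1, 1) + (1 - weight)*N(mu2, 1)."""
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for x in xs:
        p1 = weight * norm * math.exp(-0.5 * (x - mu1) ** 2)
        p2 = (1.0 - weight) * norm * math.exp(-0.5 * (x - mu2) ** 2)
        total -= math.log(p1 + p2)
    return total / len(xs)


def em_step(xs, weight, mu1, mu2):
    """One EM update for the unit-variance two-component mixture (assumed model)."""
    # E-step: responsibility of component 1 for each data point.
    resp = []
    for x in xs:
        p1 = weight * math.exp(-0.5 * (x - mu1) ** 2)
        p2 = (1.0 - weight) * math.exp(-0.5 * (x - mu2) ** 2)
        resp.append(p1 / (p1 + p2))
    # M-step: re-estimate the mixing weight and both means.
    n1 = sum(resp)
    n2 = len(xs) - n1
    new_weight = n1 / len(xs)
    new_mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
    new_mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
    return new_weight, new_mu1, new_mu2


# Toy data (made up for illustration): two well-separated groups.
data = [-2.3, -1.9, -2.1, -1.7, 2.2, 1.8, 2.0, 1.6]
params = (0.5, -0.5, 0.5)  # initial weight, mu1, mu2
for _ in range(20):
    params = em_step(data, *params)
```

Each call to `em_step` should not increase `neg_log_likelihood`, which is the standard monotonicity property of EM. Note that if the sketch is started with `mu1 == mu2`, both components receive identical updates and never separate; this is only a degenerate case of the code above, not a reproduction of the one-cluster analysis in the paper.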

DispVoxNets: Non-Rigid Point Set Alignment with Supervised Learning Proxies


Title	DispVoxNets: Non-Rigid Point Set Alignment with Supervised Learning Proxies
Authors	Soshi Shimada, Vladislav Golyanik, Edgar Tretschk, Didier Stricker, Christian Theobalt
Abstract	We introduce a supervised-learning framework for non-rigid point set alignment of a new kind - Displacements on Voxels Networks (DispVoxNets) - which abstracts away from the point set representation and regresses 3D displacement fields on regularly sampled proxy 3D voxel grids. Thanks to recently released collections of deformable objects with known intra-state correspondences, DispVoxNets learn a deformation model and further priors (e.g., weak point topology preservation) for different object categories such as cloths, human bodies and faces. DispVoxNets cope with large deformations, noise and clustered outliers more robustly than the state-of-the-art. At test time, our approach runs orders of magnitude faster than previous techniques. All properties of DispVoxNets are ascertained numerically and qualitatively in extensive experiments and comparisons to several previous methods.
Tasks
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10367v2
PDF	https://arxiv.org/pdf/1907.10367v2.pdf
PWC	https://paperswithcode.com/paper/dispvoxnets-non-rigid-point-set-alignment
Repo
Framework

A Type-coherent, Expressive Representation as an Initial Step to Language Understanding


Title	A Type-coherent, Expressive Representation as an Initial Step to Language Understanding
Authors	Gene Louis Kim, Lenhart Schubert
Abstract	A growing interest in tasks involving language understanding by the NLP community has led to the need for effective semantic parsing and inference. Modern NLP systems use semantic representations that do not quite fulfill the nuanced needs for language understanding: adequately modeling language semantics, enabling general inferences, and being accurately recoverable. This document describes underspecified logical forms (ULF) for Episodic Logic (EL), which is an initial form for a semantic representation that balances these needs. ULFs fully resolve the semantic type structure while leaving issues such as quantifier scope, word sense, and anaphora unresolved; they provide a starting point for further resolution into EL, and enable certain structural inferences without further resolution. This document also presents preliminary results of creating a hand-annotated corpus of ULFs for the purpose of training a precise ULF parser, showing a three-person pairwise interannotator agreement of 0.88 on confident annotations. We hypothesize that a divide-and-conquer approach to semantic parsing starting with derivation of ULFs will lead to semantic analyses that do justice to subtle aspects of linguistic meaning, and will enable construction of more accurate semantic parsers.
Tasks	Semantic Parsing
Published	2019-03-22
URL	http://arxiv.org/abs/1903.09333v2
PDF	http://arxiv.org/pdf/1903.09333v2.pdf
PWC	https://paperswithcode.com/paper/a-type-coherent-expressive-representation-as
Repo
Framework

Attribute noise robust binary classification


Title	Attribute noise robust binary classification
Authors	Aditya Petety, Sandhya Tripathi, N Hemachandra
Abstract	We consider the problem of learning linear classifiers when both features and labels are binary. In addition, the features are noisy, i.e., they could be flipped with an unknown probability. In the Sy-De attribute noise model, where all features could be noisy together with the same probability, we show that the $0$-$1$ loss ($l_{0-1}$) need not be robust, but a popular surrogate, the squared loss ($l_{sq}$), is. In the Asy-In attribute noise model, we prove that $l_{0-1}$ is robust for any distribution over a 2-dimensional feature space. However, due to the computational intractability of $l_{0-1}$, we resort to $l_{sq}$ and observe that it need not be Asy-In noise robust. Our empirical results support the Sy-De robustness of squared loss for low to moderate noise rates.
Tasks
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07875v1
PDF	https://arxiv.org/pdf/1911.07875v1.pdf
PWC	https://paperswithcode.com/paper/attribute-noise-robust-binary-classification
Repo
Framework
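
The two losses and the noise process named in the abstract above can be sketched as follows. This is one reading of the abstract's description, under stated assumptions: features and labels are encoded as -1/+1 (the abstract only says they are binary), and "all features noisy together with the same probability" is modeled as negating the whole feature vector with a single probability. All function names are hypothetical.

```python
import random


def flip_all_features(x, flip_prob, rng):
    """With probability flip_prob, negate every +/-1 feature of x at once (assumed noise model)."""
    if rng.random() < flip_prob:
        return [-v for v in x]
    return list(x)


def linear_score(w, b, x):
    """Score of a linear classifier with weights w and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def zero_one_loss(w, b, x, y):
    """0 if the sign of the score agrees with the label y in {-1, +1}, else 1."""
    return 0.0 if y * linear_score(w, b, x) > 0 else 1.0


def squared_loss(w, b, x, y):
    """Squared difference between the label and the linear score."""
    return (y - linear_score(w, b, x)) ** 2


rng = random.Random(0)
clean = [1, -1]
noisy = flip_all_features(clean, 0.3, rng)  # 0.3 is an arbitrary illustrative noise rate
```

Comparing the average of each loss on clean and noisy copies of a dataset is one simple way to probe, empirically, whether the minimizer changes under noise; the paper's robustness results are theoretical and are not reproduced by this sketch.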

Bayesian Topological Learning for Brain State Classification


Title	Bayesian Topological Learning for Brain State Classification
Authors	Farzana Nasrin, Christopher Oballe, David L. Boothe, Vasileios Maroulas
Abstract	Investigation of human brain states through electroencephalograph (EEG) signals is a crucial step in human-machine communications. However, classifying and analyzing EEG signals are challenging due to their noisy, nonlinear and nonstationary nature. Current methodologies for analyzing these signals often fall short because they have several regularity assumptions baked in. This work provides an effective, flexible and noise-resilient scheme to analyze EEG by extracting pertinent information while abiding by the 3N (noisy, nonlinear and nonstationary) nature of data. We implement a topological tool, namely persistent homology, that tracks the evolution of topological features over time intervals and incorporates individual’s expectations as prior knowledge by means of a Bayesian framework to compute posterior distributions. Relying on these posterior distributions, we apply Bayes factor classification to noisy EEG measurements. The performance of this Bayesian classification scheme is then compared with other existing methods for EEG signals.
Tasks	EEG
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08348v1
PDF	https://arxiv.org/pdf/1912.08348v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-topological-learning-for-brain-state
Repo
Framework

Capsule Attention for Multimodal EEG and EOG Spatiotemporal Representation Learning with Application to Driver Vigilance Estimation


Title	Capsule Attention for Multimodal EEG and EOG Spatiotemporal Representation Learning with Application to Driver Vigilance Estimation
Authors	Guangyi Zhang, Ali Etemad
Abstract	Driver vigilance estimation is an important task for transportation safety. Wearable and portable brain-computer interface devices provide a powerful means for real-time monitoring of the vigilance level of drivers, thus helping to avoid distracted or impaired driving. In this paper, we propose a novel multimodal architecture for in-vehicle vigilance estimation from electroencephalogram (EEG) and electrooculogram (EOG) signals. However, most current works in the area lack an effective framework for learning the part-whole relationships within the data and learning useful spatiotemporal representations. To tackle this problem and other issues associated with multimodal biological signal analysis, we propose an architecture composed of a capsule attention mechanism following a deep Long Short-Term Memory (LSTM) network. Our model learns both temporal and hierarchical/spatial dependencies in the data through the LSTM and capsule feature representation layers. To better explore the discriminative ability of the learned representations, we study the effect of the proposed capsule attention mechanism including the number of dynamic routing iterations as well as other parameters. Experiments show the robustness of our method by outperforming other solutions and baseline techniques, setting a new state-of-the-art.
Tasks	EEG, Representation Learning
Published	2019-12-17
URL	https://arxiv.org/abs/1912.07812v2
PDF	https://arxiv.org/pdf/1912.07812v2.pdf
PWC	https://paperswithcode.com/paper/capsule-attention-for-multimodal-eeg-and-eog
Repo
Framework

Background subtraction on depth videos with convolutional neural networks


Title	Background subtraction on depth videos with convolutional neural networks
Authors	Xueying Wang, Lei Liu, Guangli Li, Xiao Dong, Peng Zhao, Xiaobing Feng
Abstract	Background subtraction is a significant component of computer vision systems. It is widely used in video surveillance, object tracking, anomaly detection, etc. A new data source for background subtraction has appeared with the emergence of low-cost depth sensors like Microsoft Kinect, Asus Xtion PRO, etc. In this paper, we propose a background subtraction approach on depth videos, which is based on convolutional neural networks (CNNs), called BGSNet-D (BackGround Subtraction neural Networks for Depth videos). The method can be used in scenarios where color is unavailable, such as poor lighting situations, and can also be combined with existing RGB background subtraction methods. A preprocessing strategy is designed to reduce the influences incurred by noise from depth sensors. The experimental results on the SBM-RGBD dataset show that the proposed method outperforms existing methods on depth data.
Tasks	Anomaly Detection, Object Tracking
Published	2019-01-17
URL	http://arxiv.org/abs/1901.05676v1
PDF	http://arxiv.org/pdf/1901.05676v1.pdf
PWC	https://paperswithcode.com/paper/background-subtraction-on-depth-videos-with
Repo
Framework

Minimizing Impurity Partition Under Constraints


Title	Minimizing Impurity Partition Under Constraints
Authors	Thuan Nguyen, Thinh Nguyen
Abstract	Set partitioning is a key component of many algorithms in machine learning, signal processing, and communications. In general, the problem of finding a partition that minimizes a given impurity (loss function) is NP-hard. As such, there exists a wealth of literature on approximate algorithms and theoretical analyses of the partitioning problem under different settings. In this paper, we formulate and solve a variant of the partition problem called the minimum impurity partition under constraint (MIPUC). MIPUC finds an optimal partition that minimizes a given loss function under a given concave constraint. MIPUC generalizes the recently proposed deterministic information bottleneck problem which finds an optimal partition that maximizes the mutual information between the input and partition output while minimizing the partition output entropy. Our proposed algorithm is developed based on a novel optimality condition, which allows us to find a locally optimal solution efficiently. Moreover, we show that the optimal partition produces a hard partition that is equivalent to the cuts by hyperplanes in the probability space of the posterior probability that finally yields a polynomial time complexity algorithm to find the globally optimal partition. Both theoretical and numerical results are provided to validate the proposed algorithm.
Tasks
Published	2019-12-31
URL	https://arxiv.org/abs/1912.13141v1
PDF	https://arxiv.org/pdf/1912.13141v1.pdf
PWC	https://paperswithcode.com/paper/minimizing-impurity-partition-under
Repo
Framework
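
The abstract above mentions the deterministic information bottleneck trade-off between mutual information and partition output entropy. The sketch below only evaluates those two quantities for a given hard partition; it is not the MIPUC algorithm. It assumes a discrete joint distribution `joint[x][y]` over an input X and an observation Y, and a partition that maps each value of Y to a part Z. The function name, argument layout, and toy numbers are hypothetical.

```python
import math


def partition_scores(joint, assignment, num_parts):
    """Return (I(X;Z), H(Z)) in bits for a hard partition of the Y values.

    joint[x][y] is p(x, y); assignment[y] is the index of the part holding y.
    """
    num_x = len(joint)
    # Merge the columns of the joint distribution that fall into the same part.
    pxz = [[0.0] * num_parts for _ in range(num_x)]
    for x in range(num_x):
        for y, z in enumerate(assignment):
            pxz[x][z] += joint[x][y]
    px = [sum(row) for row in pxz]
    pz = [sum(pxz[x][z] for x in range(num_x)) for z in range(num_parts)]
    # Mutual information between the input and the partition output.
    mutual_info = 0.0
    for x in range(num_x):
        for z in range(num_parts):
            p = pxz[x][z]
            if p > 0.0:
                mutual_info += p * math.log2(p / (px[x] * pz[z]))
    # Entropy of the partition output.
    entropy_z = -sum(p * math.log2(p) for p in pz if p > 0.0)
    return mutual_info, entropy_z


# Toy joint distribution (made up): X has 2 values, Y has 4 values.
toy_joint = [[0.4, 0.1, 0.0, 0.0],
             [0.0, 0.0, 0.1, 0.4]]
informative = partition_scores(toy_joint, [0, 0, 1, 1], 2)
trivial = partition_scores(toy_joint, [0, 0, 0, 0], 1)
```

On this toy distribution, grouping the first two and the last two values of Y keeps one full bit about X at a cost of one bit of output entropy, while the single-part partition keeps nothing and costs nothing. Scoring candidate partitions this way is a brute-force illustration of the objective; the polynomial-time search described in the abstract is not implemented here.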