Paper Group ANR 1446
Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders. The VIA Annotation Software for Images, Audio and Video. Referring Expression Generation Using Entity Profiles. Leveraging Pretrained Image Classifiers for Language-Based Segmentation. Differentiable Representations For Multihop Inference Rules. Efficient average-case population recovery in the presence of insertions and deletions. Generalization of Reinforcement Learners with Working and Episodic Memory. Comprehensive Personalized Ranking Using One-Bit Comparison Data. My lips are concealed: Audio-visual speech enhancement through obstructions. User Validation of Recommendation Serendipity Metrics. Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence. MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement. Distributed Learning for Channel Allocation Over a Shared Spectrum. Learning a Curve Guardian for Motorcycles. A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems.
Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders
Title | Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders |
Authors | Nina Miolane, Susan Holmes |
Abstract | Manifold-valued data naturally arises in medical imaging. In cognitive neuroscience, for instance, brain connectomes base the analysis of coactivation patterns between different brain regions on the analysis of the correlations of their functional Magnetic Resonance Imaging (fMRI) time series - an object thus constrained by construction to belong to the manifold of symmetric positive definite matrices. One of the challenges that naturally arises consists of finding a lower-dimensional subspace for representing such manifold-valued data. Traditional techniques, like principal component analysis, are ill-adapted to tackle non-Euclidean spaces and may fail to achieve a lower-dimensional representation of the data - thus potentially pointing to the absence of lower-dimensional representation of the data. However, these techniques are restricted in that: (i) they do not leverage the assumption that the connectomes belong to a pre-specified manifold, therefore discarding information; (ii) they can only fit a linear subspace to the data. In this paper, we are interested in variants that learn potentially highly curved submanifolds of manifold-valued data. Motivated by the brain connectomes example, we investigate a latent variable generative model, which has the added benefit of providing us with uncertainty estimates - a crucial quantity in the medical applications we are considering. While latent variable models have been proposed to learn linear and nonlinear spaces for Euclidean data, or geodesic subspaces for manifold data, no intrinsic latent variable model exists to learn nongeodesic subspaces for manifold data. This paper fills this gap and formulates a Riemannian variational autoencoder with an intrinsic generative model of manifold-valued data. We evaluate its performance on synthetic and real datasets by introducing the formalism of weighted Riemannian submanifolds. |
Tasks | Latent Variable Models, Time Series |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08147v1 |
https://arxiv.org/pdf/1911.08147v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-weighted-submanifolds-with |
Repo | |
Framework | |
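As a concrete illustration of the idea above (a generative model whose outputs are constrained to a manifold), here is a minimal NumPy/SciPy sketch: a Euclidean latent code is decoded to a symmetric matrix and pushed through the matrix exponential, so every generated sample lies on the SPD manifold by construction. The decoder, sizes, and names are invented for illustration; this is not the Riemannian VAE proposed in the paper.

```python
# Illustrative sketch (not the authors' code): a toy generative model whose
# outputs are SPD matrices by construction, obtained by decoding a Euclidean
# latent code to a symmetric matrix and applying the matrix exponential.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
d, latent_dim = 4, 2                          # 4x4 SPD matrices, 2-d latent space
W = rng.normal(scale=0.3, size=(latent_dim, d * (d + 1) // 2))   # stand-in "decoder"
iu = np.triu_indices(d)

def decode(z):
    """Map a latent code to an SPD matrix via a tangent (symmetric) matrix."""
    sym = np.zeros((d, d))
    sym[iu] = z @ W                           # fill the upper triangle
    sym = sym + sym.T - np.diag(np.diag(sym)) # symmetrize, keeping the diagonal once
    return expm(sym)                          # matrix exponential => SPD output

z = rng.normal(size=latent_dim)               # latent prior z ~ N(0, I)
x = decode(z)
print(np.allclose(x, x.T), bool(np.all(np.linalg.eigvalsh(x) > 0)))   # symmetric, PD
```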
The VIA Annotation Software for Images, Audio and Video
Title | The VIA Annotation Software for Images, Audio and Video |
Authors | Abhishek Dutta, Andrew Zisserman |
Abstract | In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a lightweight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application. |
Tasks | |
Published | 2019-04-24 |
URL | https://arxiv.org/abs/1904.10699v3 |
https://arxiv.org/pdf/1904.10699v3.pdf | |
PWC | https://paperswithcode.com/paper/the-vgg-image-annotator-via |
Repo | |
Framework | |
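Since the abstract highlights that VIA annotations export to JSON/CSV for further processing, here is a hypothetical reader for a VIA-2-style JSON project export. The key names used below ("filename", "regions", "shape_attributes", "region_attributes") reflect the commonly seen export layout but should be treated as assumptions, not a guaranteed schema.

```python
# Hypothetical reader for a VIA-style JSON project export. The keys below
# ("filename", "regions", "shape_attributes", "region_attributes") follow the
# commonly seen VIA 2 export layout and are an assumption, not a specification.
import json

def load_via_regions(path):
    with open(path) as f:
        project = json.load(f)
    records = []
    for entry in project.values():            # one entry per annotated file
        for region in entry.get("regions", []):
            records.append({
                "filename": entry.get("filename"),
                "shape": region.get("shape_attributes", {}).get("name"),
                "shape_attributes": region.get("shape_attributes", {}),
                "labels": region.get("region_attributes", {}),
            })
    return records

# Example (path is hypothetical):
# for rec in load_via_regions("via_project_export.json"):
#     print(rec["filename"], rec["shape"], rec["labels"])
```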
Referring Expression Generation Using Entity Profiles
Title | Referring Expression Generation Using Entity Profiles |
Authors | Meng Cao, Jackie Chi Kit Cheung |
Abstract | Referring Expression Generation (REG) is the task of generating contextually appropriate references to entities. A limitation of existing REG systems is that they rely on entity-specific supervised training, which means that they cannot handle entities not seen during training. In this study, we address this in two ways. First, we propose task setups in which we specifically test a REG system’s ability to generalize to entities not seen during training. Second, we propose a profile-based deep neural network model, ProfileREG, which encodes both the local context and an external profile of the entity to generate reference realizations. Our model generates tokens by learning to choose between generating pronouns, generating from a fixed vocabulary, or copying a word from the profile. We evaluate our model on three different splits of the WebNLG dataset, and show that it outperforms competitive baselines in all settings according to automatic and human evaluations. |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01528v1 |
https://arxiv.org/pdf/1909.01528v1.pdf | |
PWC | https://paperswithcode.com/paper/referring-expression-generation-using-entity |
Repo | |
Framework | |
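A toy NumPy sketch of the three-way generation choice described in the abstract (emit a pronoun, generate from a fixed vocabulary, or copy a token from the entity profile), written as a gated mixture of distributions. The vocabulary, profile, and probabilities are invented; in ProfileREG they would come from the learned decoder state.

```python
# Toy sketch of a three-way generation switch (pronoun / vocabulary / copy).
# All probabilities are random stand-ins for quantities a decoder would produce.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vocab = ["<pron:he>", "<pron:she>", "the", "author", "novelist", "wrote"]
profile_tokens = ["Jane", "Austen", "English", "novelist"]

rng = np.random.default_rng(1)
p_mode = softmax(rng.normal(size=3))                 # P(pronoun), P(vocab), P(copy)
p_pronoun = softmax(rng.normal(size=2))              # over the two pronoun tokens
p_vocab = softmax(rng.normal(size=len(vocab)))       # over the fixed vocabulary
attn = softmax(rng.normal(size=len(profile_tokens))) # copy attention over the profile

tokens = vocab + profile_tokens                      # union of output choices
p_final = np.zeros(len(tokens))
p_final[:2] += p_mode[0] * p_pronoun
p_final[:len(vocab)] += p_mode[1] * p_vocab
p_final[len(vocab):] += p_mode[2] * attn
print(tokens[int(p_final.argmax())], round(float(p_final.sum()), 6))   # sums to 1.0
```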
Leveraging Pretrained Image Classifiers for Language-Based Segmentation
Title | Leveraging Pretrained Image Classifiers for Language-Based Segmentation |
Authors | David Golub, Ahmed El-Kishky, Roberto Martín-Martín |
Abstract | Current semantic segmentation models cannot easily generalize to new object classes unseen during train time: they require additional annotated images and retraining. We propose a novel segmentation model that injects visual priors into semantic segmentation architectures, allowing them to segment out new target labels without retraining. As visual priors, we use the activations of pretrained image classifiers, which provide noisy indications of the spatial location of both the target object and distractor objects in the scene. We leverage language semantics to obtain these activations for a target label unseen by the classifier. Further experiments show that the visual priors obtained via language semantics for both relevant and distracting objects are key to our performance. |
Tasks | Semantic Segmentation |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00830v3 |
https://arxiv.org/pdf/1911.00830v3.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-pretrained-image-classifiers-for |
Repo | |
Framework | |
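A rough sketch of how noisy classifier activations for a target label and for distractor labels could be combined into a segmentation prior, in the spirit of the abstract. The specific combination rule below is an assumption for illustration, not the paper's formulation.

```python
# Illustrative combination of noisy class activation maps into a segmentation
# prior: keep locations where the target label activates more strongly than
# any distractor label. The combination rule is an assumption, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
H = W = 8
cam_target = rng.random((H, W))             # activation map for the target label
cam_distractors = rng.random((3, H, W))     # maps for three distractor labels

prior = np.clip(cam_target - cam_distractors.max(axis=0), 0.0, None)
prior /= prior.max() + 1e-8                 # normalize to [0, 1]
mask = prior > 0.5                          # crude binary prior for the target
print(float(mask.mean()))
```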
Differentiable Representations For Multihop Inference Rules
Title | Differentiable Representations For Multihop Inference Rules |
Authors | William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler |
Abstract | We present efficient differentiable implementations of second-order multi-hop reasoning using a large symbolic knowledge base (KB). We introduce a new operation which can be used to compositionally construct second-order multi-hop templates in a neural model, and evaluate a number of alternative implementations, with different time and memory trade-offs. These techniques scale to KBs with millions of entities and tens of millions of triples, and lead to simple models with competitive performance on several learning tasks requiring multi-hop reasoning. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10417v1 |
https://arxiv.org/pdf/1905.10417v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-representations-for-multihop |
Repo | |
Framework | |
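One standard way to make multi-hop reasoning over a symbolic KB differentiable is to encode entities as (soft) one-hot vectors and each relation as a sparse matrix, so that following a relation is a sparse matrix-vector product and a hop can be softly weighted over relations. The tiny example below illustrates that general mechanism; it is not the specific second-order operation introduced in the paper.

```python
# Minimal sketch of differentiable hops over a tiny KB: entities are (soft)
# one-hot vectors, each relation r is a sparse matrix M_r, and following r
# from x is the product M_r^T x. A soft weighting over relations makes the
# choice of relation itself continuous.
import numpy as np
from scipy.sparse import csr_matrix

entities = ["paris", "france", "europe"]
idx = {e: i for i, e in enumerate(entities)}
n = len(entities)

def relation(pairs):
    rows, cols = zip(*[(idx[h], idx[t]) for h, t in pairs])
    return csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))

M = {
    "capital_of": relation([("paris", "france")]),
    "located_in": relation([("france", "europe")]),
}

x = np.zeros(n); x[idx["paris"]] = 1.0                 # start at "paris"
w = np.array([0.9, 0.1])                                # soft choice of the first relation
hop1 = w[0] * (M["capital_of"].T @ x) + w[1] * (M["located_in"].T @ x)
hop2 = M["located_in"].T @ hop1                         # hard-coded second hop
print(entities[int(hop2.argmax())])                     # -> "europe"
```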
Efficient average-case population recovery in the presence of insertions and deletions
Title | Efficient average-case population recovery in the presence of insertions and deletions |
Authors | Frank Ban, Xi Chen, Rocco A. Servedio, Sandip Sinha |
Abstract | Several recent works have considered the \emph{trace reconstruction problem}, in which an unknown source string $x \in \{0,1\}^n$ is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a \emph{trace} of $x$. The goal is to reconstruct the original string $x$ from independent traces of $x$. While the best algorithms known for worst-case strings use $\exp(O(n^{1/3}))$ traces \cite{DOS17,NazarovPeres17}, highly efficient algorithms are known \cite{PZ17,HPP18} for the \emph{average-case} version, in which $x$ is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call \emph{average-case population recovery in the presence of insertions and deletions}. In this problem, there is an unknown distribution $\cal{D}$ over $s$ unknown source strings $x^1,\dots,x^s \in \{0,1\}^n$, and each sample is independently generated by drawing some $x^i$ from $\cal{D}$ and returning an independent trace of $x^i$. Building on \cite{PZ17} and \cite{HPP18}, we give an efficient algorithm for this problem. For any support size $s \leq \smash{\exp(\Theta(n^{1/3}))}$, for a $1-o(1)$ fraction of all $s$-element support sets $\{x^1,\dots,x^s\} \subset \{0,1\}^n$, for every distribution $\cal{D}$ supported on $\{x^1,\dots,x^s\}$, our algorithm efficiently recovers ${\cal D}$ up to total variation distance $\epsilon$ with high probability, given access to independent traces of independent draws from $\cal{D}$. The algorithm runs in time poly$(n,s,1/\epsilon)$ and its sample complexity is poly$(s,1/\epsilon,\exp(\log^{1/3}n))$. This polynomial dependence on the support size $s$ is in sharp contrast with the \emph{worst-case} version (when $x^1,\dots,x^s$ may be any strings in $\{0,1\}^n$), in which the sample complexity of the most efficient known algorithm \cite{BCFSS19} is doubly exponential in $s$. |
Tasks | |
Published | 2019-07-12 |
URL | https://arxiv.org/abs/1907.05964v1 |
https://arxiv.org/pdf/1907.05964v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-average-case-population-recovery-in |
Repo | |
Framework | |
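The abstract's channel model (random deletions and random bit insertions producing a trace) is easy to simulate; the toy generator below does exactly that under an assumed independent per-position parameterization, which may differ in detail from the cited works.

```python
# Toy simulator of an insertion/deletion channel producing "traces" of a
# source string. The per-position parameterization here is an assumption and
# may differ from the channel models used in the cited works.
import random

def trace(x, rng, p_del=0.1, p_ins=0.1):
    out = []
    for bit in x:
        while rng.random() < p_ins:          # insert random bits before this position
            out.append(rng.randint(0, 1))
        if rng.random() >= p_del:            # keep the bit unless it is deleted
            out.append(bit)
    return out

src_rng = random.Random(1)
source = [src_rng.randint(0, 1) for _ in range(20)]
traces = [trace(source, random.Random(i)) for i in range(5)]
print([len(t) for t in traces])
```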
Generalization of Reinforcement Learners with Working and Episodic Memory
Title | Generalization of Reinforcement Learners with Working and Episodic Memory |
Authors | Meire Fortunato, Melissa Tan, Ryan Faulkner, Steven Hansen, Adrià Puigdomènech Badia, Gavin Buttimore, Charlie Deck, Joel Z Leibo, Charles Blundell |
Abstract | Memory is an important aspect of intelligence and plays a role in many deep reinforcement learning models. However, little progress has been made in understanding when specific memory systems help more than others and how well they generalize. The field also has yet to see a prevalent consistent and rigorous approach for evaluating agent performance on holdout data. In this paper, we aim to develop a comprehensive methodology to test different kinds of memory in an agent and assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions that we suggest are relevant for evaluating memory-specific generalization. To that end, we first construct a diverse set of memory tasks that allow us to evaluate test-time generalization across multiple dimensions. Second, we develop and perform multiple ablations on an agent architecture that combines multiple memory systems, observe its baseline models, and investigate its performance against the task suite. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13406v2 |
https://arxiv.org/pdf/1910.13406v2.pdf | |
PWC | https://paperswithcode.com/paper/191013406 |
Repo | |
Framework | |
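For readers unfamiliar with the memory systems being ablated, here is a generic toy episodic memory (store embeddings, retrieve by similarity). It is a stand-in for the kind of component such agents combine with working memory, not the architecture evaluated in the paper.

```python
# Generic toy episodic memory: store embeddings of past observations, retrieve
# the most similar ones at query time. A stand-in, not the paper's architecture.
import numpy as np

class EpisodicMemory:
    def __init__(self, dim, capacity=1000):
        self.keys = np.zeros((capacity, dim))
        self.values = np.zeros((capacity, dim))
        self.size, self.capacity = 0, capacity

    def write(self, key, value):
        i = self.size % self.capacity         # overwrite the oldest slot when full
        self.keys[i], self.values[i] = key, value
        self.size += 1

    def read(self, query, k=3):
        n = min(self.size, self.capacity)
        sims = self.keys[:n] @ query          # dot-product similarity
        top = np.argsort(sims)[-k:]
        return self.values[top].mean(axis=0)  # simple aggregation of the top-k values

rng = np.random.default_rng(0)
mem = EpisodicMemory(dim=8)
for _ in range(50):
    obs = rng.normal(size=8)
    mem.write(obs, obs)
print(mem.read(rng.normal(size=8)).shape)
```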
Comprehensive Personalized Ranking Using One-Bit Comparison Data
Title | Comprehensive Personalized Ranking Using One-Bit Comparison Data |
Authors | Aria Ameri, Arindam Bose, Mojtaba Soltanalian |
Abstract | The task of a personalization system is to recommend items or a set of items according to the users’ taste, and thus predict their future needs. In this paper, we address such personalized recommendation problems for which one-bit comparison data of user preferences for different items as well as the different user inclinations toward an item are available. We devise a comprehensive personalized ranking (CPR) system by employing a Bayesian treatment. We also provide a connection to the learning method with respect to the CPR optimization criterion to learn the underlying low-rank structure of the rating matrix based on the well-established matrix factorization method. Numerical results are provided to verify the performance of our algorithm. |
Tasks | |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02408v1 |
https://arxiv.org/pdf/1906.02408v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-personalized-ranking-using-one |
Repo | |
Framework | |
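A minimal sketch of learning a low-rank user/item model from one-bit comparison data with a pairwise logistic update. This is a generic matrix-factorization baseline for the data setting described in the abstract, not the paper's exact CPR criterion or its Bayesian treatment.

```python
# Sketch of learning a low-rank model from one-bit comparisons (user u prefers
# item i over item j) via a pairwise logistic update. A generic baseline, not
# the paper's CPR criterion.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank, lr, reg = 30, 40, 5, 0.05, 0.01
U = 0.1 * rng.normal(size=(n_users, rank))
V = 0.1 * rng.normal(size=(n_items, rank))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic one-bit comparisons (u, i, j): user u prefers item i over item j.
comparisons = [(rng.integers(n_users), rng.integers(n_items), rng.integers(n_items))
               for _ in range(2000)]

for u, i, j in comparisons:
    u_vec, diff = U[u].copy(), V[i] - V[j]
    g = sigmoid(u_vec @ diff) - 1.0           # d(-log sigmoid(margin)) / d(margin)
    U[u] -= lr * (g * diff + reg * U[u])
    V[i] -= lr * (g * u_vec + reg * V[i])
    V[j] -= lr * (-g * u_vec + reg * V[j])

print(np.argsort(-(U[0] @ V.T))[:5])          # top-5 ranked items for user 0
```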
My lips are concealed: Audio-visual speech enhancement through obstructions
Title | My lips are concealed: Audio-visual speech enhancement through obstructions |
Authors | Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman |
Abstract | Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech enhancement network that is able to separate a speaker’s voice by conditioning on both the speaker’s lip movements and/or a representation of their voice. The voice representation can be obtained by either (i) enrollment, or (ii) by self-enrollment – learning the representation on-the-fly given sufficient unobstructed visual input. The model is trained by blending audios, and by introducing artificial occlusions around the mouth region that prevent the visual modality from dominating. The method is speaker-independent, and we demonstrate it on real examples of speakers unheard (and unseen) during training. The method also improves over previous models in particular for cases of occlusion in the visual modality. |
Tasks | Speech Enhancement |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.04975v1 |
https://arxiv.org/pdf/1907.04975v1.pdf | |
PWC | https://paperswithcode.com/paper/my-lips-are-concealed-audio-visual-speech |
Repo | |
Framework | |
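The abstract mentions two training-time tricks: blending audios to form mixtures, and artificially occluding the mouth region so the visual stream cannot dominate. The toy code below mimics both under assumed array shapes; it is illustrative only.

```python
# Toy version of the two training-time augmentations mentioned in the abstract:
# blend a target audio with an interferer, and occlude a random span of the
# mouth-region frames. Shapes and sampling rate are assumptions.
import numpy as np

rng = np.random.default_rng(0)
target_audio = rng.normal(size=16000)            # 1 s of audio at an assumed 16 kHz
interferer = rng.normal(size=16000)
video = rng.normal(size=(25, 64, 64))            # 25 mouth-region frames (assumed)

def blend(a, b, snr_db=0.0):
    """Mix b into a at the requested signal-to-noise ratio."""
    scale = np.sqrt(np.mean(a ** 2) / (np.mean(b ** 2) * 10 ** (snr_db / 10)))
    return a + scale * b

def occlude(frames, max_len=10):
    """Blank a random temporal span of the video frames."""
    frames = frames.copy()
    start = rng.integers(0, len(frames) - max_len)
    frames[start:start + rng.integers(1, max_len + 1)] = 0.0
    return frames

mixture = blend(target_audio, interferer, snr_db=0.0)
occluded = occlude(video)
print(mixture.shape, bool((occluded == 0).any()))
```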
User Validation of Recommendation Serendipity Metrics
Title | User Validation of Recommendation Serendipity Metrics |
Authors | Li Chen, Ningxia Wang, Yonghua Yang, Keping Yang, Quan Yuan |
Abstract | Though it has been recognized that recommending serendipitous (i.e., surprising and relevant) items can be helpful for increasing users’ satisfaction and behavioral intention, how to measure serendipity in the offline environment is still an open issue. In recent years, a number of metrics have been proposed, but most of them were based on researchers’ assumptions due to serendipity’s subjective nature. In order to validate these metrics’ actual performance, we collected over 10,000 users’ real feedback data and compared it with the metrics’ results. It turns out that user-profile-based metrics, especially content-based ones, perform better than those based on item popularity, in terms of estimating the unexpectedness facet of recommendations. Moreover, the full metrics, which involve the unexpectedness component, relevance, timeliness, and user curiosity, can more accurately indicate the recommendation’s serendipity degree, relative to those that just involve some of them. The application of these metrics to several recommender algorithms further consolidates their practical usage, because the comparison results are consistent with those from user evaluation. Thus, this work is constructive for filling the gap between offline measurement and user study on recommendation serendipity. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11431v1 |
https://arxiv.org/pdf/1906.11431v1.pdf | |
PWC | https://paperswithcode.com/paper/user-validation-of-recommendation-serendipity |
Repo | |
Framework | |
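Definitions of serendipity metrics vary across the literature; as one simple instantiation of the components discussed above (content-based unexpectedness combined with relevance), consider the sketch below. It is not any specific metric evaluated in the paper.

```python
# One simple content-based instantiation of the components discussed above:
# unexpectedness as dissimilarity from the user's profile items, multiplied by
# a relevance term. Not a specific metric from the paper.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def serendipity(item_vec, profile_vecs, relevance):
    unexpectedness = 1.0 - max(cosine(item_vec, p) for p in profile_vecs)
    return unexpectedness * relevance            # both components assumed in [0, 1]

rng = np.random.default_rng(0)
profile = [rng.random(16) for _ in range(20)]    # content vectors of consumed items
candidate = rng.random(16)                       # content vector of a recommendation
print(round(serendipity(candidate, profile, relevance=0.8), 3))
```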
Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence
Title | Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence |
Authors | Indranil Chakraborty, Deboleena Roy, Isha Garg, Aayush Ankit, Kaushik Roy |
Abstract | The ‘Internet of Things’ has brought increased demand for AI-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. Quantization is a powerful tool to address the growing computational cost of such applications, and yields significant compression over full-precision networks. However, quantization can result in substantial loss of performance for complex image classification tasks. To address this, we propose a Principal Component Analysis (PCA) driven methodology to identify the important layers of a binary network, and design mixed-precision networks. The proposed Hybrid-Net achieves a more than 10% improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on CIFAR-100 and ImageNet datasets while still achieving up to 94% of the energy-efficiency of XNOR-Nets. This work furthers the feasibility of using highly compressed neural networks for energy-efficient neural computing in edge devices. |
Tasks | Autonomous Vehicles, Dimensionality Reduction, Image Classification, Quantization |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01493v2 |
https://arxiv.org/pdf/1906.01493v2.pdf | |
PWC | https://paperswithcode.com/paper/pca-driven-hybrid-network-design-for-enabling |
Repo | |
Framework | |
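The PCA-driven idea is to rank layers by how many principal components their activations need; a toy version of that per-layer computation is sketched below. The 99% variance threshold and the bit-width rule are assumptions for illustration, not the paper's exact procedure.

```python
# Toy PCA analysis of per-layer activations: count the principal components
# needed to explain 99% of the variance, then pick a bit-width per layer.
# The threshold and the bit-width rule are assumptions for illustration.
import numpy as np

def significant_dims(activations, var_threshold=0.99):
    centered = activations - activations.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    var_ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(var_ratio, var_threshold) + 1)

rng = np.random.default_rng(0)
layer_acts = {                                   # samples x channels, per layer
    "conv1": rng.normal(size=(256, 8)) @ rng.normal(size=(8, 64)),   # low effective rank
    "conv2": rng.normal(size=(256, 128)),                            # high effective rank
}
dims = {name: significant_dims(a) for name, a in layer_acts.items()}
precision = {name: (8 if dims[name] > 0.5 * layer_acts[name].shape[1] else 1)
             for name in layer_acts}             # bits per layer (illustrative rule)
print(dims, precision)
```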
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement
Title | MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement |
Authors | Szu-Wei Fu, Chien-Feng Liao, Yu Tsao, Shou-De Lin |
Abstract | Adversarial loss in a conditional generative adversarial network (GAN) is not designed to directly optimize evaluation metrics of a target task, and thus, may not always guide the generator in a GAN to generate data with improved metric scores. To overcome this issue, we propose a novel MetricGAN approach with an aim to optimize the generator with respect to one or multiple evaluation metrics. Moreover, based on MetricGAN, the metric scores of the generated data can also be arbitrarily specified by users. We tested the proposed MetricGAN on a speech enhancement task, which is particularly suitable to verify the proposed approach because there are multiple metrics measuring different aspects of speech signals. Moreover, these metrics are generally complex and could not be fully optimized by Lp or conventional adversarial losses. |
Tasks | Speech Enhancement |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04874v1 |
https://arxiv.org/pdf/1905.04874v1.pdf | |
PWC | https://paperswithcode.com/paper/metricgan-generative-adversarial-networks |
Repo | |
Framework | |
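The core mechanism described in the abstract is a surrogate that learns to mimic a black-box metric, with the generator trained to push that surrogate toward a user-specified target score. The loss sketch below follows the commonly cited MetricGAN-style squared-error form; details of the paper's losses may differ.

```python
# Loss sketch for metric-driven adversarial training: a surrogate D is trained
# to match a black-box metric Q, and the enhanced output is trained so that D
# assigns it a user-chosen target score. The squared-error form follows the
# commonly cited MetricGAN-style formulation; the paper's exact losses may differ.
import numpy as np

def discriminator_loss(D, Q, clean, enhanced):
    # D should reproduce the metric for both clean and enhanced speech.
    return (D(clean, clean) - Q(clean, clean)) ** 2 + \
           (D(enhanced, clean) - Q(enhanced, clean)) ** 2

def generator_loss(D, enhanced, clean, target_score=1.0):
    # Push the surrogate metric of the enhanced output toward target_score.
    return (D(enhanced, clean) - target_score) ** 2

# Toy stand-ins: both the "metric" and the "surrogate" score similarity to clean.
Q = lambda x, ref: float(np.exp(-np.mean((x - ref) ** 2)))
D = lambda x, ref: float(np.exp(-np.mean((x - ref) ** 2)))   # pretend-learned surrogate

rng = np.random.default_rng(0)
clean = rng.normal(size=100)
enhanced = clean + 0.3 * rng.normal(size=100)                # imperfect enhancement
print(discriminator_loss(D, Q, clean, enhanced), generator_loss(D, enhanced, clean))
```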
Distributed Learning for Channel Allocation Over a Shared Spectrum
Title | Distributed Learning for Channel Allocation Over a Shared Spectrum |
Authors | S. M. Zafaruddin, Ilai Bistritz, Amir Leshem, Dusit Niyato |
Abstract | Channel allocation is the task of assigning channels to users such that some objective (e.g., sum-rate) is maximized. In centralized networks such as cellular networks, this task is carried out by the base station, which gathers the channel state information (CSI) from the users and computes the optimal solution. In distributed networks such as ad-hoc and device-to-device (D2D) networks, no base station exists and conveying global CSI between users is costly or simply impractical. When the CSI is time varying and unknown to the users, the users face the challenge of both learning the channel statistics online and converging to a good channel allocation. This introduces a multi-armed bandit (MAB) scenario with multiple decision makers. If two or more users choose the same channel, a collision occurs and they all receive zero reward. We propose a distributed channel allocation algorithm that each user runs and which converges to the optimal allocation while achieving an order-optimal regret of $O(\log T)$. The algorithm is based on a carrier sensing multiple access (CSMA) implementation of the distributed auction algorithm. It does not require any exchange of information between users. Users need only to observe a single channel at a time and sense if there is a transmission on that channel, without decoding the transmissions or identifying the transmitting users. We demonstrate the performance of our algorithm using simulated LTE and 5G channels. |
Tasks | |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06353v2 |
http://arxiv.org/pdf/1902.06353v2.pdf | |
PWC | https://paperswithcode.com/paper/distributed-learning-for-channel-allocation |
Repo | |
Framework | |
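The collision model from the abstract (two or more users on one channel get zero reward) is simulated below with a naive random policy and an optimal static assignment as a reference point. The paper's auction/CSMA-based learning algorithm is not reproduced here.

```python
# Toy simulation of the multi-player bandit collision model described in the
# abstract: each round every user picks a channel, and colliding users get
# zero reward. The random policy is only a baseline; the paper's auction/CSMA
# based algorithm is not reproduced here.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_users, n_channels, T = 3, 5, 10000
mean_rates = rng.uniform(0.2, 1.0, size=(n_users, n_channels))   # unknown to the users

total_reward = 0.0
for _ in range(T):
    choices = rng.integers(n_channels, size=n_users)     # naive random policy
    for u, c in enumerate(choices):
        if np.sum(choices == c) == 1:                     # no collision on channel c
            total_reward += float(rng.random() < mean_rates[u, c])

# Best static collision-free assignment, as a reference for the achievable rate.
rows, cols = linear_sum_assignment(-mean_rates)
print(total_reward / T, mean_rates[rows, cols].sum())
```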
Learning a Curve Guardian for Motorcycles
Title | Learning a Curve Guardian for Motorcycles |
Authors | Simon Hecker, Alexander Liniger, Henrik Maurenbrecher, Dengxin Dai, Luc Van Gool |
Abstract | Up to 17% of all motorcycle accidents occur when the rider is maneuvering through a curve, and the main cause of curve accidents can be attributed to inappropriate speed and wrong intra-lane position of the motorcycle. Existing curve warning systems lack crucial state estimation components and do not scale well. We propose a new type of road curvature warning system for motorcycles, combining the latest advances in computer vision, optimal control and mapping technologies to alleviate these shortcomings. Our contributions are fourfold: 1) we predict the motorcycle’s intra-lane position using a convolutional neural network (CNN), 2) we predict the motorcycle roll angle using a CNN, 3) we use an upgraded controller model that incorporates road incline for a more realistic model and prediction, 4) we design a scalable system by utilizing the HERE Technologies map database to obtain the accurate road geometry of the future path. In addition, we present two datasets that are used for training and evaluating our system, respectively; both datasets will be made publicly available. We test our system on a diverse set of real world scenarios and present a detailed case study. We show that our system is able to predict more accurate and safer curve trajectories, and consequently warn and improve the safety of motorcyclists. |
Tasks | |
Published | 2019-07-12 |
URL | https://arxiv.org/abs/1907.05738v1 |
https://arxiv.org/pdf/1907.05738v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-curve-guardian-for-motorcycles |
Repo | |
Framework | |
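A curve warning ultimately rests on the standard relation between speed, curvature and lateral acceleration (a_lat = v^2 * kappa), so a comfort limit on lateral acceleration implies a maximum curve speed. The sketch below uses only that textbook relation; it is not the optimal-control model or CNN pipeline from the paper.

```python
# Physics behind a simple curve-speed warning: lateral acceleration on a curve
# is a_lat = v^2 * kappa (kappa = curvature = 1/radius), so a chosen comfort
# limit a_max implies v_max = sqrt(a_max / kappa). This standard relation is
# only an illustration, not the controller model used in the paper.
import math

def max_curve_speed(curvature, a_lat_max=3.0):
    """Maximum speed (m/s) keeping lateral acceleration below a_lat_max (m/s^2)."""
    if curvature <= 0:
        return float("inf")                    # straight road: no curve limit
    return math.sqrt(a_lat_max / curvature)

radius_m = 80.0                                # upcoming curve radius (example value)
v_max = max_curve_speed(1.0 / radius_m)
print(f"advisory speed: {v_max * 3.6:.0f} km/h")   # ~56 km/h for an 80 m curve
```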
A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems
Title | A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems |
Authors | Samira Vafay Eslahi, Nader Jafarnia Dabanloo, Keivan Maghooli |
Abstract | In electroencephalogram (EEG) signal processing, finding the appropriate information in a dataset has been a big challenge for successful signal classification. Feature selection methods make it possible to address this problem; however, it is still under investigation which selection method performs best at extracting the most informative features of the signal and thus improving classification performance. In this study, we use the genetic algorithm (GA), a heuristic search algorithm, to find the optimum combination of feature extraction methods and classifiers for brain-computer interface (BCI) applications. A BCI system is practical only if it achieves high accuracy and high speed at the same time. In the proposed method, the GA serves as a search engine to find the best combination of features and classifiers. The features used here are the Katz, Higuchi, Petrosian, Sevcik, and box-counting dimension (BCD) feature extraction methods. These features are applied to the wavelet subbands and are classified with four classifiers: the adaptive neuro-fuzzy inference system (ANFIS), fuzzy k-nearest neighbors (FKNN), support vector machine (SVM), and linear discriminant analysis (LDA). Due to the huge number of features, GA optimization is used to find the features with the optimum fitness value (FV). Results reveal that the Katz fractal feature estimation method with LDA classification has the best FV. Consequently, due to the low computation time of the first Daubechies wavelet transformation in comparison to the original signal, the final selected methods contain the fractal features of the first coefficient of the detail subbands. |
Tasks | EEG, Feature Selection |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1903.02081v1 |
http://arxiv.org/pdf/1903.02081v1.pdf | |
PWC | https://paperswithcode.com/paper/a-ga-based-feature-selection-of-the-eeg |
Repo | |
Framework | |
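The search the abstract describes is a GA over combinations of fractal features and classifiers. The skeleton below encodes a chromosome as a binary feature mask plus a classifier index; the fitness function is a stub standing in for the cross-validated accuracy that would actually be computed, so the whole thing is illustrative.

```python
# Skeleton of a GA search over (feature-subset, classifier) combinations, the
# kind of search the abstract describes. The fitness function is a stub; in
# the study it would be a cross-validated classification accuracy on wavelet-
# subband fractal features (Katz, Higuchi, Petrosian, Sevcik, BCD).
import random

FEATURES = ["katz", "higuchi", "petrosian", "sevcik", "bcd"]
CLASSIFIERS = ["anfis", "fknn", "svm", "lda"]
rng = random.Random(0)

def ensure_nonempty(mask):
    return mask if any(mask) else [True] + mask[1:]

def random_chromosome():
    mask = [rng.random() < 0.5 for _ in FEATURES]        # feature on/off bits
    return (ensure_nonempty(mask), rng.randrange(len(CLASSIFIERS)))

def fitness(chrom):
    mask, clf = chrom
    # Stub: replace with the cross-validated accuracy of CLASSIFIERS[clf]
    # trained on the selected fractal features of the EEG wavelet subbands.
    return rng.random() - 0.01 * sum(mask)

def crossover(a, b):
    cut = rng.randrange(1, len(FEATURES))
    return (ensure_nonempty(a[0][:cut] + b[0][cut:]), rng.choice([a[1], b[1]]))

def mutate(chrom, p=0.1):
    mask = [bool(bit) ^ (rng.random() < p) for bit in chrom[0]]
    clf = rng.randrange(len(CLASSIFIERS)) if rng.random() < p else chrom[1]
    return (ensure_nonempty(mask), clf)

pop = [random_chromosome() for _ in range(20)]
for _ in range(30):                                      # generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                                   # simple truncation selection
    pop = parents + [mutate(crossover(*rng.sample(parents, 2))) for _ in range(10)]

best_mask, best_clf = max(pop, key=fitness)
print([f for f, on in zip(FEATURES, best_mask) if on], CLASSIFIERS[best_clf])
```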