January 26, 2020

3247 words 16 mins read

Paper Group ANR 1446

Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders. The VIA Annotation Software for Images, Audio and Video. Referring Expression Generation Using Entity Profiles. Leveraging Pretrained Image Classifiers for Language-Based Segmentation. Differentiable Representations For Multihop Inference Rules. Ef …

Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders

Title Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders
Authors Nina Miolane, Susan Holmes
Abstract Manifold-valued data naturally arises in medical imaging. In cognitive neuroscience, for instance, brain connectomes base the analysis of coactivation patterns between different brain regions on the analysis of the correlations of their functional Magnetic Resonance Imaging (fMRI) time series - an object thus constrained by construction to belong to the manifold of symmetric positive definite matrices. One of the challenges that naturally arises consists of finding a lower-dimensional subspace for representing such manifold-valued data. Traditional techniques, like principal component analysis, are ill-adapted to tackle non-Euclidean spaces and may fail to achieve a lower-dimensional representation of the data - thus potentially pointing to the absence of a lower-dimensional representation of the data. However, these techniques are restricted in that: (i) they do not leverage the assumption that the connectomes belong to a pre-specified manifold, therefore discarding information; (ii) they can only fit a linear subspace to the data. In this paper, we are interested in methods that learn potentially highly curved submanifolds of manifold-valued data. Motivated by the brain connectomes example, we investigate a latent variable generative model, which has the added benefit of providing us with uncertainty estimates - a crucial quantity in the medical applications we are considering. While latent variable models have been proposed to learn linear and nonlinear spaces for Euclidean data, or geodesic subspaces for manifold data, no intrinsic latent variable model exists to learn nongeodesic subspaces for manifold data. This paper fills this gap and formulates a Riemannian variational autoencoder with an intrinsic generative model of manifold-valued data. We evaluate its performance on synthetic and real datasets by introducing the formalism of weighted Riemannian submanifolds.
Tasks Latent Variable Models, Time Series
Published 2019-11-19
URL https://arxiv.org/abs/1911.08147v1
PDF https://arxiv.org/pdf/1911.08147v1.pdf
PWC https://paperswithcode.com/paper/learning-weighted-submanifolds-with
Repo
Framework
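
As an illustration of the intrinsic generative modelling idea above, the sketch below builds a decoder whose outputs always lie on the manifold of symmetric positive definite (SPD) matrices by exponentiating a symmetric matrix. This is only a minimal sketch of the general idea, not the authors' Riemannian VAE; all layer sizes and names are hypothetical.

```python
# Illustrative sketch only: a decoder that outputs SPD matrices by
# exponentiating a symmetric matrix, so samples stay on the manifold.
# This is NOT the authors' exact Riemannian VAE, just the general idea.
import torch
import torch.nn as nn

class SPDDecoder(nn.Module):
    def __init__(self, latent_dim=2, n=10):
        super().__init__()
        self.n = n
        # map a latent code to the n*(n+1)/2 free entries of a symmetric matrix
        self.fc = nn.Linear(latent_dim, n * (n + 1) // 2)

    def forward(self, z):
        batch = z.shape[0]
        vals = self.fc(z)
        sym = torch.zeros(batch, self.n, self.n, device=z.device)
        iu = torch.triu_indices(self.n, self.n)
        sym[:, iu[0], iu[1]] = vals
        sym = 0.5 * (sym + sym.transpose(1, 2))   # symmetrize
        return torch.matrix_exp(sym)              # exp of symmetric -> SPD

decoder = SPDDecoder()
z = torch.randn(4, 2)        # latent samples
spd = decoder(z)             # (4, 10, 10) symmetric positive definite matrices
```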

The VIA Annotation Software for Images, Audio and Video

Title The VIA Annotation Software for Images, Audio and Video
Authors Abhishek Dutta, Andrew Zisserman
Abstract In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a lightweight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10699v3
PDF https://arxiv.org/pdf/1904.10699v3.pdf
PWC https://paperswithcode.com/paper/the-vgg-image-annotator-via
Repo
Framework
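
A minimal sketch of consuming a VIA-style JSON export in Python. The exact schema depends on the VIA version and project type, so the field names used below ("regions", "shape_attributes", "region_attributes") and the file name are assumptions rather than a guaranteed format.

```python
# Sketch of reading a VIA-style JSON export; the exact schema varies by
# VIA version, so the field names below are assumptions, not a guaranteed
# format. Each entry is expected to describe one image and its regions.
import json

with open("via_project_export.json") as f:       # hypothetical file name
    project = json.load(f)

for key, entry in project.items():
    if not isinstance(entry, dict) or "regions" not in entry:
        continue                                  # skip settings/metadata keys
    filename = entry.get("filename", key)
    for region in entry.get("regions", []):
        shape = region.get("shape_attributes", {})    # e.g. rect, polygon
        labels = region.get("region_attributes", {})  # annotator-defined labels
        print(filename, shape.get("name"), labels)
```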

Referring Expression Generation Using Entity Profiles

Title Referring Expression Generation Using Entity Profiles
Authors Meng Cao, Jackie Chi Kit Cheung
Abstract Referring Expression Generation (REG) is the task of generating contextually appropriate references to entities. A limitation of existing REG systems is that they rely on entity-specific supervised training, which means that they cannot handle entities not seen during training. In this study, we address this in two ways. First, we propose task setups in which we specifically test a REG system’s ability to generalize to entities not seen during training. Second, we propose a profile-based deep neural network model, ProfileREG, which encodes both the local context and an external profile of the entity to generate reference realizations. Our model generates tokens by learning to choose between generating pronouns, generating from a fixed vocabulary, or copying a word from the profile. We evaluate our model on three different splits of the WebNLG dataset, and show that it outperforms competitive baselines in all settings according to automatic and human evaluations.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.01528v1
PDF https://arxiv.org/pdf/1909.01528v1.pdf
PWC https://paperswithcode.com/paper/referring-expression-generation-using-entity
Repo
Framework
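
The abstract describes a decoder that chooses between generating a pronoun, generating from a fixed vocabulary, or copying from the entity profile. The sketch below shows one way such a three-way gated mixture over output distributions can be wired up; it is a hedged illustration with made-up dimensions, not the ProfileREG implementation.

```python
# Illustrative three-way gated mixture over (pronoun, vocab, copy)
# distributions, in the spirit of the ProfileREG decoder; not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThreeWayGate(nn.Module):
    def __init__(self, hidden=256, vocab=10000, n_pronouns=8):
        super().__init__()
        self.gate = nn.Linear(hidden, 3)             # pronoun / vocab / copy
        self.pronoun_head = nn.Linear(hidden, n_pronouns)
        self.vocab_head = nn.Linear(hidden, vocab)

    def forward(self, h, copy_attn, profile_to_vocab):
        # h: (B, hidden) decoder state; copy_attn: (B, Lp) attention over
        # profile tokens; profile_to_vocab: (B, Lp, vocab) one-hot scatter map.
        g = F.softmax(self.gate(h), dim=-1)                      # (B, 3)
        p_vocab = F.softmax(self.vocab_head(h), dim=-1)          # (B, V)
        p_copy = torch.bmm(copy_attn.unsqueeze(1),
                           profile_to_vocab).squeeze(1)          # (B, V)
        p_pronoun = F.softmax(self.pronoun_head(h), dim=-1)      # (B, P)
        # assume (for illustration) that pronouns occupy the first P
        # vocabulary slots, so the mixture lives in one vocab-sized space
        p_pron_full = torch.zeros_like(p_vocab)
        p_pron_full[:, :p_pronoun.shape[1]] = p_pronoun
        return (g[:, 0:1] * p_pron_full
                + g[:, 1:2] * p_vocab
                + g[:, 2:3] * p_copy)
```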

Leveraging Pretrained Image Classifiers for Language-Based Segmentation

Title Leveraging Pretrained Image Classifiers for Language-Based Segmentation
Authors David Golub, Ahmed El-Kishky, Roberto Martín-Martín
Abstract Current semantic segmentation models cannot easily generalize to new object classes unseen during train time: they require additional annotated images and retraining. We propose a novel segmentation model that injects visual priors into semantic segmentation architectures, allowing them to segment out new target labels without retraining. As visual priors, we use the activations of pretrained image classifiers, which provide noisy indications of the spatial location of both the target object and distractor objects in the scene. We leverage language semantics to obtain these activations for a target label unseen by the classifier. Further experiments show that the visual priors obtained via language semantics for both relevant and distracting objects are key to our performance.
Tasks Semantic Segmentation
Published 2019-11-03
URL https://arxiv.org/abs/1911.00830v3
PDF https://arxiv.org/pdf/1911.00830v3.pdf
PWC https://paperswithcode.com/paper/leveraging-pretrained-image-classifiers-for
Repo
Framework
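
The abstract does not spell out how the classifier activations are turned into spatial priors; one standard way to obtain such a prior from a pretrained classifier is class activation mapping (CAM), sketched below with a torchvision ResNet-18 as a stand-in.

```python
# One standard way to turn a pretrained classifier's activations into a
# spatial prior: class activation mapping (CAM) with a ResNet-18.
# The paper's exact prior-extraction procedure may differ; this is a sketch.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(pretrained=True).eval()
features = {}
model.layer4.register_forward_hook(
    lambda m, i, o: features.update(feat=o))       # (1, 512, H/32, W/32)

def class_activation_map(image, class_idx):
    with torch.no_grad():
        model(image)                               # fills features["feat"]
    fmap = features["feat"]                        # (1, 512, h, w)
    weights = model.fc.weight[class_idx]           # (512,) classifier weights
    cam = torch.einsum("c,bchw->bhw", weights, fmap)
    cam = F.relu(cam)
    return cam / (cam.max() + 1e-8)                # normalized spatial prior

prior = class_activation_map(torch.randn(1, 3, 224, 224), class_idx=281)
```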

Differentiable Representations For Multihop Inference Rules

Title Differentiable Representations For Multihop Inference Rules
Authors William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler
Abstract We present efficient differentiable implementations of second-order multi-hop reasoning using a large symbolic knowledge base (KB). We introduce a new operation which can be used to compositionally construct second-order multi-hop templates in a neural model, and evaluate a number of alternative implementations, with different time and memory trade-offs. These techniques scale to KBs with millions of entities and tens of millions of triples, and lead to simple models with competitive performance on several learning tasks requiring multi-hop reasoning.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.10417v1
PDF https://arxiv.org/pdf/1905.10417v1.pdf
PWC https://paperswithcode.com/paper/differentiable-representations-for-multihop
Repo
Framework
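
The basic differentiable primitive behind this line of work — following a relation from a weighted set of entities — can be written as a sparse matrix product, and multi-hop rules are compositions of such products. The sketch below illustrates that primitive only; it is not the paper's second-order template construction.

```python
# Minimal sketch of differentiable "relation following": a weighted set of
# entities (a vector over entities) multiplied by a sparse relation matrix
# yields the entities reachable in one hop; composing such products gives
# multi-hop inference. Illustrates the general primitive only.
import numpy as np
from scipy.sparse import csr_matrix

n_entities = 5
# relation "parent_of": edges 0->1, 1->2, 3->4
rows, cols = [0, 1, 3], [1, 2, 4]
parent_of = csr_matrix((np.ones(3), (rows, cols)),
                       shape=(n_entities, n_entities))

def follow(x, relation):
    """Follow `relation` from the weighted entity set `x` (one hop)."""
    return relation.T.dot(x)

x = np.zeros(n_entities)
x[0] = 1.0                           # start from entity 0 with weight 1
one_hop = follow(x, parent_of)       # parents of entity 0
two_hop = follow(one_hop, parent_of) # "grandparent_of" via composition
print(one_hop, two_hop)
```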

Efficient average-case population recovery in the presence of insertions and deletions

Title Efficient average-case population recovery in the presence of insertions and deletions
Authors Frank Ban, Xi Chen, Rocco A. Servedio, Sandip Sinha
Abstract Several recent works have considered the \emph{trace reconstruction problem}, in which an unknown source string $x\in\{0,1\}^n$ is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a \emph{trace} of $x$. The goal is to reconstruct the original string $x$ from independent traces of $x$. While the best algorithms known for worst-case strings use $\exp(O(n^{1/3}))$ traces \cite{DOS17,NazarovPeres17}, highly efficient algorithms are known \cite{PZ17,HPP18} for the \emph{average-case} version, in which $x$ is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call \emph{average-case population recovery in the presence of insertions and deletions}. In this problem, there is an unknown distribution $\mathcal{D}$ over $s$ unknown source strings $x^1,\dots,x^s \in \{0,1\}^n$, and each sample is independently generated by drawing some $x^i$ from $\mathcal{D}$ and returning an independent trace of $x^i$. Building on \cite{PZ17} and \cite{HPP18}, we give an efficient algorithm for this problem. For any support size $s \leq \exp(\Theta(n^{1/3}))$, for a $1-o(1)$ fraction of all $s$-element support sets $\{x^1,\dots,x^s\} \subset \{0,1\}^n$, for every distribution $\mathcal{D}$ supported on $\{x^1,\dots,x^s\}$, our algorithm efficiently recovers $\mathcal{D}$ up to total variation distance $\epsilon$ with high probability, given access to independent traces of independent draws from $\mathcal{D}$. The algorithm runs in time $\mathrm{poly}(n,s,1/\epsilon)$ and its sample complexity is $\mathrm{poly}(s,1/\epsilon,\exp(\log^{1/3}n))$. This polynomial dependence on the support size $s$ is in sharp contrast with the \emph{worst-case} version (when $x^1,\dots,x^s$ may be any strings in $\{0,1\}^n$), in which the sample complexity of the most efficient known algorithm \cite{BCFSS19} is doubly exponential in $s$.
Tasks
Published 2019-07-12
URL https://arxiv.org/abs/1907.05964v1
PDF https://arxiv.org/pdf/1907.05964v1.pdf
PWC https://paperswithcode.com/paper/efficient-average-case-population-recovery-in
Repo
Framework
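
The probabilistic channel described in the abstract is easy to simulate, which is useful for generating traces to experiment with. The sketch below deletes each bit with probability p_del and inserts uniformly random bits with probability p_ins before each position; the parameter values are arbitrary illustrations.

```python
# Sketch of the insertion/deletion channel from the abstract: each bit of
# the source string is independently deleted with probability p_del, and
# random bits are inserted (a geometric number of them) with probability
# p_ins before each position. Parameter values are arbitrary illustrations.
import random

def trace(x, p_del=0.1, p_ins=0.1):
    out = []
    for bit in x:
        while random.random() < p_ins:     # geometric number of insertions
            out.append(random.randint(0, 1))
        if random.random() >= p_del:       # keep the bit unless deleted
            out.append(bit)
    return out

source = [random.randint(0, 1) for _ in range(20)]   # uniformly random x
samples = [trace(source) for _ in range(5)]          # independent traces
```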

Generalization of Reinforcement Learners with Working and Episodic Memory

Title Generalization of Reinforcement Learners with Working and Episodic Memory
Authors Meire Fortunato, Melissa Tan, Ryan Faulkner, Steven Hansen, Adrià Puigdomènech Badia, Gavin Buttimore, Charlie Deck, Joel Z Leibo, Charles Blundell
Abstract Memory is an important aspect of intelligence and plays a role in many deep reinforcement learning models. However, little progress has been made in understanding when specific memory systems help more than others and how well they generalize. The field also has yet to see a prevalent consistent and rigorous approach for evaluating agent performance on holdout data. In this paper, we aim to develop a comprehensive methodology to test different kinds of memory in an agent and assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions that we suggest are relevant for evaluating memory-specific generalization. To that end, we first construct a diverse set of memory tasks that allow us to evaluate test-time generalization across multiple dimensions. Second, we develop and perform multiple ablations on an agent architecture that combines multiple memory systems, observe its baseline models, and investigate its performance against the task suite.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13406v2
PDF https://arxiv.org/pdf/1910.13406v2.pdf
PWC https://paperswithcode.com/paper/191013406
Repo
Framework

Comprehensive Personalized Ranking Using One-Bit Comparison Data

Title Comprehensive Personalized Ranking Using One-Bit Comparison Data
Authors Aria Ameri, Arindam Bose, Mojtaba Soltanalian
Abstract The task of a personalization system is to recommend items or a set of items according to the users’ taste, and thus predict their future needs. In this paper, we address such personalized recommendation problems for which one-bit comparison data of user preferences for different items, as well as the different user inclinations toward an item, are available. We devise a comprehensive personalized ranking (CPR) system by employing a Bayesian treatment. We also connect the CPR optimization criterion to a learning method that recovers the underlying low-rank structure of the rating matrix based on the well-established matrix factorization method. Numerical results are provided to verify the performance of our algorithm.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.02408v1
PDF https://arxiv.org/pdf/1906.02408v1.pdf
PWC https://paperswithcode.com/paper/comprehensive-personalized-ranking-using-one
Repo
Framework
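
The abstract connects the CPR criterion to learning a low-rank factorization of the rating matrix from one-bit comparison data. The sketch below shows the general shape of such learning with a pairwise logistic (BPR-style) objective; it is an assumption-laden stand-in, not the paper's exact CPR criterion.

```python
# Hedged sketch: learning a low-rank user/item factorization from one-bit
# comparisons "user u prefers item i over item j", using a pairwise
# logistic (BPR-style) loss. Illustrates the general idea only, not the
# paper's exact CPR criterion.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank, lr, reg = 50, 100, 8, 0.05, 0.01
U = 0.1 * rng.standard_normal((n_users, rank))
V = 0.1 * rng.standard_normal((n_items, rank))

def sgd_step(u, i, j):
    """One observed comparison: user u prefers item i over item j."""
    diff = U[u] @ (V[i] - V[j])
    g = 1.0 / (1.0 + np.exp(diff))           # = sigmoid(-diff), loss gradient scale
    wu = U[u].copy()
    U[u] += lr * (g * (V[i] - V[j]) - reg * U[u])
    V[i] += lr * (g * wu - reg * V[i])
    V[j] += lr * (-g * wu - reg * V[j])

# toy stream of one-bit comparison observations
for _ in range(10000):
    u = rng.integers(n_users)
    i, j = rng.integers(n_items, size=2)
    sgd_step(u, i, j)
```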

My lips are concealed: Audio-visual speech enhancement through obstructions

Title My lips are concealed: Audio-visual speech enhancement through obstructions
Authors Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman
Abstract Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech enhancement network that is able to separate a speaker’s voice by conditioning on both the speaker’s lip movements and/or a representation of their voice. The voice representation can be obtained by either (i) enrollment, or (ii) by self-enrollment – learning the representation on-the-fly given sufficient unobstructed visual input. The model is trained by blending audios, and by introducing artificial occlusions around the mouth region that prevent the visual modality from dominating. The method is speaker-independent, and we demonstrate it on real examples of speakers unheard (and unseen) during training. The method also improves over previous models in particular for cases of occlusion in the visual modality.
Tasks Speech Enhancement
Published 2019-07-11
URL https://arxiv.org/abs/1907.04975v1
PDF https://arxiv.org/pdf/1907.04975v1.pdf
PWC https://paperswithcode.com/paper/my-lips-are-concealed-audio-visual-speech
Repo
Framework
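
A hedged sketch of the conditioning idea described above: a mask-predicting enhancement network that takes the noisy spectrogram together with a lip-movement embedding stream and a speaker (voice) embedding. The fusion by concatenation and all layer sizes are illustrative choices, not the authors' architecture.

```python
# Hedged sketch: predict a spectrogram mask for the target speaker from
# the noisy spectrogram, a per-frame lip-movement embedding, and a fixed
# speaker (voice) embedding. Sizes and the concatenation fusion are
# illustrative, not the authors' network.
import torch
import torch.nn as nn

class ConditionedEnhancer(nn.Module):
    def __init__(self, n_freq=257, lip_dim=512, spk_dim=256, hidden=400):
        super().__init__()
        self.rnn = nn.LSTM(n_freq + lip_dim + spk_dim, hidden, batch_first=True)
        self.mask = nn.Linear(hidden, n_freq)

    def forward(self, spec, lip_emb, spk_emb):
        # spec: (B, T, n_freq); lip_emb: (B, T, lip_dim); spk_emb: (B, spk_dim)
        spk = spk_emb.unsqueeze(1).expand(-1, spec.shape[1], -1)
        h, _ = self.rnn(torch.cat([spec, lip_emb, spk], dim=-1))
        return torch.sigmoid(self.mask(h)) * spec   # masked magnitude spectrogram

model = ConditionedEnhancer()
out = model(torch.rand(2, 100, 257), torch.rand(2, 100, 512), torch.rand(2, 256))
```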

User Validation of Recommendation Serendipity Metrics

Title User Validation of Recommendation Serendipity Metrics
Authors Li Chen, Ningxia Wang, Yonghua Yang, Keping Yang, Quan Yuan
Abstract Though it has been recognized that recommending serendipitous (i.e., surprising and relevant) items can be helpful for increasing users’ satisfaction and behavioral intention, how to measure serendipity in the offline environment is still an open issue. In recent years, a number of metrics have been proposed, but most of them were based on researchers’ assumptions due to serendipity’s subjective nature. In order to validate these metrics’ actual performance, we collected over 10,000 users’ real feedback data and compared it with the metrics’ results. It turns out that the user-profile-based metrics, especially content-based ones, perform better than those based on item popularity in terms of estimating the unexpectedness facet of recommendations. Moreover, the full metrics, which involve the unexpectedness component, relevance, timeliness, and user curiosity, can more accurately indicate a recommendation’s serendipity degree, relative to those that involve only some of these components. The application of these metrics to several recommender algorithms further consolidates their practical usage, because the comparison results are consistent with those from the user evaluation. Thus, this work is constructive for filling the gap between offline measurement and user study of recommendation serendipity.
Tasks
Published 2019-06-27
URL https://arxiv.org/abs/1906.11431v1
PDF https://arxiv.org/pdf/1906.11431v1.pdf
PWC https://paperswithcode.com/paper/user-validation-of-recommendation-serendipity
Repo
Framework
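
The metrics evaluated in the paper combine components such as unexpectedness and relevance; their exact formulas are given in the paper. As an illustration of the common shape of such metrics, the sketch below computes a content-based unexpectedness score and multiplies it by relevance.

```python
# Illustrative sketch of a content-based serendipity-style metric:
# unexpectedness as dissimilarity from the user's profile items, combined
# multiplicatively with relevance. The paper evaluates several metrics
# whose exact formulas differ; this shows only the common shape.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def serendipity(item_vec, profile_vecs, relevance):
    """profile_vecs: content vectors of items in the user's history."""
    unexpectedness = 1.0 - max(cosine(item_vec, p) for p in profile_vecs)
    return unexpectedness * relevance        # both components required

rng = np.random.default_rng(1)
profile = [rng.random(16) for _ in range(5)]
print(serendipity(rng.random(16), profile, relevance=0.9))
```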

Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence

Title Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence
Authors Indranil Chakraborty, Deboleena Roy, Isha Garg, Aayush Ankit, Kaushik Roy
Abstract The ‘Internet of Things’ has brought increased demand for AI-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. Quantization is a powerful tool to address the growing computational cost of such applications, and yields significant compression over full-precision networks. However, quantization can result in substantial loss of performance for complex image classification tasks. To address this, we propose a Principal Component Analysis (PCA) driven methodology to identify the important layers of a binary network, and design mixed-precision networks. The proposed Hybrid-Net achieves a more than 10% improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on CIFAR-100 and ImageNet datasets while still achieving up to 94% of the energy-efficiency of XNOR-Nets. This work furthers the feasibility of using highly compressed neural networks for energy-efficient neural computing in edge devices.
Tasks Autonomous Vehicles, Dimensionality Reduction, Image Classification, Quantization
Published 2019-06-04
URL https://arxiv.org/abs/1906.01493v2
PDF https://arxiv.org/pdf/1906.01493v2.pdf
PWC https://paperswithcode.com/paper/pca-driven-hybrid-network-design-for-enabling
Repo
Framework
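
One way to realize the PCA-driven layer-importance idea described above is to count how many principal components of a layer's activations are needed to explain most of their variance, and keep layers that need many components at higher precision. The sketch below shows that signal; the 99% threshold and the toy precision rule are assumptions, not the paper's exact recipe.

```python
# Sketch of a PCA-driven layer-significance signal: flatten a layer's
# activations over a batch, run PCA, and count the components needed to
# explain 99% of the variance. Layers needing many components are treated
# as "important" and kept at higher precision. The threshold and the
# precision-assignment rule below are illustrative.
import numpy as np
from sklearn.decomposition import PCA

def significant_components(activations, var_threshold=0.99):
    # activations: (n_samples, n_features) matrix of one layer's outputs
    pca = PCA().fit(activations)
    cum = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cum, var_threshold) + 1)

layer_acts = np.random.randn(512, 256)                   # stand-in activations
k = significant_components(layer_acts)
precision = 8 if k > 0.5 * layer_acts.shape[1] else 1    # toy assignment rule
```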

MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement

Title MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement
Authors Szu-Wei Fu, Chien-Feng Liao, Yu Tsao, Shou-De Lin
Abstract Adversarial loss in a conditional generative adversarial network (GAN) is not designed to directly optimize evaluation metrics of a target task, and thus may not always guide the generator in a GAN to generate data with improved metric scores. To overcome this issue, we propose a novel MetricGAN approach that aims to optimize the generator with respect to one or multiple evaluation metrics. Moreover, based on MetricGAN, the metric scores of the generated data can also be arbitrarily specified by users. We tested the proposed MetricGAN on a speech enhancement task, which is particularly suitable for verifying the proposed approach because there are multiple metrics measuring different aspects of speech signals. Moreover, these metrics are generally complex and cannot be fully optimized by $L_p$ or conventional adversarial losses.
Tasks Speech Enhancement
Published 2019-05-13
URL https://arxiv.org/abs/1905.04874v1
PDF https://arxiv.org/pdf/1905.04874v1.pdf
PWC https://paperswithcode.com/paper/metricgan-generative-adversarial-networks
Repo
Framework
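
A hedged sketch of the MetricGAN training signal: the discriminator is trained to regress the black-box metric score of enhanced speech, and the generator is trained to push the discriminator's prediction toward a target score. Network shapes and the compute_metric callback are placeholders, not the authors' code.

```python
# Hedged sketch of the MetricGAN training signal: D regresses the
# black-box metric score of an enhanced utterance; G is trained so that
# D's prediction reaches a target score. Architectures and the
# compute_metric callback are placeholders, not the authors' code.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(257, 257), nn.Sigmoid())      # toy mask generator
D = nn.Sequential(nn.Linear(257, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def train_step(noisy, compute_metric, target_score=1.0):
    # --- discriminator: match the true metric score of the enhanced output
    enhanced = (G(noisy) * noisy).detach()
    score = compute_metric(enhanced)                      # black-box metric (float)
    d_loss = (D(enhanced).mean() - score) ** 2
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # --- generator: push D's predicted score toward the target value
    enhanced = G(noisy) * noisy
    g_loss = (D(enhanced).mean() - target_score) ** 2
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```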

Distributed Learning for Channel Allocation Over a Shared Spectrum

Title Distributed Learning for Channel Allocation Over a Shared Spectrum
Authors S. M. Zafaruddin, Ilai Bistritz, Amir Leshem, Dusit Niyato
Abstract Channel allocation is the task of assigning channels to users such that some objective (e.g., sum-rate) is maximized. In centralized networks such as cellular networks, this task is carried out by the base station, which gathers the channel state information (CSI) from the users and computes the optimal solution. In distributed networks such as ad-hoc and device-to-device (D2D) networks, no base station exists and conveying global CSI between users is costly or simply impractical. When the CSI is time-varying and unknown to the users, the users face the challenge of both learning the channel statistics online and converging to a good channel allocation. This introduces a multi-armed bandit (MAB) scenario with multiple decision makers. If two or more users choose the same channel, a collision occurs and they all receive zero reward. We propose a distributed channel allocation algorithm that each user runs and that converges to the optimal allocation while achieving an order-optimal regret of $O(\log T)$. The algorithm is based on a carrier sensing multiple access (CSMA) implementation of the distributed auction algorithm. It does not require any exchange of information between users. Users only need to observe a single channel at a time and sense whether there is a transmission on that channel, without decoding the transmissions or identifying the transmitting users. We demonstrate the performance of our algorithm using simulated LTE and 5G channels.
Tasks
Published 2019-02-17
URL http://arxiv.org/abs/1902.06353v2
PDF http://arxiv.org/pdf/1902.06353v2.pdf
PWC https://paperswithcode.com/paper/distributed-learning-for-channel-allocation
Repo
Framework
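
The multi-player bandit setting in the abstract (a collision yields zero reward for everyone involved) is straightforward to simulate; the sketch below implements only this environment, not the authors' CSMA-based distributed auction algorithm.

```python
# Sketch of the multi-player bandit environment from the abstract: each
# user picks a channel; if two or more users pick the same channel they
# collide and all receive zero reward, otherwise the user gets a Bernoulli
# reward with that channel's (unknown) mean. The allocation algorithm
# itself is not reproduced here.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
n_users, n_channels = 4, 6
channel_means = rng.uniform(0.2, 0.9, size=(n_users, n_channels))

def play_round(choices):
    """choices[u] = channel chosen by user u; returns per-user rewards."""
    counts = Counter(choices)
    return [0.0 if counts[c] > 1                       # collision -> zero reward
            else float(rng.random() < channel_means[u, c])
            for u, c in enumerate(choices)]

rewards = play_round(rng.integers(n_channels, size=n_users).tolist())
```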

Learning a Curve Guardian for Motorcycles

Title Learning a Curve Guardian for Motorcycles
Authors Simon Hecker, Alexander Liniger, Henrik Maurenbrecher, Dengxin Dai, Luc Van Gool
Abstract Up to 17% of all motorcycle accidents occur when the rider is maneuvering through a curve, and the main cause of curve accidents can be attributed to inappropriate speed and wrong intra-lane position of the motorcycle. Existing curve warning systems lack crucial state estimation components and do not scale well. We propose a new type of road curvature warning system for motorcycles, combining the latest advances in computer vision, optimal control and mapping technologies to alleviate these shortcomings. Our contributions are fourfold: 1) we predict the motorcycle’s intra-lane position using a convolutional neural network (CNN), 2) we predict the motorcycle roll angle using a CNN, 3) we use an upgraded controller model that incorporates road incline for a more realistic model and prediction, 4) we design a scalable system by utilizing the HERE Technologies map database to obtain the accurate road geometry of the future path. In addition, we present two datasets that are used for training and evaluating our system, respectively; both datasets will be made publicly available. We test our system on a diverse set of real-world scenarios and present a detailed case study. We show that our system is able to predict more accurate and safer curve trajectories, and consequently warn and improve the safety of motorcyclists.
Tasks
Published 2019-07-12
URL https://arxiv.org/abs/1907.05738v1
PDF https://arxiv.org/pdf/1907.05738v1.pdf
PWC https://paperswithcode.com/paper/learning-a-curve-guardian-for-motorcycles
Repo
Framework

A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems

Title A GA-based feature selection of the EEG signals by classification evaluation: Application in BCI systems
Authors Samira Vafay Eslahi, Nader Jafarnia Dabanloo, Keivan Maghooli
Abstract In electroencephalogram (EEG) signal processing, finding the appropriate information in a dataset has been a big challenge for successful signal classification. Feature selection methods make it possible to solve this problem; however, which selection method best extracts the most informative features of the signal and thus improves classification performance is still under investigation. In this study, we use the genetic algorithm (GA), a heuristic search algorithm, to find the optimum combination of feature extraction methods and classifiers for brain-computer interface (BCI) applications. A BCI system can be practical only if it achieves both high accuracy and high speed. In the proposed method, GA acts as a search engine to find the best combination of features and classifiers. The features used here are the Katz, Higuchi, Petrosian, Sevcik, and box-counting dimension (BCD) feature extraction methods. These features are applied to the wavelet subbands and are classified with four classifiers: adaptive neuro-fuzzy inference system (ANFIS), fuzzy k-nearest neighbors (FKNN), support vector machine (SVM), and linear discriminant analysis (LDA). Due to the huge number of features, GA optimization is used to find the features with the optimum fitness value (FV). Results reveal that the Katz fractal feature estimation method with LDA classification has the best FV. Consequently, due to the low computation time of the first Daubechies wavelet transformation in comparison to the original signal, the final selected methods contain the fractal features of the first coefficient of the detail subbands.
Tasks EEG, Feature Selection
Published 2019-01-16
URL http://arxiv.org/abs/1903.02081v1
PDF http://arxiv.org/pdf/1903.02081v1.pdf
PWC https://paperswithcode.com/paper/a-ga-based-feature-selection-of-the-eeg
Repo
Framework
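
A minimal sketch of the GA-driven search described above: chromosomes encode a (feature extractor, classifier) pair and fitness is cross-validated accuracy. The fractal feature extractors and wavelet subbands of the real study are replaced by simple stand-in features on toy data, so this only illustrates the search loop.

```python
# Hedged sketch of a GA searching over (feature-extractor, classifier)
# combinations with cross-validated accuracy as the fitness value. The
# fractal features (Katz, Higuchi, ...) computed on wavelet subbands in
# the real study are replaced here by simple stand-in features on toy data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
signals = rng.standard_normal((80, 128))          # toy EEG epochs
labels = rng.integers(0, 2, size=80)

features = {                                      # stand-in feature extractors
    "line_length": lambda s: [np.abs(np.diff(s)).sum()],
    "std":         lambda s: [s.std()],
    "range":       lambda s: [s.max() - s.min()],
}
classifiers = {"svm": SVC(), "lda": LinearDiscriminantAnalysis(),
               "knn": KNeighborsClassifier()}

def fitness(chrom):                               # chrom = (feature, classifier)
    feat_name, clf_name = chrom
    X = np.array([features[feat_name](s) for s in signals])
    return cross_val_score(classifiers[clf_name], X, labels, cv=5).mean()

# tiny GA: keep the best chromosomes, create children by mutating one gene
pop = [(rng.choice(list(features)), rng.choice(list(classifiers)))
       for _ in range(6)]
for _ in range(10):
    parents = sorted(pop, key=fitness, reverse=True)[:3]
    pop = parents + [
        (rng.choice(list(features)), p[1]) if rng.random() < 0.5
        else (p[0], rng.choice(list(classifiers)))
        for p in parents]
best = max(pop, key=fitness)
```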