January 28, 2020


Paper Group ANR 966

Bayesian Zero-Shot Learning. LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation. Zero-shot Learning for Audio-based Music Classification and Tagging. Machine Learning Cryptanalysis of a Quantum Random Number Generator. Sub-frame Appearance and 6D Pose Estimation of Fast Moving Objects. Single Clas …

Bayesian Zero-Shot Learning


Title	Bayesian Zero-Shot Learning
Authors	Sarkhan Badirli, Zeynep Akata, Murat Dundar
Abstract	Object classes that surround us have a natural tendency to emerge at varying levels of abstraction. We propose a Bayesian approach to zero-shot learning (ZSL) that introduces the notion of meta-classes and implements a Bayesian hierarchy around these classes to effectively blend data likelihood with local and global priors. Local priors driven by data from seen classes, i.e. classes that are available at training time, become instrumental in recovering unseen classes, i.e. classes that are missing at training time, in a generalized ZSL setting. Hyperparameters of the Bayesian model offer a convenient way to optimize the trade-off between seen and unseen class accuracy in addition to guiding other aspects of model fitting. We conduct experiments on seven benchmark datasets including the large scale ImageNet and show that our model improves the current state of the art in the challenging generalized ZSL setting.
Tasks	Zero-Shot Learning
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09624v2
PDF	https://arxiv.org/pdf/1907.09624v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-zero-shot-learning
Repo
Framework

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation


Title	LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation
Authors	Keunhong Park, Arsalan Mousavian, Yu Xiang, Dieter Fox
Abstract	Current 6D object pose estimation methods usually require a 3D model for each object. These methods also require additional training in order to incorporate new objects. As a result, they are difficult to scale to a large number of objects and cannot be directly applied to unseen objects. In this work, we propose a novel framework for 6D pose estimation of unseen objects. We design an end-to-end neural network that reconstructs a latent 3D representation of an object using a small number of reference views of the object. Using the learned 3D representation, the network is able to render the object from arbitrary views. Using this neural renderer, we directly optimize for pose given an input image. By training our network with a large number of 3D shapes for reconstruction and rendering, our network generalizes well to unseen objects. We present a new dataset for unseen object pose estimation, MOPED. We evaluate the performance of our method for unseen object pose estimation on MOPED as well as the ModelNet dataset.
Tasks	6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published	2019-12-01
URL	https://arxiv.org/abs/1912.00416v2
PDF	https://arxiv.org/pdf/1912.00416v2.pdf
PWC	https://paperswithcode.com/paper/latentfusion-end-to-end-differentiable
Repo
Framework

Zero-shot Learning for Audio-based Music Classification and Tagging


Title	Zero-shot Learning for Audio-based Music Classification and Tagging
Authors	Jeong Choi, Jongpil Lee, Jiyoung Park, Juhan Nam
Abstract	Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels. This intrinsically cannot handle unseen labels such as newly added music genres or semantic words that users arbitrarily choose for music retrieval. Zero-shot learning can address this problem by leveraging an additional semantic space of labels where side information about the labels is used to unveil the relationship between each other. In this work, we investigate zero-shot learning in the music domain and organize two different setups of side information. One uses human-labeled attribute information based on the Free Music Archive and OpenMIC-2018 datasets. The other uses general word semantic information based on the Million Song Dataset and Last.fm tag annotations. Considering that a music track is usually multi-labeled in music classification and tagging datasets, we also propose a data split scheme and associated evaluation settings for multi-label zero-shot learning. Finally, we report experimental results and discuss the effectiveness and new possibilities of zero-shot learning in the music domain.
Tasks	Music Classification, Zero-Shot Learning
Published	2019-07-05
URL	https://arxiv.org/abs/1907.02670v2
PDF	https://arxiv.org/pdf/1907.02670v2.pdf
PWC	https://paperswithcode.com/paper/zero-shot-learning-for-audio-based-music
Repo
Framework

Machine Learning Cryptanalysis of a Quantum Random Number Generator


Title	Machine Learning Cryptanalysis of a Quantum Random Number Generator
Authors	Nhan Duy Truong, Jing Yan Haw, Syed Muhamad Assad, Ping Koy Lam, Omid Kavehei
Abstract	Random number generators (RNGs) that are crucial for cryptographic applications have been the subject of adversarial attacks. These attacks exploit environmental information to predict generated random numbers that are supposed to be truly random and unpredictable. Though quantum random number generators (QRNGs) are based on the intrinsically indeterministic nature of quantum properties, the presence of classical noise in the measurement process compromises the integrity of a QRNG. In this paper, we develop a predictive machine learning (ML) analysis to investigate the impact of deterministic classical noise in different stages of an optical continuous variable QRNG. Our ML model successfully detects inherent correlations when the deterministic noise sources are prominent. After appropriate filtering and randomness extraction processes are introduced, our QRNG system, in turn, demonstrates its robustness against ML. We further demonstrate the robustness of our ML approach by applying it to uniformly distributed random numbers from the QRNG and a congruential RNG. Hence, our result shows that ML has potential for benchmarking the quality of RNG devices.
Tasks	Cryptanalysis
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02342v2
PDF	https://arxiv.org/pdf/1905.02342v2.pdf
PWC	https://paperswithcode.com/paper/machine-learning-cryptanalysis-of-a-quantum
Repo
Framework
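
The abstract above contrasts the QRNG with a congruential RNG. As a rough illustration of why congruential output is predictable, the sketch below implements a linear congruential generator and a one-step predictor. It is not the paper's ML analysis: the constants are the commonly cited Numerical Recipes parameters, chosen here as an assumption, and `lcg` and `predict_next` are hypothetical helper names.

```python
from itertools import islice
from typing import Iterator

# Assumed parameters (Numerical Recipes LCG); the paper does not state which
# congruential generator it used.
A, C, M = 1664525, 1013904223, 2**32


def lcg(seed: int, a: int = A, c: int = C, m: int = M) -> Iterator[int]:
    """Yield the sequence x_{k+1} = (a * x_k + c) mod m, starting from `seed`."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state


def predict_next(current: int, a: int = A, c: int = C, m: int = M) -> int:
    """Given one full output and known parameters, the next output is fixed."""
    return (a * current + c) % m


outputs = list(islice(lcg(seed=42), 5))
# Every output after the first is reproduced exactly by the predictor.
matches = [predict_next(x) == y for x, y in zip(outputs, outputs[1:])]
```

Here every entry of `matches` is `True` because the generator is fully deterministic once its state is known, which is the kind of structure a predictive model can latch onto.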

Sub-frame Appearance and 6D Pose Estimation of Fast Moving Objects


Title	Sub-frame Appearance and 6D Pose Estimation of Fast Moving Objects
Authors	Denys Rozumnyi, Jan Kotera, Filip Sroubek, Jiri Matas
Abstract	We propose a novel method that tracks fast moving objects, mainly non-uniform spherical, in full 6 degrees of freedom, estimating simultaneously their 3D motion trajectory, 3D pose and object appearance changes with a time step that is a fraction of the video frame exposure time. The sub-frame object localization and appearance estimation allows realistic temporal super-resolution and precise shape estimation. The method, called TbD-3D (Tracking by Deblatting in 3D) relies on a novel reconstruction algorithm which solves a piece-wise deblurring and matting problem. The 3D rotation is estimated by minimizing the reprojection error. As a second contribution, we present a new challenging dataset with fast moving objects that change their appearance and distance to the camera. High speed camera recordings with zero lag between frame exposures were used to generate videos with different frame rates annotated with ground-truth trajectory and pose.
Tasks	6D Pose Estimation, Deblurring, Object Localization, Pose Estimation, Super-Resolution
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10927v1
PDF	https://arxiv.org/pdf/1911.10927v1.pdf
PWC	https://paperswithcode.com/paper/sub-frame-appearance-and-6d-pose-estimation
Repo
Framework

Single Class Universum-SVM


Title	Single Class Universum-SVM
Authors	Sauptik Dhar, Vladimir Cherkassky
Abstract	This paper extends the idea of Universum learning [1, 2] to single-class learning problems. We propose a Single Class Universum-SVM setting that incorporates a priori knowledge (in the form of additional data samples) into the single class estimation problem. These additional data samples, or Universum, belong to the same application domain as (positive) data samples from a single class (of interest), but they follow a different distribution. The proposed methodology for single class U-SVM is based on the known connection between binary classification and single class learning formulations [3]. Several empirical comparisons are presented to illustrate the utility of the proposed approach.
Tasks
Published	2019-09-21
URL	https://arxiv.org/abs/1909.09862v1
PDF	https://arxiv.org/pdf/1909.09862v1.pdf
PWC	https://paperswithcode.com/paper/190909862
Repo
Framework

Classifying topological sector via machine learning


Title	Classifying topological sector via machine learning
Authors	Masakiyo Kitazawa, Takuya Matsumoto, Yasuhiro Kohno
Abstract	We employ a machine learning technique for an estimate of the topological charge $Q$ of gauge configurations in SU(3) Yang-Mills theory in vacuum. As a first trial, we feed the four-dimensional topological charge density with and without smoothing into the convolutional neural network and train it to estimate the value of $Q$. We find that the trained neural network can estimate the value of $Q$ from the topological charge density at small flow time with high accuracy. Next, we perform the dimensional reduction of the input data as a preprocessing and analyze lower dimensional data by the neural network. We find that the accuracy of the neural network does not have statistically-significant dependence on the dimension of the input data. From this result we argue that the neural network does not find characteristic features responsible for the determination of $Q$ in the higher dimensional space.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12410v1
PDF	https://arxiv.org/pdf/1912.12410v1.pdf
PWC	https://paperswithcode.com/paper/classifying-topological-sector-via-machine
Repo
Framework

Generalized Data Augmentation for Low-Resource Translation


Title	Generalized Data Augmentation for Low-Resource Translation
Authors	Mengzhou Xia, Xiang Kong, Antonios Anastasopoulos, Graham Neubig
Abstract	Translation to or from low-resource languages (LRLs) poses challenges for machine translation in terms of both adequacy and fluency. Data augmentation utilizing large amounts of monolingual data is regarded as an effective way to alleviate these problems. In this paper, we propose a general framework for data augmentation in low-resource machine translation that not only uses target-side monolingual data, but also pivots through a related high-resource language (HRL). Specifically, we experiment with a two-step pivoting method to convert high-resource data to the LRL, making use of available resources to better approximate the true data distribution of the LRL. First, we inject LRL words into HRL sentences through an induced bilingual dictionary. Second, we further edit these modified sentences using a modified unsupervised machine translation framework. Extensive experiments on four low-resource datasets show that under extreme low-resource settings, our data augmentation techniques improve translation quality by up to 1.5 to 8 BLEU points compared to supervised back-translation baselines.
Tasks	Data Augmentation, Machine Translation, Unsupervised Machine Translation
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03785v1
PDF	https://arxiv.org/pdf/1906.03785v1.pdf
PWC	https://paperswithcode.com/paper/generalized-data-augmentation-for-low
Repo
Framework
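
The first pivoting step described in the abstract above, injecting LRL words into HRL sentences through a bilingual dictionary, can be sketched as a token-level substitution. This is a minimal illustration under assumptions: whitespace tokenization, a one-to-one dictionary, and placeholder tokens rather than real vocabulary. The function name `inject_lrl_words` is hypothetical and the paper's actual procedure may differ.

```python
from typing import Dict


def inject_lrl_words(hrl_sentence: str, hrl_to_lrl: Dict[str, str]) -> str:
    """Replace each HRL token that has a dictionary entry with its LRL translation.

    Tokens without an entry are left unchanged, so the output is a mixed
    HRL/LRL sentence.
    """
    tokens = hrl_sentence.split()
    return " ".join(hrl_to_lrl.get(token, token) for token in tokens)


# Placeholder vocabulary (assumption): "h_*" stands for HRL words, "l_*" for LRL words.
toy_dictionary = {"h_house": "l_house", "h_red": "l_red"}
mixed = inject_lrl_words("h_the h_red h_house", toy_dictionary)
```

With this toy dictionary, `mixed` keeps `h_the` and swaps in `l_red` and `l_house`. Per the abstract, such partially converted sentences are then edited further by a modified unsupervised translation model.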

A survey of advances in vision-based vehicle re-identification


Title	A survey of advances in vision-based vehicle re-identification
Authors	Sultan Daud Khan, Habib Ullah
Abstract	Vehicle re-identification (V-reID) has become significantly popular in the community due to its applications and research significance. In particular, V-reID is an important problem that still faces numerous open challenges. This paper reviews different V-reID methods including sensor based methods, hybrid methods, and vision based methods, which are further categorized into hand-crafted feature based methods and deep feature based methods. The vision based methods make the V-reID problem particularly interesting, and our review systematically addresses and evaluates these methods for the first time. We conduct experiments on four comprehensive benchmark datasets and compare the performances of recent hand-crafted feature based methods and deep feature based methods. We present a detailed analysis of these methods in terms of mean average precision (mAP) and cumulative matching curve (CMC). These analyses provide objective insight into the strengths and weaknesses of these methods. We also provide the details of different V-reID datasets and critically discuss the challenges and future trends of V-reID methods.
Tasks	Vehicle Re-Identification
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13258v1
PDF	https://arxiv.org/pdf/1905.13258v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-of-advances-in-vision-based-vehicle
Repo
Framework
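
The two metrics named in the abstract above, mAP and CMC, can be computed from per-query ranked match lists. The sketch below uses the common retrieval definitions and assumes each query's gallery is already sorted by similarity and reduced to a list of booleans (`True` means the gallery item at that rank has the same identity). The function names are hypothetical, and the survey's exact evaluation protocol may differ.

```python
from typing import List, Sequence


def average_precision(relevant: Sequence[bool]) -> float:
    """Average precision for one query: mean of precision@k over ranks k that hit."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(relevant, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """mAP: average precision averaged over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def cmc_curve(rankings: Sequence[Sequence[bool]], max_rank: int) -> List[float]:
    """CMC value at rank k: fraction of queries with a correct match in the top k."""
    counts = [0] * max_rank
    for ranking in rankings:
        first_hit = next((i for i, m in enumerate(ranking) if m), None)
        if first_hit is not None and first_hit < max_rank:
            for k in range(first_hit, max_rank):
                counts[k] += 1
    return [count / len(rankings) for count in counts]


# Two toy queries: the first matches at ranks 1 and 3, the second at rank 2.
toy_rankings = [[True, False, True], [False, True, False]]
toy_map = mean_average_precision(toy_rankings)
toy_cmc = cmc_curve(toy_rankings, max_rank=3)
```

For the toy input the per-query average precisions are 5/6 and 1/2, so `toy_map` is 2/3, and `toy_cmc` is `[0.5, 1.0, 1.0]` because only one of the two queries has a correct match at rank 1.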

Addressing database variability in learning from medical data: an ensemble-based approach using convolutional neural networks and a case of study applied to automatic sleep scoring


Title	Addressing database variability in learning from medical data: an ensemble-based approach using convolutional neural networks and a case of study applied to automatic sleep scoring
Authors	Diego Alvarez-Estevez, Isaac Fernández-Varela
Abstract	In this work we examine some of the problems associated with the development of machine learning models with the objective to achieve robust generalization capabilities on common-task multiple-database scenarios. Referred to as the “database variability problem”, we focus on a specific medical domain (sleep staging in sleep medicine) to show the non-triviality of translating the estimated model’s local generalization capabilities into independent external databases. We analyze some of the scalability problems when multiple-database data are used as inputs to train a single learning model. Then, we introduce a novel approach based on an ensemble of local models, and we show its advantages in terms of inter-database generalization performance and data scalability. In addition, we analyze different model configurations and data pre-processing techniques to determine their effects on the overall generalization performance. For this purpose, we carry out experimentation that involves several sleep databases and evaluates different machine learning models based on convolutional neural networks.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06666v3
PDF	https://arxiv.org/pdf/1906.06666v3.pdf
PWC	https://paperswithcode.com/paper/dealing-with-the-database-variability-problem
Repo
Framework

A Genetic Algorithm based Kernel-size Selection Approach for a Multi-column Convolutional Neural Network


Title	A Genetic Algorithm based Kernel-size Selection Approach for a Multi-column Convolutional Neural Network
Authors	Animesh Singh, Sandip Saha, Ritesh Sarkhel, Mahantapas Kundu, Mita Nasipuri, Nibaran Das
Abstract	Deep neural network-based architectures give promising results in various domains including pattern recognition. Finding the optimal combination of the hyper-parameters of such a large-sized architecture is tedious and requires a large number of laboratory experiments. In particular, identifying the appropriate kernel size for a given deep learning architecture is a challenging and tedious task. Here, we introduce a genetic algorithm-based technique to reduce the effort of finding the optimal combination of a hyper-parameter (kernel size) of a convolutional neural network-based architecture. The method is evaluated on three popular datasets of different handwritten Bangla characters and digits. The implementation of the proposed methodology can be found at the following link: https://github.com/DeepQn/GA-Based-Kernel-Size.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12405v2
PDF	https://arxiv.org/pdf/1912.12405v2.pdf
PWC	https://paperswithcode.com/paper/a-genetic-algorithm-based-kernel-size
Repo
Framework

Analysis of Baseline Evolutionary Algorithms for the Packing While Travelling Problem


Title	Analysis of Baseline Evolutionary Algorithms for the Packing While Travelling Problem
Authors	Vahid Roostapour, Mojgan Pourhassan, Frank Neumann
Abstract	The performance of baseline Evolutionary Algorithms (EAs) on combinatorial problems has been studied rigorously. From the theoretical viewpoint, the literature extensively investigates the linear problems, while the theoretical analysis of the non-linear problems is still far behind. In this paper, variations of the Packing While Travelling (PWT) – also known as the non-linear knapsack problem – are studied as an attempt to analyse the behaviour of EAs on non-linear problems from a theoretical perspective. We investigate PWT for two cities and $n$ items with correlated weights and profits, using single-objective and multi-objective algorithms. Our results show that RLS_swap, which differs from the classical RLS by having the ability to swap two bits in one iteration, finds the optimal solution in $O(n^3)$ expected time. We also study an enhanced version of GSEMO, which uses a specific selection operator to deal with exponential population size, and prove that it finds the Pareto front in the same asymptotic expected time. In the case of uniform weights, the (1+1) EA is able to find the optimal solution in expected time $O(n^2\log{(\max\{n, p_{\max}\})})$, where $p_{\max}$ is the largest profit of the given items. We also perform an experimental analysis to complement our theoretical investigations and provide additional insights into the runtime behavior.
Tasks
Published	2019-02-13
URL	https://arxiv.org/abs/1902.04692v2
PDF	https://arxiv.org/pdf/1902.04692v2.pdf
PWC	https://paperswithcode.com/paper/analysis-of-baseline-evolutionary-algorithms
Repo
Framework
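
For readers unfamiliar with the (1+1) EA analysed in the abstract above, the following is a minimal sketch of the textbook algorithm: keep one bit string, flip each bit independently, and accept the offspring if it is at least as fit. The mutation rate of 1/n is the standard convention and an assumption here. The objective is the toy OneMax function (count of ones), not the Packing While Travelling objective, and the function names are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def one_plus_one_ea(
    fitness: Callable[[List[int]], float],
    n: int,
    max_iters: int,
    rng: random.Random,
) -> Tuple[List[int], float]:
    """Maximise `fitness` over bit strings of length n with the (1+1) EA."""
    parent = [rng.randint(0, 1) for _ in range(n)]
    parent_fitness = fitness(parent)
    for _ in range(max_iters):
        # Standard bit mutation: flip each bit independently with probability 1/n.
        child = [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in parent]
        child_fitness = fitness(child)
        # Elitist acceptance: the parent's fitness never decreases.
        if child_fitness >= parent_fitness:
            parent, parent_fitness = child, child_fitness
    return parent, parent_fitness


def one_max(bits: List[int]) -> int:
    """Toy objective (assumption): number of ones in the bit string."""
    return sum(bits)


best, best_fitness = one_plus_one_ea(one_max, n=20, max_iters=2000, rng=random.Random(0))
```

Because acceptance is elitist, `best_fitness` can only grow over the run. The runtime bounds quoted in the abstract concern how many such iterations are needed in expectation on PWT instances, which this toy objective does not model.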

The Ex-Ante View of Recommender System Design


Title	The Ex-Ante View of Recommender System Design
Authors	Guy Aridor, Duarte Goncalves, Shan Sikdar
Abstract	Recommender systems (RS) are traditionally deployed in environments where users are uncertain about their preferences and thus face a problem of choice under uncertainty, but most popular design approaches ignore this fact. We argue that predicting and modeling consumer choice in these contexts can improve the usefulness of RS and reframe the RS problem as providing useful information to help reduce user uncertainty as opposed to simply predicting user preferences. Using a theoretical model, we show how this insight can be utilized to design RS that mitigate negative consequences such as filter bubble and user-homogenization effects as well as to better understand the role that RS play in contributing to these phenomena.
Tasks	Recommendation Systems
Published	2019-04-23
URL	https://arxiv.org/abs/1904.10527v2
PDF	https://arxiv.org/pdf/1904.10527v2.pdf
PWC	https://paperswithcode.com/paper/the-ex-ante-view-of-recommender-system-design
Repo
Framework

Neural Puppet: Generative Layered Cartoon Characters


Title	Neural Puppet: Generative Layered Cartoon Characters
Authors	Omid Poursaeed, Vladimir G. Kim, Eli Shechtman, Jun Saito, Serge Belongie
Abstract	We propose a learning based method for generating new animations of a cartoon character given a few example images. Our method is designed to learn from a traditionally animated sequence, where each frame is drawn by an artist, and thus the input images lack any common structure, correspondences, or labels. We express pose changes as a deformation of a layered 2.5D template mesh, and devise a novel architecture that learns to predict mesh deformations matching the template to a target image. This enables us to extract a common low-dimensional structure from a diverse set of character poses. We combine recent advances in differentiable rendering as well as mesh-aware models to successfully align a common template even if only a few character images are available during training. In addition to coarse poses, character appearance also varies due to shading, out-of-plane motions, and artistic effects. We capture these subtle changes by applying an image translation network to refine the mesh rendering, providing an end-to-end model to generate new animations of a character with high visual quality. We demonstrate that our generative model can be used to synthesize in-between frames and to create data-driven deformation. Our template fitting procedure outperforms state-of-the-art generic techniques for detecting image correspondences.
Tasks
Published	2019-10-04
URL	https://arxiv.org/abs/1910.02060v2
PDF	https://arxiv.org/pdf/1910.02060v2.pdf
PWC	https://paperswithcode.com/paper/neural-puppet-generative-layered-cartoon
Repo
Framework

All-in-One Image-Grounded Conversational Agents


Title	All-in-One Image-Grounded Conversational Agents
Authors	Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston
Abstract	As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore. In this work, we focus on leveraging individual language and image tasks, along with resources that incorporate both vision and language towards that objective. We design an architecture that combines state-of-the-art Transformer and ResNeXt modules fed into a novel attentive multimodal module to produce a combined model trained on many tasks. We provide a thorough analysis of the components of the model, and transfer performance when training on one, some, or all of the tasks. Our final models provide a single system that obtains good results on all vision and language tasks considered, and improves the state-of-the-art in image-grounded conversational applications.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12394v2
PDF	https://arxiv.org/pdf/1912.12394v2.pdf
PWC	https://paperswithcode.com/paper/all-in-one-image-grounded-conversational
Repo
Framework