April 3, 2020

Paper Group ANR 1

Probabilistic Partitive Partitioning (PPP). A Comparative Study of Western and Chinese Classical Music based on Soundscape Models. MatchingGAN: Matching-based Few-shot Image Generation. Anysize GAN: A solution to the image-warping problem. Statistical and Topological Properties of Sliced Probability Divergences. SketchyCOCO: Image Generation from F …

Probabilistic Partitive Partitioning (PPP)

Title Probabilistic Partitive Partitioning (PPP)
Authors Mujahid Sultan
Abstract Clustering is an NP-hard problem. Thus, no efficient exact algorithm is known, and heuristics are applied to cluster the data. Heuristics can be very resource-intensive if not applied properly. For substantially large data sets, computational efficiencies can be achieved by reducing the input space, provided this can be done with minimal loss of information. Clustering algorithms, in general, face two common problems: 1) they converge to different settings under different initial conditions; and 2) the number of clusters has to be arbitrarily decided beforehand. These problems have become critical in the realm of big data. Recently, clustering algorithms have emerged which can speed up computations using parallel processing over the grid but still face the aforementioned problems. Goals: Our goals are to find methods to cluster data which: 1) guarantee convergence to the same settings irrespective of the initial conditions; 2) eliminate the need to establish the number of clusters beforehand; and 3) can be applied to cluster large datasets. Methods: We introduce a method that combines probabilistic and combinatorial clustering methods to produce repeatable and compact clusters that are not sensitive to initial conditions. This method harnesses the power of k-means (a combinatorial clustering method) to cluster/partition very large dimensional datasets and uses the Gaussian Mixture Model (a probabilistic clustering method) to validate the k-means partitions. Results: We show that this method produces very compact clusters that are not sensitive to initial conditions. This method can be used to identify the most ‘separable’ set in a dataset, which increases the ‘clusterability’ of a dataset. This method also eliminates the need to specify the number of clusters in advance.
Published 2020-03-09
URL https://arxiv.org/abs/2003.04372v1
PDF https://arxiv.org/pdf/2003.04372v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-partitive-partitioning-ppp
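
The abstract above describes pairing k-means partitions with a Gaussian mixture check. The following is only a toy sketch of that general idea, not the paper's algorithm: it runs a plain 1D k-means, fits one Gaussian per resulting cluster, and reports how often the most probable Gaussian component agrees with the k-means label. The function names (`kmeans_1d`, `gaussian_agreement`), the agreement score, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def kmeans_1d(xs, k, iters=50, seed=0):
    """Plain Lloyd-style k-means on 1D data; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(xs, k)
    labels = [0] * len(xs)
    for _ in range(iters):
        # Assignment step: nearest center by squared distance.
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in xs]
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [x for x, l in zip(xs, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


def gaussian_agreement(xs, labels, k):
    """Fit one weighted Gaussian per k-means cluster and return the fraction
    of points whose most probable component equals their k-means label.
    This score is a hypothetical validation measure, not the paper's."""
    n = len(xs)
    comps = {}
    for j in range(k):
        members = [x for x, l in zip(xs, labels) if l == j]
        if not members:
            continue  # skip empty clusters
        mu = sum(members) / len(members)
        var = sum((x - mu) ** 2 for x in members) / len(members) + 1e-6
        comps[j] = (len(members) / n, mu, var)

    def log_density(x, comp):
        w, mu, var = comp
        return math.log(w) - 0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

    agree = sum(
        1
        for x, l in zip(xs, labels)
        if max(comps, key=lambda j: log_density(x, comps[j])) == l
    )
    return agree / n


# Synthetic, well-separated data (an assumption made for the demo).
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(100)] + [
    data_rng.gauss(10.0, 1.0) for _ in range(100)
]
labels, centers = kmeans_1d(xs, k=2)
agreement = gaussian_agreement(xs, labels, k=2)
print("centers:", sorted(centers), "agreement:", agreement)
```

In this sketch, a low agreement score would suggest that the Gaussian model does not support the k-means partition. How the paper actually uses the mixture model to validate partitions or to select the number of clusters is not specified in the abstract.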

A Comparative Study of Western and Chinese Classical Music based on Soundscape Models

Title A Comparative Study of Western and Chinese Classical Music based on Soundscape Models
Authors Jianyu Fan, Yi-Hsuan Yang, Kui Dong, Philippe Pasquier
Abstract Whether literally or suggestively, the concept of soundscape is alluded to in both modern and ancient music. In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models. We addressed this question through a comparative study. Specifically, corpora of Western classical music excerpts (WCMED) and Chinese classical music excerpts (CCMED) were curated and annotated with emotional valence and arousal through a crowdsourcing experiment. We used sound event detection (SED) and soundscape emotion recognition (SER) models with transfer learning to predict the perceived emotion of WCMED and CCMED. The results show that both SER and SED models could be used to analyze Chinese and Western classical music. The fact that SER and SED work better on Chinese classical music emotion recognition provides evidence that certain similarities exist between Chinese classical music and soundscape recordings, which permits transferability between machine learning models.
Tasks Emotion Recognition, Music Emotion Recognition, Sound Event Detection, Transfer Learning
Published 2020-02-20
URL https://arxiv.org/abs/2002.09021v1
PDF https://arxiv.org/pdf/2002.09021v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-of-western-and-chinese

MatchingGAN: Matching-based Few-shot Image Generation

Title MatchingGAN: Matching-based Few-shot Image Generation
Authors Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang
Abstract To generate new images for a given category, most deep generative models require abundant training images from this category, which are often too expensive to acquire. To achieve the goal of generation based on only a few images, we propose a matching-based Generative Adversarial Network (GAN) for few-shot generation, which includes a matching generator and a matching discriminator. The matching generator can match random vectors with a few conditional images from the same category and generate new images for this category based on the fused features. The matching discriminator extends the conventional GAN discriminator by matching the feature of the generated image with the fused feature of the conditional images. Extensive experiments on three datasets demonstrate the effectiveness of our proposed method.
Tasks Image Generation
Published 2020-03-07
URL https://arxiv.org/abs/2003.03497v2
PDF https://arxiv.org/pdf/2003.03497v2.pdf
PWC https://paperswithcode.com/paper/matchinggan-matching-based-few-shot-image

Anysize GAN: A solution to the image-warping problem

Title Anysize GAN: A solution to the image-warping problem
Authors Connah Kendrick, David Gillespie, Moi Hoon Yap
Abstract We propose a new type of Generative Adversarial Network (GAN) to resolve a common issue with Deep Learning. We develop a novel architecture that can be applied to existing latent-vector-based GAN structures and allows them to generate on-the-fly images of any size. Existing GANs for image generation require uniform images of matching dimensions. However, publicly available datasets, such as ImageNet, contain images of thousands of different sizes. Resizing images causes deformations and changes the image data, whereas our network does not require this preprocessing step. We make significant changes to the standard data loading techniques to enable images of any size to be loaded for training. We also modify the network in two ways, by adding multiple inputs and a novel dynamic resizing layer. Finally, we make adjustments to the discriminator to work on multiple resolutions. These changes can allow multiple-resolution datasets to be trained on without any resizing, if memory allows. We validate our results on the ISIC 2019 skin lesion dataset. We demonstrate that our method can successfully generate realistic images at different sizes without issue, preserving and understanding spatial relationships, while maintaining feature relationships. We will release the source code upon paper acceptance.
Tasks Image Generation
Published 2020-03-06
URL https://arxiv.org/abs/2003.03233v1
PDF https://arxiv.org/pdf/2003.03233v1.pdf
PWC https://paperswithcode.com/paper/anysize-gan-a-solution-to-the-image-warping

Statistical and Topological Properties of Sliced Probability Divergences

Title Statistical and Topological Properties of Sliced Probability Divergences
Authors Kimia Nadjahi, Alain Durmus, Lénaïc Chizat, Soheil Kolouri, Shahin Shahrampour, Umut Şimşekli
Abstract The idea of slicing divergences has been proven to be successful when comparing two probability measures in various machine learning applications including generative modeling, and consists in computing the expected value of a ‘base divergence’ between one-dimensional random projections of the two measures. However, the computational and statistical consequences of such a technique have not yet been well-established. In this paper, we aim at bridging this gap and derive some properties of sliced divergence functions. First, we show that slicing preserves the metric axioms and the weak continuity of the divergence, implying that the sliced divergence will share similar topological properties. We then make the results more precise in the case where the base divergence belongs to the class of integral probability metrics. On the other hand, we establish that, under mild conditions, the sample complexity of the sliced divergence does not depend on the dimension, even when the base divergence suffers from the curse of dimensionality. We finally apply our general results to the Wasserstein distance and Sinkhorn divergences, and illustrate our theory on both synthetic and real data experiments.
Published 2020-03-12
URL https://arxiv.org/abs/2003.05783v1
PDF https://arxiv.org/pdf/2003.05783v1.pdf
PWC https://paperswithcode.com/paper/statistical-and-topological-properties-of
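
The slicing construction described in the abstract (averaging a base divergence over one-dimensional random projections) can be illustrated with a small Monte Carlo sketch. The example below assumes the base divergence is the 1-Wasserstein distance between equal-size empirical samples, which in one dimension reduces to the mean absolute difference of the sorted values. The function names, the number of projections, and the toy data are illustrative choices and are not taken from the paper.

```python
import math
import random


def wasserstein_1d(u, v):
    """W1 between two equal-size 1D empirical samples:
    mean absolute difference of the sorted values."""
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere via normalized Gaussians."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]


def sliced_wasserstein(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of the sliced W1 distance between two samples
    of equal size, each given as a list of equal-length coordinate lists."""
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # Project both samples onto the same random direction.
        px = [sum(t * c for t, c in zip(theta, x)) for x in xs]
        py = [sum(t * c for t, c in zip(theta, y)) for y in ys]
        total += wasserstein_1d(px, py)
    return total / n_projections


# Toy data: a 5-dimensional Gaussian sample and a shifted copy of it.
data_rng = random.Random(42)
xs = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(64)]
ys = [[c + 2.0 for c in x] for x in xs]

same = sliced_wasserstein(xs, xs)
shifted = sliced_wasserstein(xs, ys)
print("same sample:", same, "shifted sample:", shifted)
```

Each projection only needs a sort of one-dimensional values, which is the practical appeal of slicing. The statistical claims in the abstract (dimension-free sample complexity under mild conditions) are results of the paper and are not demonstrated by this sketch.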

SketchyCOCO: Image Generation from Freehand Scene Sketches

Title SketchyCOCO: Image Generation from Freehand Scene Sketches
Authors Chengying Gao, Qi Liu, Qi Xu, Limin Wang, Jianzhuang Liu, Changqing Zou
Abstract We introduce the first method for automatic image generation from scene-level freehand sketches. Our model allows for controllable image generation by specifying the synthesis goal via freehand sketches. The key contribution is an attribute vector bridged Generative Adversarial Network called EdgeGAN, which supports high visual-quality object-level image content generation without using freehand sketches as training data. We have built a large-scale composite dataset called SketchyCOCO to support and evaluate the solution. We validate our approach on the tasks of both object-level and scene-level image generation on SketchyCOCO. Through quantitative and qualitative results, human evaluation, and ablation studies, we demonstrate the method’s capacity to generate realistic complex scene-level images from various freehand sketches.
Tasks Image Generation
Published 2020-03-05
URL https://arxiv.org/abs/2003.02683v4
PDF https://arxiv.org/pdf/2003.02683v4.pdf
PWC https://paperswithcode.com/paper/image-generation-from-freehand-scene-sketches

Unbiased variable importance for random forests

Title Unbiased variable importance for random forests
Authors Markus Loecher
Abstract The default variable-importance measure in random forests, Gini importance, has been shown to suffer from the bias of the underlying Gini-gain splitting criterion. While the alternative permutation importance is generally accepted as a reliable measure of variable importance, it is also computationally demanding and suffers from other shortcomings. We propose a simple solution to the misleading/untrustworthy Gini importance which can be viewed as an overfitting problem: we compute the loss reduction on the out-of-bag instead of the in-bag training samples.
Published 2020-03-04
URL https://arxiv.org/abs/2003.02106v2
PDF https://arxiv.org/pdf/2003.02106v2.pdf
PWC https://paperswithcode.com/paper/unbiased-variable-importance-for-random
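
To make the in-bag versus out-of-bag distinction concrete, here is a minimal sketch under simplifying assumptions: a single split on one pure-noise feature with binary labels, with the threshold chosen to maximize Gini gain on a bootstrap (in-bag) sample and the same split then re-scored on the out-of-bag points. This is one simple reading of "compute the loss reduction on the out-of-bag samples"; the paper's actual estimator may differ, and all function names are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity 2p(1-p) for binary labels in {0, 1}."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def gini_gain(xs, ys, threshold):
    """Impurity reduction from splitting the sample at x <= threshold."""
    n = len(ys)
    if n == 0:
        return 0.0
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return gini(ys) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)


def best_threshold(xs, ys):
    """Threshold among observed values with the largest Gini gain."""
    return max(sorted(set(xs)), key=lambda t: gini_gain(xs, ys, t))


rng = random.Random(0)
n, trials = 100, 300
inbag_gains, oob_gains = [], []
for _ in range(trials):
    # Feature and label are independent, so the true importance is zero.
    xs = [rng.random() for _ in range(n)]
    ys = [rng.randint(0, 1) for _ in range(n)]
    chosen = [rng.randrange(n) for _ in range(n)]  # bootstrap indices
    chosen_set = set(chosen)
    in_x = [xs[i] for i in chosen]
    in_y = [ys[i] for i in chosen]
    out_x = [xs[i] for i in range(n) if i not in chosen_set]
    out_y = [ys[i] for i in range(n) if i not in chosen_set]
    t = best_threshold(in_x, in_y)
    inbag_gains.append(gini_gain(in_x, in_y, t))
    oob_gains.append(gini_gain(out_x, out_y, t))

mean_inbag = sum(inbag_gains) / trials
mean_oob = sum(oob_gains) / trials
print("mean in-bag gain:", mean_inbag, "mean out-of-bag gain:", mean_oob)
```

Because the threshold is tuned on the in-bag sample, its in-bag gain overstates the value of a feature that carries no signal. Note that the out-of-bag gain computed this way is still non-negative on any finite sample, so it is reduced rather than exactly zero on average; how the paper handles that residual is not stated in the abstract.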

Hierarchical Modes Exploring in Generative Adversarial Networks

Title Hierarchical Modes Exploring in Generative Adversarial Networks
Authors Mengxiao Hu, Jinlong Li, Maolin Hu, Tao Hu
Abstract In conditional Generative Adversarial Networks (cGANs), when two different initial noises are concatenated with the same conditional information, the distance between their outputs is relatively small, which makes minor modes likely to collapse into large modes. To prevent this from happening, we proposed a hierarchical mode exploring method to alleviate mode collapse in cGANs by introducing a diversity measurement into the objective function as the regularization term. We also introduced the Expected Ratios of Expansion (ERE) into the regularization term; by minimizing the sum of differences between the real change of distance and ERE, we can control the diversity of generated images w.r.t. specific-level features. We validated the proposed algorithm on four conditional image synthesis tasks including categorical generation, paired and unpaired image translation, and text-to-image generation. Both qualitative and quantitative results show that the proposed method is effective in alleviating the mode collapse problem in cGANs, and can control the diversity of output images w.r.t. specific-level features.
Tasks Image Generation, Text-to-Image Generation
Published 2020-03-05
URL https://arxiv.org/abs/2003.08752v1
PDF https://arxiv.org/pdf/2003.08752v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-modes-exploring-in-generative

Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning

Title Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning
Authors Rui Zhou, Shiji Song, Anke Xue, Keyou You, Hu Wu
Abstract During recent decades, the automatic train operation (ATO) system has been gradually adopted in many subway systems. On the one hand, it is more intelligent than traditional manual driving; on the other hand, it increases the energy consumption and decreases the riding comfort of the subway system. This paper proposes two smart train operation algorithms based on the combination of expert knowledge and reinforcement learning algorithms. Compared with previous works, smart train operation algorithms can realize the control of continuous action for the subway system and satisfy multiple objectives (the safety, the punctuality, the energy efficiency, and the riding comfort) without using an offline optimized speed profile. Firstly, through analyzing historical data of experienced subway drivers, we summarize the expert knowledge rules and build inference methods to guarantee the riding comfort, the punctuality and the safety of the subway system. Then we develop two algorithms to realize the control of continuous action and to ensure the energy efficiency of train operation. One is the smart train operation (STO) algorithm based on deep deterministic policy gradient (STOD), and the other is the smart train operation algorithm based on normalized advantage function (STON). Finally, we verify the performance of the proposed algorithms via numerical simulations with real field data collected from the Yizhuang Line of the Beijing Subway, and compare their performance with existing ATO algorithms. The results of the numerical simulations show that the developed smart train operation systems are better than manual driving and existing ATO algorithms with respect to energy efficiency. In addition, STOD and STON have the ability to adapt to different trip times and different resistance conditions.
Published 2020-03-06
URL https://arxiv.org/abs/2003.03327v1
PDF https://arxiv.org/pdf/2003.03327v1.pdf
PWC https://paperswithcode.com/paper/smart-train-operation-algorithms-based-on

Unlimited Resolution Image Generation with R2D2-GANs

Title Unlimited Resolution Image Generation with R2D2-GANs
Authors Marija Jegorova, Antti Ilari Karjalainen, Jose Vazquez, Timothy M. Hospedales
Abstract In this paper we present a novel simulation technique for generating high-quality images of any predefined resolution. This method can be used to synthesize sonar scans of size equivalent to those collected during a full-length mission, with across-track resolutions of any chosen magnitude. In essence, our model extends a Generative Adversarial Network (GAN) based architecture into a conditional recursive setting that facilitates the continuity of the generated images. The data produced is continuous and realistic-looking, and can also be generated at least two times faster than the real speed of acquisition for the sonars with higher resolutions, such as EdgeTech. The seabed topography can be fully controlled by the user. The visual assessment tests demonstrate that humans cannot distinguish the simulated images from real ones. Moreover, experimental results suggest that, in the absence of real data, autonomous recognition systems can benefit greatly from training with the synthetic data produced by the R2D2-GANs.
Tasks Image Generation
Published 2020-03-02
URL https://arxiv.org/abs/2003.01063v1
PDF https://arxiv.org/pdf/2003.01063v1.pdf
PWC https://paperswithcode.com/paper/unlimited-resolution-image-generation-with

Cross-Spectrum Dual-Subspace Pairing for RGB-infrared Cross-Modality Person Re-Identification

Title Cross-Spectrum Dual-Subspace Pairing for RGB-infrared Cross-Modality Person Re-Identification
Authors Xing Fan, Hao Luo, Chi Zhang, Wei Jiang
Abstract Due to its potential wide applications in video surveillance and other computer vision tasks like tracking, person re-identification (ReID) has become popular and been widely investigated. However, conventional person re-identification can only handle RGB color images, which fails in dark conditions. Thus RGB-infrared ReID (also known as Infrared-Visible ReID or Visible-Thermal ReID) is proposed. Apart from appearance discrepancy in traditional ReID caused by illumination, pose variations and viewpoint changes, modality discrepancy produced by cameras of different spectra also exists, which makes RGB-infrared ReID more difficult. To address this problem, we focus on extracting the shared cross-spectrum features of different modalities. In this paper, a novel multi-spectrum image generation method is proposed and the generated samples are utilized to help the network find discriminative information for re-identifying the same person across modalities. Another challenge of RGB-infrared ReID is that the intra-person (images from the same person) discrepancy is often larger than the inter-person (images from different persons) discrepancy, so a dual-subspace pairing strategy is proposed to alleviate this problem. Combining these two parts, we design a one-stream neural network, called the Cross-spectrum Dual-subspace Pairing (CDP) model, to extract compact representations of person images. Furthermore, during the training process, we also propose a Dynamic Hard Spectrum Mining method to automatically mine more hard samples from hard spectrum based on the current model state to further boost the performance. Extensive experimental results on two public datasets, SYSU-MM01 with RGB + near-infrared images and RegDB with RGB + far-infrared images, have demonstrated the efficiency and generality of our proposed method.
Tasks Image Generation, Person Re-Identification
Published 2020-02-29
URL https://arxiv.org/abs/2003.00213v1
PDF https://arxiv.org/pdf/2003.00213v1.pdf
PWC https://paperswithcode.com/paper/cross-spectrum-dual-subspace-pairing-for-rgb

Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems

Title Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems
Authors Nikolay Malkovsky, Vladimir Bataev, Dmitrii Sviridkin, Natalia Kizhaeva, Aleksandr Laptev, Ildar Valiev, Oleg Petrov
Abstract The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system. Hybrid systems are usually constructed to recognize a fixed set of words and rarely can include all the words that will be encountered during use of the system. One popular approach to cover OOVs is to use subword units rather than words. Such a system can potentially recognize any previously unseen word if the word can be constructed from the present subword units, but non-existing words can also be recognized. The other popular approach is to modify the HMM part of the system so that it can be easily and effectively expanded with a custom set of words we want to add to the system. In this paper we explore different existing methods of this solution on both graph construction and search method levels. We also present novel vocabulary expansion techniques which solve some common internal subroutine problems regarding recognition graph processing.
Tasks Graph Construction, Speech Recognition
Published 2020-03-19
URL https://arxiv.org/abs/2003.09024v1
PDF https://arxiv.org/pdf/2003.09024v1.pdf
PWC https://paperswithcode.com/paper/techniques-for-vocabulary-expansion-in-hybrid

A Framework for Democratizing AI

Title A Framework for Democratizing AI
Authors Shakkeel Ahmed, Ravi S. Mula, Soma S. Dhavala
Abstract Machine Learning and Artificial Intelligence are considered an integral part of the Fourth Industrial Revolution. Their impact, and far-reaching consequences, while acknowledged, are yet to be comprehended. These technologies are very specialized, and few organizations and select highly trained professionals have the wherewithal, in terms of money, manpower, and might, to chart the future. However, concentration of power can lead to marginalization, causing severe inequalities. Regulatory agencies and governments across the globe are creating national policies and laws around these technologies to protect the rights of the digital citizens, as well as to empower them. Even private, not-for-profit organizations are contributing to democratizing the technologies by making them accessible and affordable. However, accessibility and affordability are but a few of the facets of democratizing the field. Others include, but are not limited to, portability, explainability, credibility, and fairness. As one can imagine, democratizing AI is a multi-faceted problem, and it requires advancements in science, technology and policy. At mlsquare, we are developing scientific tools in this space. Specifically, we introduce an opinionated, extensible Python framework that provides a single point of interface to a variety of solutions in each of the categories mentioned above. We present the design details, APIs of the framework, reference implementations, road map for development, and guidelines for contributions.
Published 2020-01-01
URL https://arxiv.org/abs/2001.00818v1
PDF https://arxiv.org/pdf/2001.00818v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-democratizing-ai

Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling

Title Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling
Authors Setareh Ariafar, Zelda Mariet, Ehsan Elhamifar, Dana Brooks, Jennifer Dy, Jasper Snoek
Abstract Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. We propose to accelerate the Bayesian optimization approach to hyperparameter tuning for neural networks by taking into account the relative amount of information contributed by each training example. To do so, we leverage importance sampling (IS); this significantly increases the quality of the black-box function evaluations, but also their runtime, and so must be done carefully. Casting hyperparameter search as a multi-task Bayesian optimization problem over both hyperparameters and importance sampling design achieves the best of both worlds: by learning a parameterization of IS that trades off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
Published 2020-02-23
URL https://arxiv.org/abs/2002.09927v1
PDF https://arxiv.org/pdf/2002.09927v1.pdf
PWC https://paperswithcode.com/paper/weighting-is-worth-the-wait-bayesian
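
The importance-sampling idea mentioned in the abstract can be sketched in its most generic form: draw training examples with non-uniform probabilities and reweight each drawn value by 1 / (N * p_i) so that the batch average remains an unbiased estimate of the full-dataset mean. The snippet below is only that textbook estimator applied to a list of per-example losses. The function name, the toy losses, and the choice of sampling weights are assumptions; the paper's specific IS parameterization and its coupling with Bayesian optimization are not reproduced here.

```python
import math
import random


def importance_sampled_mean(values, weights, batch_size, rng):
    """Estimate mean(values) from a batch drawn with probability proportional
    to `weights` (all weights must be positive), reweighting each drawn value
    by 1 / (N * p_i) to keep the estimate unbiased."""
    n = len(values)
    total_w = sum(weights)
    probs = [w / total_w for w in weights]
    idx = rng.choices(range(n), weights=probs, k=batch_size)
    return sum(values[i] / (n * probs[i]) for i in idx) / batch_size


# Toy per-example losses: a few hard examples dominate the mean.
losses = [0.1, 0.2, 5.0, 0.1, 0.3, 4.0, 0.2, 0.1]
true_mean = sum(losses) / len(losses)

rng = random.Random(0)
# Uniform weights recover the ordinary minibatch average.
uniform_estimate = importance_sampled_mean(losses, [1.0] * len(losses), 4, rng)
# Sampling proportionally to the loss itself is the idealized zero-variance
# case: every reweighted draw equals the true mean.
ideal_estimate = importance_sampled_mean(losses, losses, 4, rng)

print("true mean:", true_mean)
print("uniform estimate:", uniform_estimate)
print("loss-proportional estimate:", ideal_estimate)
```

In practice the per-example losses are not known in advance, so any real scheme has to use a cheaper proxy for the weights, and computing that proxy adds runtime. That is the quality-versus-cost trade-off the abstract refers to.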

Deep Joint Transmission-Recognition for Power-Constrained IoT Devices

Title Deep Joint Transmission-Recognition for Power-Constrained IoT Devices
Authors Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk
Abstract We propose a joint transmission-recognition scheme for efficient inference at the wireless network edge. Our scheme allows for reliable image recognition over wireless channels with significant computational load reduction at the sender side. We incorporate the recently proposed deep joint source-channel coding (JSCC) scheme, and combine it with novel filter pruning strategies aimed at reducing the redundant complexity from neural networks. We evaluate our approach on a classification task, and show satisfactory results in both transmission reliability and workload reduction. This is the first work that combines deep JSCC with network pruning and applies it to image classification over a wireless network.
Tasks Network Pruning
Published 2020-03-04
URL https://arxiv.org/abs/2003.02027v1
PDF https://arxiv.org/pdf/2003.02027v1.pdf
PWC https://paperswithcode.com/paper/deep-joint-transmission-recognition-for-power