January 28, 2020

Paper Group ANR 800

Noisy multi-label semi-supervised dimensionality reduction. a novel cross-lingual voice cloning approach with a few text-free samples. Spectral-GANs for High-Resolution 3D Point-cloud Generation. Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network. Transferring knowledge from monitored to unmonitored areas for …

Noisy multi-label semi-supervised dimensionality reduction

Title Noisy multi-label semi-supervised dimensionality reduction
Authors Karl Øyvind Mikalsen, Cristina Soguero-Ruiz, Filippo Maria Bianchi, Robert Jenssen
Abstract Noisy labeled data represent a rich source of information that often are easily accessible and cheap to obtain, but label noise might also have many negative consequences if not accounted for. How to fully utilize noisy labels has been studied extensively within the framework of standard supervised machine learning over a period of several decades. However, very little research has been conducted on solving the challenge posed by noisy labels in non-standard settings. This includes situations where only a fraction of the samples are labeled (semi-supervised) and each high-dimensional sample is associated with multiple labels. In this work, we present a novel semi-supervised and multi-label dimensionality reduction method that effectively utilizes information from both noisy multi-labels and unlabeled data. With the proposed Noisy multi-label semi-supervised dimensionality reduction (NMLSDR) method, the noisy multi-labels are denoised and unlabeled data are labeled simultaneously via a specially designed label propagation algorithm. NMLSDR then learns a projection matrix for reducing the dimensionality by maximizing the dependence between the enlarged and denoised multi-label space and the features in the projected space. Extensive experiments on synthetic data, benchmark datasets, as well as a real-world case study, demonstrate the effectiveness of the proposed algorithm and show that it outperforms state-of-the-art multi-label feature extraction algorithms.
Tasks Dimensionality Reduction
Published 2019-02-20
URL http://arxiv.org/abs/1902.07517v1
PDF http://arxiv.org/pdf/1902.07517v1.pdf
PWC https://paperswithcode.com/paper/noisy-multi-label-semi-supervised
Repo
Framework
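
The abstract above describes denoising noisy multi-labels and labeling unlabeled samples through label propagation. The paper's specially designed algorithm is not spelled out here, so the following is only a minimal sketch of generic graph-based label propagation on a Gaussian affinity graph. The function name `propagate_labels` and the parameters `sigma`, `alpha`, and `n_iter` are assumptions chosen for illustration, not the authors' method.

```python
import numpy as np

def propagate_labels(X, Y0, sigma=1.0, alpha=0.9, n_iter=50):
    """Generic label propagation (illustrative, not the NMLSDR algorithm).

    X  : (n, d) feature matrix.
    Y0 : (n, c) initial multi-label matrix; rows of zeros mark unlabeled samples.
    Returns an (n, c) matrix of soft label scores.
    """
    # Pairwise squared distances and Gaussian affinities.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))

    # Iteratively mix neighbor scores with the initial (possibly noisy) labels.
    F = Y0.astype(float)
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y0
    return F
```

With `alpha` close to 1 the scores are dominated by the graph neighborhood, which is what lets noisy initial labels be overridden by their neighbors; thresholding the returned scores would give the enlarged label set.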

a novel cross-lingual voice cloning approach with a few text-free samples

Title a novel cross-lingual voice cloning approach with a few text-free samples
Authors Xinyong Zhou, Hao Che, Xiaorui Wang, Lei Xie
Abstract In this paper, we present a cross-lingual voice cloning approach. BN features obtained by an SI-ASR model are used as a bridge across speakers and language boundaries. The relationships between text and BN features are modeled by the latent prosody model. The acoustic model learns the translation from BN features to acoustic features. The acoustic model is fine-tuned with a few samples of the target speaker to realize voice cloning. This system can generate speech for arbitrary utterances of the target language in cross-lingual speakers’ voices. We verify that, with a small amount of audio data, our proposed approach handles cross-lingual tasks well. In intra-lingual tasks, our proposed approach also performs better than the baseline approach in naturalness and similarity.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13276v2
PDF https://arxiv.org/pdf/1910.13276v2.pdf
PWC https://paperswithcode.com/paper/191013276
Repo
Framework

Spectral-GANs for High-Resolution 3D Point-cloud Generation

Title Spectral-GANs for High-Resolution 3D Point-cloud Generation
Authors Sameera Ramasinghe, Salman Khan, Nick Barnes, Stephen Gould
Abstract Point-clouds are a popular choice for vision and graphics tasks due to their accurate shape description and direct acquisition from range-scanners. This demands the ability to synthesize and reconstruct high-quality point-clouds. Current deep generative models for 3D data generally work on simplified representations (e.g., voxelized objects) and cannot deal with the inherent redundancy and irregularity in point-clouds. A few recent efforts on 3D point-cloud generation offer limited resolution and their complexity grows with the increase in output resolution. In this paper, we develop a principled approach to synthesize 3D point-clouds using a spectral-domain Generative Adversarial Network (GAN). Our spectral representation is highly structured and allows us to disentangle various frequency bands such that the learning task is simplified for a GAN model. As compared to spatial-domain generative approaches, our formulation allows us to generate high-resolution point-clouds with an arbitrary number of points at minimal computational overhead. Furthermore, we propose a fully differentiable block to transform from the spectral to the spatial domain and back, thereby allowing us to integrate knowledge from well-established spatial models. We demonstrate that Spectral-GAN performs well on the point-cloud generation task. Additionally, it can learn a highly discriminative representation in an unsupervised fashion and can be used to accurately reconstruct 3D objects.
Tasks Point Cloud Generation
Published 2019-12-04
URL https://arxiv.org/abs/1912.01800v1
PDF https://arxiv.org/pdf/1912.01800v1.pdf
PWC https://paperswithcode.com/paper/spectral-gans-for-high-resolution-3d-point
Repo
Framework

Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network

Title Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network
Authors Fangting Lin, Chao Yang, Huizhou Li, Bin Jiang
Abstract Recent deep learning based salient object detection methods which utilize both saliency and boundary features have achieved remarkable performance. However, most of them ignore the complementarity between saliency features and boundary features, and thus produce worse predictions in scenes with low contrast between foreground and background. To address this issue, we propose a novel Recurrent Two-Stream Guided Refinement Network (RTGRNet) that consists of iterating Two-Stream Guided Refinement Modules (TGRMs). A TGRM consists of a Guide Block and two feature streams, saliency and boundary. The Guide Block utilizes the refined features from the previous TGRM to further improve the performance of the two feature streams in the current TGRM. Meanwhile, the low-level integrated features are also utilized as a reference to get better details. Finally, we progressively refine these features by recurrently stacking more TGRMs. Extensive experiments on six public datasets show that our proposed RTGRNet achieves state-of-the-art performance in salient object detection.
Tasks Object Detection, Salient Object Detection
Published 2019-12-11
URL https://arxiv.org/abs/1912.05236v1
PDF https://arxiv.org/pdf/1912.05236v1.pdf
PWC https://paperswithcode.com/paper/boundary-aware-salient-object-detection-via
Repo
Framework

Transferring knowledge from monitored to unmonitored areas for forecasting parking spaces

Title Transferring knowledge from monitored to unmonitored areas for forecasting parking spaces
Authors Andrei Ionita, André Pomp, Michael Cochez, Tobias Meisen, Stefan Decker
Abstract Smart cities around the world have begun monitoring parking areas in order to estimate available parking spots and help drivers looking for parking. The current results are promising, indeed. However, existing approaches are limited by the high cost of sensors that need to be installed throughout the city in order to achieve an accurate estimation. This work investigates the extension of estimating parking information from areas equipped with sensors to areas where they are missing. To this end, the similarity between city neighborhoods is determined based on background data, i.e., from geographic information systems. Using the derived similarity values, we analyze the adaptation of occupancy rates from monitored- to unmonitored parking areas.
Tasks
Published 2019-08-07
URL https://arxiv.org/abs/1908.03629v1
PDF https://arxiv.org/pdf/1908.03629v1.pdf
PWC https://paperswithcode.com/paper/transferring-knowledge-from-monitored-to
Repo
Framework

Towards a Predictive Patent Analytics and Evaluation Platform

Title Towards a Predictive Patent Analytics and Evaluation Platform
Authors Nebula Alam, Khoi-Nguyen Tran, Sue Ann Chen, John Wagner, Josh Andres, Mukesh Mohania
Abstract The importance of patents is well recognised across many regions of the world. Many patent mining systems have been proposed, but with limited predictive capabilities. In this demo, we showcase how predictive algorithms leveraging the state-of-the-art machine learning and deep learning techniques can be used to improve understanding of patents for inventors, patent evaluators, and business analysts alike. Our demo video is available at http://ibm.biz/ecml2019-demo-patent-analytics
Tasks
Published 2019-10-31
URL https://arxiv.org/abs/1910.14258v1
PDF https://arxiv.org/pdf/1910.14258v1.pdf
PWC https://paperswithcode.com/paper/towards-a-predictive-patent-analytics-and
Repo
Framework

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Title Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Authors Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran
Abstract We present a multispeaker, multilingual text-to-speech (TTS) synthesis model based on Tacotron that is able to produce high quality speech in multiple languages. Moreover, the model is able to transfer voices across languages, e.g. synthesize fluent Spanish speech using an English speaker’s voice, without training on any bilingual or parallel examples. Such transfer works across distantly related languages, e.g. English and Mandarin. Critical to achieving this result are: 1. using a phonemic input representation to encourage sharing of model capacity across languages, and 2. incorporating an adversarial loss term to encourage the model to disentangle its representation of speaker identity (which is perfectly correlated with language in the training data) from the speech content. Further scaling up the model by training on multiple speakers of each language, and incorporating an autoencoding input to help stabilize attention during training, results in a model which can be used to consistently synthesize intelligible speech for training speakers in all languages seen during training, and in native or foreign accents.
Tasks Speech Synthesis
Published 2019-07-09
URL https://arxiv.org/abs/1907.04448v2
PDF https://arxiv.org/pdf/1907.04448v2.pdf
PWC https://paperswithcode.com/paper/learning-to-speak-fluently-in-a-foreign
Repo
Framework

View-Invariant Probabilistic Embedding for Human Pose

Title View-Invariant Probabilistic Embedding for Human Pose
Authors Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Ting Liu
Abstract Depictions of similar human body configurations can vary with changing viewpoints. Using only 2D information, we would like to enable vision algorithms to recognize similarity in human body poses across multiple views. This ability is useful for analyzing body movements and human behaviors in images and videos. In this paper, we propose an approach for learning a compact view-invariant embedding space from 2D joint keypoints alone, without explicitly predicting 3D poses. Since 2D poses are projected from 3D space, they have an inherent ambiguity, which is difficult to represent through a deterministic mapping. Hence, we use probabilistic embeddings to model this input uncertainty. Experimental results show that our embedding model achieves higher accuracy when retrieving similar poses across different camera views, in comparison with 2D-to-3D pose lifting models. We also demonstrate the effectiveness of applying our embeddings to view-invariant action recognition and video alignment.
Tasks Video Alignment
Published 2019-12-02
URL https://arxiv.org/abs/1912.01001v2
PDF https://arxiv.org/pdf/1912.01001v2.pdf
PWC https://paperswithcode.com/paper/view-invariant-probabilistic-embedding-for
Repo
Framework

Towards Universal Languages for Tractable Ontology Mediated Query Answering

Title Towards Universal Languages for Tractable Ontology Mediated Query Answering
Authors Heng Zhang, Yan Zhang, Jia-Huai You, Zhiyong Feng, Guifei Jiang
Abstract An ontology language for ontology mediated query answering (OMQA-language) is universal for a family of OMQA-languages if it is the most expressive one among this family. In this paper, we focus on three families of tractable OMQA-languages, including first-order rewritable languages and languages whose data complexity of the query answering is in AC0 or PTIME. On the negative side, we prove that there is, in general, no universal language for each of these families of languages. On the positive side, we propose a novel property, the locality, to approximate the first-order rewritability, and show that there exists a language of disjunctive embedded dependencies that is universal for the family of OMQA-languages with locality. All of these results apply to OMQA with query languages such as conjunctive queries, unions of conjunctive queries and acyclic conjunctive queries.
Tasks
Published 2019-11-26
URL https://arxiv.org/abs/1911.11359v2
PDF https://arxiv.org/pdf/1911.11359v2.pdf
PWC https://paperswithcode.com/paper/towards-universal-languages-for-tractable
Repo
Framework

On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees

Title On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees
Authors Shahin Shahrampour, Soheil Kolouri
Abstract Random features provide a practical framework for large-scale kernel approximation and supervised learning. It has been shown that data-dependent sampling of random features using leverage scores can significantly reduce the number of features required to achieve optimal learning bounds. Leverage scores introduce an optimized distribution for features based on an infinite-dimensional integral operator (depending on input distribution), which is impractical to sample from. Focusing on empirical leverage scores in this paper, we establish an out-of-sample performance bound, revealing an interesting trade-off between the approximated kernel and the eigenvalue decay of another kernel in the domain of random features defined based on data distribution. Our experiments verify that the empirical algorithm consistently outperforms vanilla Monte Carlo sampling, and with a minor modification the method is even competitive to supervised data-dependent kernel learning, without using the output (label) information.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08329v1
PDF http://arxiv.org/pdf/1903.08329v1.pdf
PWC https://paperswithcode.com/paper/on-sampling-random-features-from-empirical
Repo
Framework
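
The abstract above compares leverage-score sampling against vanilla Monte Carlo sampling of random features. As a point of reference, here is a minimal sketch of that vanilla baseline only: random Fourier features for the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2). The function name `rff_features` and the parameter `gamma` are assumptions for illustration; the empirical leverage-score resampling studied in the paper is not implemented here.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Random Fourier features for k(x, y) = exp(-gamma * ||x - y||^2).

    Frequencies are drawn i.i.d. from the kernel's spectral density,
    N(0, 2 * gamma * I), i.e. plain Monte Carlo sampling.
    """
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # Z @ Z.T approximates the kernel matrix in expectation.
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: compare the approximation with the exact kernel matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Z = rff_features(X, n_features=5000, gamma=0.5, rng=rng)
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
max_error = np.max(np.abs(Z @ Z.T - K_exact))
```

Data-dependent schemes such as leverage-score sampling aim to reach the same approximation quality with far fewer features than this uniform draw requires.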

Learning Classifiers for Domain Adaptation, Zero and Few-Shot Recognition Based on Learning Latent Semantic Parts

Title Learning Classifiers for Domain Adaptation, Zero and Few-Shot Recognition Based on Learning Latent Semantic Parts
Authors Pengkai Zhu, Hanxiao Wang, Venkatesh Saligrama
Abstract In computer vision applications, such as domain adaptation (DA), few shot learning (FSL) and zero-shot learning (ZSL), we encounter new objects and environments, for which insufficient examples exist to allow for training “models from scratch,” and methods that adapt existing models, trained on the presented training environment, to the new scenario are required. We propose a novel visual attribute encoding method that encodes each image as a low-dimensional probability vector composed of prototypical part-type probabilities. The prototypes are learnt to be representative of all training data. At test-time we utilize this encoding as an input to a classifier. At test-time we freeze the encoder and only learn/adapt the classifier component to limited annotated labels in FSL; new semantic attributes in ZSL. We conduct extensive experiments on benchmark datasets. Our method outperforms state-of-art methods trained for the specific contexts (ZSL, FSL, DA).
Tasks Domain Adaptation, Few-Shot Learning, Zero-Shot Learning
Published 2019-01-25
URL https://arxiv.org/abs/1901.09079v3
PDF https://arxiv.org/pdf/1901.09079v3.pdf
PWC https://paperswithcode.com/paper/learning-for-new-visual-environments-with
Repo
Framework

Exploring Structure-Adaptive Graph Learning for Robust Semi-Supervised Classification

Title Exploring Structure-Adaptive Graph Learning for Robust Semi-Supervised Classification
Authors Xiang Gao, Wei Hu, Zongming Guo
Abstract Graph Convolutional Neural Networks (GCNNs) are generalizations of CNNs to graph-structured data, in which convolution is guided by the graph topology. In many cases where graphs are unavailable, existing methods manually construct graphs or learn task-driven adaptive graphs. In this paper, we propose Graph Learning Neural Networks (GLNNs), which exploit the optimization of graphs (the adjacency matrix in particular) from both data and tasks. Leveraging on spectral graph theory, we propose the objective of graph learning from a sparsity constraint, properties of a valid adjacency matrix as well as a graph Laplacian regularizer via maximum a posteriori estimation. The optimization objective is then integrated into the loss function of the GCNN, which adapts the graph topology to not only labels of a specific task but also the input data. Experimental results show that our proposed GLNN outperforms state-of-the-art approaches over widely adopted social network datasets and citation network datasets for semi-supervised classification.
Tasks Node Classification
Published 2019-04-23
URL https://arxiv.org/abs/1904.10146v2
PDF https://arxiv.org/pdf/1904.10146v2.pdf
PWC https://paperswithcode.com/paper/exploring-graph-learning-for-semi-supervised
Repo
Framework
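
The abstract above mentions a graph Laplacian regularizer as one term of the graph learning objective. A minimal sketch of that single term, assuming the standard combinatorial Laplacian L = D - A and a symmetric adjacency matrix (the abstract does not say which Laplacian variant the authors use, and the function name `laplacian_smoothness` is hypothetical):

```python
import numpy as np

def laplacian_smoothness(A, X):
    """Graph Laplacian regularizer tr(X^T L X) with L = D - A.

    A : (n, n) symmetric, non-negative adjacency matrix.
    X : (n, d) node features.
    For symmetric A this equals 0.5 * sum_ij A_ij * ||x_i - x_j||^2,
    so it is small when strongly connected nodes have similar features.
    """
    L = np.diag(A.sum(axis=1)) - A
    return float(np.trace(X.T @ L @ X))
```

Minimizing this quantity with respect to `A` (under sparsity and validity constraints) favors edges between nodes with similar features, which is how the input data, and not only the task labels, can shape the learned topology.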

The NAI Suite – Drafting and Reasoning over Legal Texts

Title The NAI Suite – Drafting and Reasoning over Legal Texts
Authors Tomer Libal, Alexander Steen
Abstract A prototype for automated reasoning over legal texts, called NAI, is presented. As input, NAI accepts formalized logical representations of such legal texts, which can be created and curated using an integrated annotation interface. The prototype supports automated reasoning over the given text representation and multiple quality assurance procedures. The pragmatics of the NAI suite, as well as its feasibility in practical applications, are studied on a fragment of the Smoking Prohibition (Children in Motor Vehicles) (Scotland) Act 2016 of the Scottish Parliament.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.07004v1
PDF https://arxiv.org/pdf/1910.07004v1.pdf
PWC https://paperswithcode.com/paper/the-nai-suite-drafting-and-reasoning-over
Repo
Framework

Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990

Title Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990
Authors Melvin Wevers
Abstract Contemporary debates on filter bubbles and polarization in public and social media raise the question of to what extent news media of the past exhibited biases. This paper specifically examines bias related to gender in six Dutch national newspapers between 1950 and 1990. We measure bias related to gender by comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds. We demonstrate clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, we see the bias moving toward women, whereas, generally, the bias shifts in the direction of men, despite growing female employment numbers and feminist movements. Even though Dutch society became less stratified ideologically (depillarization), we found an increasing divergence in gender bias between religious and social-democratic newspapers on the one hand and liberal newspapers on the other. Methodologically, this paper illustrates how word embeddings can be used to examine historical language change. Future work will investigate how fine-tuning deep contextualized embedding models, such as ELMo, might be used for similar tasks with greater contextual information.
Tasks Word Embeddings
Published 2019-07-21
URL https://arxiv.org/abs/1907.08922v1
PDF https://arxiv.org/pdf/1907.08922v1.pdf
PWC https://paperswithcode.com/paper/using-word-embeddings-to-examine-gender-bias
Repo
Framework
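
The abstract above measures gender bias through word embedding models but does not give the exact score. One common way to quantify such bias is a centroid-based cosine difference, sketched below under that assumption. The function names `cosine` and `gender_bias` and the toy vectors are hypothetical, not taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias(word_vec, female_vecs, male_vecs):
    """Positive values: the word sits closer to the female word set;
    negative values: closer to the male word set."""
    female_centroid = np.mean(female_vecs, axis=0)
    male_centroid = np.mean(male_vecs, axis=0)
    return cosine(word_vec, female_centroid) - cosine(word_vec, male_centroid)

# Toy 2-D vectors standing in for embeddings trained on one newspaper and decade.
female_vecs = np.array([[1.0, 0.0], [0.9, 0.1]])
male_vecs = np.array([[0.0, 1.0], [0.1, 0.9]])
score = gender_bias(np.array([1.0, 0.2]), female_vecs, male_vecs)
```

Computing the same score for a fixed set of target words in models trained on different newspapers and periods would give the kind of over-time comparison the abstract describes.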

Optical Fringe Patterns Filtering Based on Multi-Stage Convolution Neural Network

Title Optical Fringe Patterns Filtering Based on Multi-Stage Convolution Neural Network
Authors Bowen Lin, Shujun Fu, Caiming Zhang, Fengling Wang, Shiling Xie, Zhigang Zhao, Yuliang Li
Abstract Optical fringe patterns are often contaminated by speckle noise, making it difficult to accurately and robustly extract their phase fields. To deal with this problem, we propose a filtering method based on deep learning, called optical fringe patterns denoising convolutional neural network (FPD-CNN), for directly removing speckle from the input noisy fringe patterns. Regularization technology is integrated into the design of the deep architecture. Specifically, the FPD-CNN method is divided into multiple stages; each stage consists of a set of convolutional layers along with batch normalization and a leaky rectified linear unit (Leaky ReLU) activation function. The end-to-end joint training is carried out using the Euclidean loss. Extensive experiments on simulated and experimental optical fringe patterns, especially finer ones with high-density regions, show that the proposed method is competitive with some state-of-the-art denoising techniques in spatial or transform domains, efficiently preserving the main features of fringes at a fairly fast speed.
Tasks Denoising
Published 2019-01-02
URL http://arxiv.org/abs/1901.00361v2
PDF http://arxiv.org/pdf/1901.00361v2.pdf
PWC https://paperswithcode.com/paper/optical-fringe-patterns-filtering-based-on
Repo
Framework