Paper Group ANR 800
Noisy multi-label semi-supervised dimensionality reduction. a novel cross-lingual voice cloning approach with a few text-free samples. Spectral-GANs for High-Resolution 3D Point-cloud Generation. Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network. Transferring knowledge from monitored to unmonitored areas for …
Noisy multi-label semi-supervised dimensionality reduction
Title | Noisy multi-label semi-supervised dimensionality reduction |
Authors | Karl Øyvind Mikalsen, Cristina Soguero-Ruiz, Filippo Maria Bianchi, Robert Jenssen |
Abstract | Noisy labeled data represent a rich source of information that often are easily accessible and cheap to obtain, but label noise might also have many negative consequences if not accounted for. How to fully utilize noisy labels has been studied extensively within the framework of standard supervised machine learning over a period of several decades. However, very little research has been conducted on solving the challenge posed by noisy labels in non-standard settings. This includes situations where only a fraction of the samples are labeled (semi-supervised) and each high-dimensional sample is associated with multiple labels. In this work, we present a novel semi-supervised and multi-label dimensionality reduction method that effectively utilizes information from both noisy multi-labels and unlabeled data. With the proposed Noisy multi-label semi-supervised dimensionality reduction (NMLSDR) method, the noisy multi-labels are denoised and unlabeled data are labeled simultaneously via a specially designed label propagation algorithm. NMLSDR then learns a projection matrix for reducing the dimensionality by maximizing the dependence between the enlarged and denoised multi-label space and the features in the projected space. Extensive experiments on synthetic data, benchmark datasets, as well as a real-world case study, demonstrate the effectiveness of the proposed algorithm and show that it outperforms state-of-the-art multi-label feature extraction algorithms. |
Tasks | Dimensionality Reduction |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07517v1 |
PDF | http://arxiv.org/pdf/1902.07517v1.pdf |
PWC | https://paperswithcode.com/paper/noisy-multi-label-semi-supervised |
Repo | |
Framework | |
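
The abstract above describes denoising noisy multi-labels and labeling unlabeled samples through label propagation. As a rough illustration only, the sketch below runs a generic iterative propagation, F ← αSF + (1 − α)Y, over a small hand-built similarity graph. It is not the paper's specially designed algorithm; the toy graph, the value of `alpha`, and the function name `propagate_labels` are assumptions made for this example.

```python
def propagate_labels(weights, labels, alpha=0.8, iterations=50):
    """Generic label propagation: F <- alpha * S @ F + (1 - alpha) * Y.

    weights: symmetric n x n similarity matrix (list of lists), zero diagonal.
    labels:  n x c initial label matrix; unlabeled rows are all zeros.
    """
    n = len(weights)
    c = len(labels[0])
    # Row-normalize the similarities so each non-empty row of S sums to 1.
    s = []
    for row in weights:
        total = sum(row)
        s.append([w / total if total > 0 else 0.0 for w in row])
    f = [list(row) for row in labels]
    for _ in range(iterations):
        new_f = []
        for i in range(n):
            new_row = []
            for k in range(c):
                # Label mass received from neighbors, mixed with the initial label.
                spread = sum(s[i][j] * f[j][k] for j in range(n))
                new_row.append(alpha * spread + (1 - alpha) * labels[i][k])
            new_f.append(new_row)
        f = new_f
    return f


# Toy graph (assumed): nodes 0-2 form one cluster, nodes 3-5 another,
# joined by a single weak link between nodes 2 and 3.
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Two labels (columns). Node 0 carries label 0, node 5 carries label 1;
# the remaining nodes start unlabeled.
Y = [[1, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 1]]

scores = propagate_labels(W, Y)
predicted = [row.index(max(row)) for row in scores]
print(predicted)  # [0, 0, 0, 1, 1, 1]
```

Each unlabeled node ends up with the label of the seed in its own cluster, because the weak cross-cluster link carries little label mass.
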
a novel cross-lingual voice cloning approach with a few text-free samples
Title | a novel cross-lingual voice cloning approach with a few text-free samples |
Authors | Xinyong Zhou, Hao Che, Xiaorui Wang, Lei Xie |
Abstract | In this paper, we present a cross-lingual voice cloning approach. BN features obtained by an SI-ASR model are used as a bridge across speaker and language boundaries. The relationships between text and BN features are modeled by the latent prosody model. The acoustic model learns the translation from BN features to acoustic features. The acoustic model is fine-tuned with a few samples of the target speaker to realize voice cloning. This system can generate speech for arbitrary utterances in the target language in a cross-lingual speaker’s voice. We verify that, with a small amount of audio data, our proposed approach handles cross-lingual tasks well. In intra-lingual tasks, our proposed approach also performs better than the baseline approach in naturalness and similarity. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13276v2 |
PDF | https://arxiv.org/pdf/1910.13276v2.pdf |
PWC | https://paperswithcode.com/paper/191013276 |
Repo | |
Framework | |
Spectral-GANs for High-Resolution 3D Point-cloud Generation
Title | Spectral-GANs for High-Resolution 3D Point-cloud Generation |
Authors | Sameera Ramasinghe, Salman Khan, Nick Barnes, Stephen Gould |
Abstract | Point-clouds are a popular choice for vision and graphics tasks due to their accurate shape description and direct acquisition from range-scanners. This demands the ability to synthesize and reconstruct high-quality point-clouds. Current deep generative models for 3D data generally work on simplified representations (e.g., voxelized objects) and cannot deal with the inherent redundancy and irregularity in point-clouds. A few recent efforts on 3D point-cloud generation offer limited resolution and their complexity grows with the increase in output resolution. In this paper, we develop a principled approach to synthesize 3D point-clouds using a spectral-domain Generative Adversarial Network (GAN). Our spectral representation is highly structured and allows us to disentangle various frequency bands such that the learning task is simplified for a GAN model. As compared to spatial-domain generative approaches, our formulation allows us to generate high-resolution point-clouds with an arbitrary number of points at minimal computational overhead. Furthermore, we propose a fully differentiable block to transform from the spectral to the spatial domain and back, thereby allowing us to integrate knowledge from well-established spatial models. We demonstrate that Spectral-GAN performs well for the point-cloud generation task. Additionally, it can learn a highly discriminative representation in an unsupervised fashion and can be used to accurately reconstruct 3D objects. |
Tasks | Point Cloud Generation |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01800v1 |
PDF | https://arxiv.org/pdf/1912.01800v1.pdf |
PWC | https://paperswithcode.com/paper/spectral-gans-for-high-resolution-3d-point |
Repo | |
Framework | |
Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network
Title | Boundary-Aware Salient Object Detection via Recurrent Two-Stream Guided Refinement Network |
Authors | Fangting Lin, Chao Yang, Huizhou Li, Bin Jiang |
Abstract | Recent deep learning based salient object detection methods which utilize both saliency and boundary features have achieved remarkable performance. However, most of them ignore the complementarity between saliency features and boundary features, and thus produce worse predictions in scenes with low contrast between foreground and background. To address this issue, we propose a novel Recurrent Two-Stream Guided Refinement Network (RTGRNet) that consists of iterating Two-Stream Guided Refinement Modules (TGRMs). Each TGRM consists of a Guide Block and two feature streams: saliency and boundary. The Guide Block utilizes the refined features from the previous TGRM to further improve the performance of the two feature streams in the current TGRM. Meanwhile, the low-level integrated features are also utilized as a reference to get better details. Finally, we progressively refine these features by recurrently stacking more TGRMs. Extensive experiments on six public datasets show that our proposed RTGRNet achieves state-of-the-art performance in salient object detection. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05236v1 |
PDF | https://arxiv.org/pdf/1912.05236v1.pdf |
PWC | https://paperswithcode.com/paper/boundary-aware-salient-object-detection-via |
Repo | |
Framework | |
Transferring knowledge from monitored to unmonitored areas for forecasting parking spaces
Title | Transferring knowledge from monitored to unmonitored areas for forecasting parking spaces |
Authors | Andrei Ionita, André Pomp, Michael Cochez, Tobias Meisen, Stefan Decker |
Abstract | Smart cities around the world have begun monitoring parking areas in order to estimate available parking spots and help drivers looking for parking. The current results are promising, indeed. However, existing approaches are limited by the high cost of sensors that need to be installed throughout the city in order to achieve an accurate estimation. This work investigates the extension of estimating parking information from areas equipped with sensors to areas where they are missing. To this end, the similarity between city neighborhoods is determined based on background data, i.e., from geographic information systems. Using the derived similarity values, we analyze the adaptation of occupancy rates from monitored to unmonitored parking areas. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.03629v1 |
PDF | https://arxiv.org/pdf/1908.03629v1.pdf |
PWC | https://paperswithcode.com/paper/transferring-knowledge-from-monitored-to |
Repo | |
Framework | |
Towards a Predictive Patent Analytics and Evaluation Platform
Title | Towards a Predictive Patent Analytics and Evaluation Platform |
Authors | Nebula Alam, Khoi-Nguyen Tran, Sue Ann Chen, John Wagner, Josh Andres, Mukesh Mohania |
Abstract | The importance of patents is well recognised across many regions of the world. Many patent mining systems have been proposed, but with limited predictive capabilities. In this demo, we showcase how predictive algorithms leveraging the state-of-the-art machine learning and deep learning techniques can be used to improve understanding of patents for inventors, patent evaluators, and business analysts alike. Our demo video is available at http://ibm.biz/ecml2019-demo-patent-analytics |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14258v1 |
PDF | https://arxiv.org/pdf/1910.14258v1.pdf |
PWC | https://paperswithcode.com/paper/towards-a-predictive-patent-analytics-and |
Repo | |
Framework | |
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Title | Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning |
Authors | Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran |
Abstract | We present a multispeaker, multilingual text-to-speech (TTS) synthesis model based on Tacotron that is able to produce high quality speech in multiple languages. Moreover, the model is able to transfer voices across languages, e.g. synthesize fluent Spanish speech using an English speaker’s voice, without training on any bilingual or parallel examples. Such transfer works across distantly related languages, e.g. English and Mandarin. Critical to achieving this result are: 1. using a phonemic input representation to encourage sharing of model capacity across languages, and 2. incorporating an adversarial loss term to encourage the model to disentangle its representation of speaker identity (which is perfectly correlated with language in the training data) from the speech content. Further scaling up the model by training on multiple speakers of each language, and incorporating an autoencoding input to help stabilize attention during training, results in a model which can be used to consistently synthesize intelligible speech for training speakers in all languages seen during training, and in native or foreign accents. |
Tasks | Speech Synthesis |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04448v2 |
PDF | https://arxiv.org/pdf/1907.04448v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-speak-fluently-in-a-foreign |
Repo | |
Framework | |
View-Invariant Probabilistic Embedding for Human Pose
Title | View-Invariant Probabilistic Embedding for Human Pose |
Authors | Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Ting Liu |
Abstract | Depictions of similar human body configurations can vary with changing viewpoints. Using only 2D information, we would like to enable vision algorithms to recognize similarity in human body poses across multiple views. This ability is useful for analyzing body movements and human behaviors in images and videos. In this paper, we propose an approach for learning a compact view-invariant embedding space from 2D joint keypoints alone, without explicitly predicting 3D poses. Since 2D poses are projected from 3D space, they have an inherent ambiguity, which is difficult to represent through a deterministic mapping. Hence, we use probabilistic embeddings to model this input uncertainty. Experimental results show that our embedding model achieves higher accuracy when retrieving similar poses across different camera views, in comparison with 2D-to-3D pose lifting models. We also demonstrate the effectiveness of applying our embeddings to view-invariant action recognition and video alignment. |
Tasks | Video Alignment |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.01001v2 |
PDF | https://arxiv.org/pdf/1912.01001v2.pdf |
PWC | https://paperswithcode.com/paper/view-invariant-probabilistic-embedding-for |
Repo | |
Framework | |
Towards Universal Languages for Tractable Ontology Mediated Query Answering
Title | Towards Universal Languages for Tractable Ontology Mediated Query Answering |
Authors | Heng Zhang, Yan Zhang, Jia-Huai You, Zhiyong Feng, Guifei Jiang |
Abstract | An ontology language for ontology mediated query answering (OMQA-language) is universal for a family of OMQA-languages if it is the most expressive one among this family. In this paper, we focus on three families of tractable OMQA-languages, including first-order rewritable languages and languages whose data complexity of the query answering is in AC0 or PTIME. On the negative side, we prove that there is, in general, no universal language for each of these families of languages. On the positive side, we propose a novel property, the locality, to approximate the first-order rewritability, and show that there exists a language of disjunctive embedded dependencies that is universal for the family of OMQA-languages with locality. All of these results apply to OMQA with query languages such as conjunctive queries, unions of conjunctive queries and acyclic conjunctive queries. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11359v2 |
PDF | https://arxiv.org/pdf/1911.11359v2.pdf |
PWC | https://paperswithcode.com/paper/towards-universal-languages-for-tractable |
Repo | |
Framework | |
On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees
Title | On Sampling Random Features From Empirical Leverage Scores: Implementation and Theoretical Guarantees |
Authors | Shahin Shahrampour, Soheil Kolouri |
Abstract | Random features provide a practical framework for large-scale kernel approximation and supervised learning. It has been shown that data-dependent sampling of random features using leverage scores can significantly reduce the number of features required to achieve optimal learning bounds. Leverage scores introduce an optimized distribution for features based on an infinite-dimensional integral operator (depending on input distribution), which is impractical to sample from. Focusing on empirical leverage scores in this paper, we establish an out-of-sample performance bound, revealing an interesting trade-off between the approximated kernel and the eigenvalue decay of another kernel in the domain of random features defined based on data distribution. Our experiments verify that the empirical algorithm consistently outperforms vanilla Monte Carlo sampling, and with a minor modification the method is even competitive with supervised data-dependent kernel learning, without using the output (label) information. |
Tasks | |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08329v1 |
PDF | http://arxiv.org/pdf/1903.08329v1.pdf |
PWC | https://paperswithcode.com/paper/on-sampling-random-features-from-empirical |
Repo | |
Framework | |
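
The abstract above compares leverage-score sampling against vanilla Monte Carlo sampling of random features. The sketch below illustrates only that vanilla baseline: random Fourier features whose inner product approximates a Gaussian kernel. It does not implement empirical leverage scores, and the bandwidth `sigma`, the feature count, and all function names are assumptions chosen for this example.

```python
import math
import random


def sample_features(dim, num_features, sigma, rng):
    """Draw (w, b) pairs for random Fourier features of a Gaussian kernel.

    w ~ N(0, I / sigma^2) and b ~ Uniform[0, 2*pi].
    """
    features = []
    for _ in range(num_features):
        w = [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        b = rng.uniform(0.0, 2.0 * math.pi)
        features.append((w, b))
    return features


def feature_map(x, features):
    """Map x to z(x) so that z(x) . z(y) approximates k(x, y)."""
    scale = math.sqrt(2.0 / len(features))
    return [
        scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b in features
    ]


def gaussian_kernel(x, y, sigma):
    """Exact kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
sigma = 1.0
features = sample_features(dim=2, num_features=2000, sigma=sigma, rng=rng)

x, y = [0.3, -0.2], [0.1, 0.4]
zx, zy = feature_map(x, features), feature_map(y, features)
approx = sum(a * b for a, b in zip(zx, zy))
exact = gaussian_kernel(x, y, sigma)
print(f"exact={exact:.3f} approx={approx:.3f}")
```

With uniform Monte Carlo sampling the approximation error shrinks roughly with the square root of the number of features, which is the cost that data-dependent sampling schemes aim to reduce.
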
Learning Classifiers for Domain Adaptation, Zero and Few-Shot Recognition Based on Learning Latent Semantic Parts
Title | Learning Classifiers for Domain Adaptation, Zero and Few-Shot Recognition Based on Learning Latent Semantic Parts |
Authors | Pengkai Zhu, Hanxiao Wang, Venkatesh Saligrama |
Abstract | In computer vision applications, such as domain adaptation (DA), few-shot learning (FSL) and zero-shot learning (ZSL), we encounter new objects and environments, for which insufficient examples exist to allow for training models “from scratch,” and methods that adapt existing models, trained on the presented training environment, to the new scenario are required. We propose a novel visual attribute encoding method that encodes each image as a low-dimensional probability vector composed of prototypical part-type probabilities. The prototypes are learnt to be representative of all training data. At test time, we utilize this encoding as an input to a classifier: we freeze the encoder and only learn or adapt the classifier component, to limited annotated labels in FSL and to new semantic attributes in ZSL. We conduct extensive experiments on benchmark datasets. Our method outperforms state-of-the-art methods trained for the specific contexts (ZSL, FSL, DA). |
Tasks | Domain Adaptation, Few-Shot Learning, Zero-Shot Learning |
Published | 2019-01-25 |
URL | https://arxiv.org/abs/1901.09079v3 |
PDF | https://arxiv.org/pdf/1901.09079v3.pdf |
PWC | https://paperswithcode.com/paper/learning-for-new-visual-environments-with |
Repo | |
Framework | |
Exploring Structure-Adaptive Graph Learning for Robust Semi-Supervised Classification
Title | Exploring Structure-Adaptive Graph Learning for Robust Semi-Supervised Classification |
Authors | Xiang Gao, Wei Hu, Zongming Guo |
Abstract | Graph Convolutional Neural Networks (GCNNs) are generalizations of CNNs to graph-structured data, in which convolution is guided by the graph topology. In many cases where graphs are unavailable, existing methods manually construct graphs or learn task-driven adaptive graphs. In this paper, we propose Graph Learning Neural Networks (GLNNs), which exploit the optimization of graphs (the adjacency matrix in particular) from both data and tasks. Leveraging spectral graph theory, we propose the objective of graph learning from a sparsity constraint, properties of a valid adjacency matrix as well as a graph Laplacian regularizer via maximum a posteriori estimation. The optimization objective is then integrated into the loss function of the GCNN, which adapts the graph topology to not only labels of a specific task but also the input data. Experimental results show that our proposed GLNN outperforms state-of-the-art approaches over widely adopted social network datasets and citation network datasets for semi-supervised classification. |
Tasks | Node Classification |
Published | 2019-04-23 |
URL | https://arxiv.org/abs/1904.10146v2 |
PDF | https://arxiv.org/pdf/1904.10146v2.pdf |
PWC | https://paperswithcode.com/paper/exploring-graph-learning-for-semi-supervised |
Repo | |
Framework | |
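
The abstract above lists a graph Laplacian regularizer among the terms of the graph learning objective. As a minimal sketch of that single ingredient (not the full GLNN objective), the code below computes the smoothness term tr(XᵀLX) with L = D − A and checks it against the equivalent pairwise form ½ Σᵢⱼ aᵢⱼ‖xᵢ − xⱼ‖². The toy adjacency matrix, the features, and the function names are assumptions for illustration.

```python
def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def laplacian_regularizer(adj, x):
    """Smoothness term tr(X^T L X) for node features x (n rows, d columns)."""
    lap = laplacian(adj)
    n, d = len(x), len(x[0])
    return sum(
        x[i][k] * lap[i][j] * x[j][k]
        for i in range(n) for j in range(n) for k in range(d)
    )


def pairwise_form(adj, x):
    """Equivalent form: 0.5 * sum_ij a_ij * ||x_i - x_j||^2."""
    n = len(x)
    return 0.5 * sum(
        adj[i][j] * sum((a - b) ** 2 for a, b in zip(x[i], x[j]))
        for i in range(n) for j in range(n)
    )


# Toy path graph 0 - 1 - 2 with one-dimensional node features (assumed values).
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
X = [[1.0], [2.0], [4.0]]

reg = laplacian_regularizer(A, X)
pair = pairwise_form(A, X)
print(reg)   # 5.0
print(pair)  # 5.0
```

The term is small when strongly connected nodes have similar features, which is why it acts as a smoothness prior when the adjacency matrix itself is being learned.
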
The NAI Suite – Drafting and Reasoning over Legal Texts
Title | The NAI Suite – Drafting and Reasoning over Legal Texts |
Authors | Tomer Libal, Alexander Steen |
Abstract | A prototype for automated reasoning over legal texts, called NAI, is presented. As an input, NAI accepts formalized logical representations of such legal texts that can be created and curated using an integrated annotation interface. The prototype supports automated reasoning over the given text representation and multiple quality assurance procedures. The pragmatics of the NAI suite as well as its feasibility in practical applications are studied on a fragment of the Smoking Prohibition (Children in Motor Vehicles) (Scotland) Act 2016 of the Scottish Parliament. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.07004v1 |
PDF | https://arxiv.org/pdf/1910.07004v1.pdf |
PWC | https://paperswithcode.com/paper/the-nai-suite-drafting-and-reasoning-over |
Repo | |
Framework | |
Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990
Title | Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990 |
Authors | Melvin Wevers |
Abstract | Contemporary debates on filter bubbles and polarization in public and social media raise the question to what extent news media of the past exhibited biases. This paper specifically examines bias related to gender in six Dutch national newspapers between 1950 and 1990. We measure bias related to gender by comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds. We demonstrate clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, we see the bias moving toward women, whereas, generally, the bias shifts in the direction of men, despite growing female employment numbers and feminist movements. Even though Dutch society became less stratified ideologically (depillarization), we found an increasing divergence in gender bias between religious and social-democratic newspapers on the one hand and liberal newspapers on the other. Methodologically, this paper illustrates how word embeddings can be used to examine historical language change. Future work will investigate how fine-tuning deep contextualized embedding models, such as ELMo, might be used for similar tasks with greater contextual information. |
Tasks | Word Embeddings |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08922v1 |
PDF | https://arxiv.org/pdf/1907.08922v1.pdf |
PWC | https://paperswithcode.com/paper/using-word-embeddings-to-examine-gender-bias |
Repo | |
Framework | |
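
The abstract above measures gender bias by comparing word embedding models trained on different newspapers and periods. One common way to turn embeddings into a bias score, shown here as a hedged sketch and not necessarily the paper's exact measure, is the difference between a word's cosine similarity to a male anchor and to a female anchor. The three-dimensional vectors below are made up purely to exercise the code, the word `target` is a placeholder, and the sketch ignores the alignment step needed to compare separately trained models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_bias(word_vec, male_vec, female_vec):
    """Positive: closer to the male anchor; negative: closer to the female anchor."""
    return cosine(word_vec, male_vec) - cosine(word_vec, female_vec)


# Made-up vectors for two hypothetical time slices (not real data).
# Real models would be trained on the newspaper text of each period.
slices = {
    "slice_a": {"man": [1.0, 0.0, 0.2], "woman": [0.0, 1.0, 0.2], "target": [0.8, 0.3, 0.5]},
    "slice_b": {"man": [1.0, 0.0, 0.2], "woman": [0.0, 1.0, 0.2], "target": [0.3, 0.8, 0.5]},
}

bias_by_slice = {
    name: gender_bias(emb["target"], emb["man"], emb["woman"])
    for name, emb in slices.items()
}
for name, score in bias_by_slice.items():
    print(name, round(score, 3))
```

Tracking how the sign and size of such a score change between slices is one way to express a shift in bias over time.
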
Optical Fringe Patterns Filtering Based on Multi-Stage Convolution Neural Network
Title | Optical Fringe Patterns Filtering Based on Multi-Stage Convolution Neural Network |
Authors | Bowen Lin, Shujun Fu, Caiming Zhang, Fengling Wang, Shiling Xie, Zhigang Zhao, Yuliang Li |
Abstract | Optical fringe patterns are often contaminated by speckle noise, making it difficult to accurately and robustly extract their phase fields. To deal with this problem, we propose a filtering method based on deep learning, called optical fringe patterns denoising convolutional neural network (FPD-CNN), for directly removing speckle from the input noisy fringe patterns. Regularization technology is integrated into the design of deep architecture. Specifically, the FPD-CNN method is divided into multiple stages, each of which consists of a set of convolutional layers along with batch normalization and leaky rectified linear unit (Leaky ReLU) activation function. The end-to-end joint training is carried out using the Euclidean loss. Extensive experiments on simulated and experimental optical fringe patterns, especially finer ones with high-density regions, show that the proposed method is competitive with some state-of-the-art denoising techniques in spatial or transform domains, efficiently preserving main features of fringe at a fairly fast speed. |
Tasks | Denoising |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00361v2 |
PDF | http://arxiv.org/pdf/1901.00361v2.pdf |
PWC | https://paperswithcode.com/paper/optical-fringe-patterns-filtering-based-on |
Repo | |
Framework | |
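
The abstract above names two standard building blocks: the Leaky ReLU activation and training with the Euclidean loss. The sketch below writes both out in plain Python for reference. The negative slope of 0.01 and the factor of one half in the loss are common conventions assumed here, since the abstract does not state the values used in FPD-CNN, and the tiny 2×2 "images" are placeholders.

```python
def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for positive inputs, a small slope otherwise."""
    return x if x > 0 else negative_slope * x


def euclidean_loss(predicted, target):
    """Half the sum of squared differences between two equally sized images."""
    return 0.5 * sum(
        (p - t) ** 2
        for row_p, row_t in zip(predicted, target)
        for p, t in zip(row_p, row_t)
    )


activations = [leaky_relu(v) for v in [-2.0, 0.0, 3.0]]
print(activations)  # [-0.02, 0.0, 3.0]

# Placeholder 2x2 images: a clean target and a denoised estimate.
clean = [[0.0, 1.0], [1.0, 0.0]]
denoised = [[0.0, 1.0], [0.5, 0.0]]
loss = euclidean_loss(denoised, clean)
print(loss)  # 0.125
```

Unlike a plain ReLU, the leaky variant keeps a small gradient for negative inputs, so units do not go permanently inactive during training.
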