Paper Group ANR 779
Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification
Title | Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification |
Authors | Gautam Bhattacharya, Joao Monteiro, Jahangir Alam, Patrick Kenny |
Abstract | This article presents a novel approach for learning domain-invariant speaker embeddings using Generative Adversarial Networks. The main idea is to confuse a domain discriminator so that it can’t tell if embeddings are from the source or target domains. We train several GAN variants using our proposed framework and apply them to the speaker verification task. On the challenging NIST-SRE 2016 dataset, we are able to match the performance of a strong baseline x-vector system. In contrast to the baseline systems, which are dependent on dimensionality reduction (LDA) and an external classifier (PLDA), our proposed speaker embeddings can be scored using simple cosine distance. This is achieved by optimizing our models end-to-end, using an angular margin loss function. Furthermore, we are able to significantly boost verification performance by averaging our different GAN models at the score level, achieving a relative improvement of 7.2% over the baseline. |
Tasks | Dimensionality Reduction, Speaker Verification |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03063v1 |
http://arxiv.org/pdf/1811.03063v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-speaker-embedding |
Repo | |
Framework | |
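The scoring recipe in the abstract above — plain cosine distance between speaker embeddings plus averaging several GAN variants at the score level — can be sketched as follows. This is a minimal illustration using randomly generated stand-in embeddings, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): cosine scoring of speaker
# embeddings and score-level averaging over several embedding models.
import numpy as np

def cosine_score(enroll: np.ndarray, test: np.ndarray) -> float:
    """Cosine similarity between an enrollment and a test embedding."""
    return float(np.dot(enroll, test) /
                 (np.linalg.norm(enroll) * np.linalg.norm(test) + 1e-12))

def fused_score(enroll_embs, test_embs) -> float:
    """Average the cosine scores produced by several embedding extractors.

    enroll_embs / test_embs: lists with one embedding per model (e.g. one
    per GAN variant), each a 1-D numpy array.
    """
    scores = [cosine_score(e, t) for e, t in zip(enroll_embs, test_embs)]
    return float(np.mean(scores))

# Toy usage with random 512-dimensional embeddings from three hypothetical models.
rng = np.random.default_rng(0)
enroll = [rng.standard_normal(512) for _ in range(3)]
test = [rng.standard_normal(512) for _ in range(3)]
print(fused_score(enroll, test))
```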
Is it worth it? Budget-related evaluation metrics for model selection
Title | Is it worth it? Budget-related evaluation metrics for model selection |
Authors | Filip Klubička, Giancarlo D. Salton, John D. Kelleher |
Abstract | Creating a linguistic resource is often done by using a machine learning model that filters the content that goes through to a human annotator, before going into the final resource. However, budgets are often limited, and the amount of available data exceeds the amount of affordable annotation. In order to optimize the benefit from the invested human work, we argue that deciding on which model one should employ depends not only on generalized evaluation metrics such as F-score, but also on the gain metric. Because the model with the highest F-score may not necessarily have the best sequencing of predicted classes, this may lead to wasting funds on annotating false positives, yielding zero improvement of the linguistic resource. We exemplify our point with a case study, using real data from a task of building a verb-noun idiom dictionary. We show that, given the choice of three systems with varying F-scores, the system with the highest F-score does not yield the highest profits. In other words, in our case the cost-benefit trade-off is more favorable for a system with a lower F-score. |
Tasks | Model Selection |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06998v1 |
http://arxiv.org/pdf/1807.06998v1.pdf | |
PWC | https://paperswithcode.com/paper/is-it-worth-it-budget-related-evaluation |
Repo | |
Framework | |
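A minimal sketch of the budget argument above, under assumptions of our own: annotators verify candidates in the order the model ranks them, each verification has a fixed cost, and only true positives improve the resource. The paper's exact gain metric is not reproduced here.

```python
# Count how many true positives get annotated before the budget runs out.
def gain_within_budget(ranked_labels, cost_per_item, budget):
    """ranked_labels: gold labels (1 = true positive, 0 = false positive),
    ordered by the model's confidence, highest first."""
    affordable = int(budget // cost_per_item)
    return sum(ranked_labels[:affordable])

# Two hypothetical systems: B ranks true positives earlier than A, so it
# yields more gain for the same budget even if its overall F-score were lower.
system_a = [1, 0, 0, 0, 1, 1, 0, 1]
system_b = [1, 1, 1, 0, 0, 1, 0, 0]
print(gain_within_budget(system_a, cost_per_item=10, budget=40))  # 1
print(gain_within_budget(system_b, cost_per_item=10, budget=40))  # 3
```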
Categorization of Semantic Roles for Dictionary Definitions
Title | Categorization of Semantic Roles for Dictionary Definitions |
Authors | Vivian S. Silva, Siegfried Handschuh, André Freitas |
Abstract | Understanding the semantic relationships between terms is a fundamental task in natural language processing applications. While structured resources that can express those relationships in a formal way, such as ontologies, are still scarce, a large number of linguistic resources gathering dictionary definitions is becoming available; however, understanding the semantic structure of natural language definitions is fundamental to making them useful in semantic interpretation tasks. Based on an analysis of a subset of WordNet’s glosses, we propose a set of semantic roles that compose the semantic structure of a dictionary definition, and show how they are related to the definition’s syntactic configuration, identifying patterns that can be used in the development of information extraction frameworks and semantic models. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07711v1 |
http://arxiv.org/pdf/1806.07711v1.pdf | |
PWC | https://paperswithcode.com/paper/categorization-of-semantic-roles-for |
Repo | |
Framework | |
The Random Forest Classifier in WEKA: Discussion and New Developments for Imbalanced Data
Title | The Random Forest Classifier in WEKA: Discussion and New Developments for Imbalanced Data |
Authors | Mario Amrehn, Firas Mualla, Elli Angelopoulou, Stefan Steidl, Andreas Maier |
Abstract | Data analysis and machine learning have become an integral part of the modern scientific methodology, providing automated techniques to predict further information based on observations. One of these classification and regression techniques is the random forest approach. These decision-tree-based predictors are best known for their good computational performance and scalability. However, in the case of severely imbalanced training data, as often seen in medical studies’ data with large control groups, the training algorithm or the sampling process has to be altered in order to improve the prediction quality for minority classes. In this work, a balanced random forest approach for WEKA is proposed. Furthermore, the prediction quality of the unmodified random forest implementation and the new balanced random forest version for WEKA are evaluated against reference implementations in R. Two-class problems on balanced data sets and imbalanced medical studies’ data are investigated. A superior prediction quality using the proposed method for imbalanced data is shown compared to the other three techniques. |
Tasks | |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08102v2 |
http://arxiv.org/pdf/1812.08102v2.pdf | |
PWC | https://paperswithcode.com/paper/the-random-forest-classifier-in-weka |
Repo | |
Framework | |
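The balanced-bootstrap idea behind the entry above can be sketched in a few lines. The paper's implementation targets WEKA (Java); the `BalancedForest` class below is only a Python/scikit-learn illustration of the concept, not that code.

```python
# Balanced random forest sketch: each tree is fit on a bootstrap sample in
# which every class is drawn down to the minority-class size.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class BalancedForest:
    def __init__(self, n_trees=100, random_state=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(random_state)
        self.trees = []

    def fit(self, X, y):
        # Assumes non-negative integer class labels.
        X, y = np.asarray(X), np.asarray(y)
        classes, counts = np.unique(y, return_counts=True)
        n_min = counts.min()
        self.trees = []
        for _ in range(self.n_trees):
            # Balanced bootstrap: draw n_min samples (with replacement) per class.
            idx = np.concatenate([
                self.rng.choice(np.flatnonzero(y == c), size=n_min, replace=True)
                for c in classes
            ])
            tree = DecisionTreeClassifier(
                max_features="sqrt",
                random_state=int(self.rng.integers(1 << 31)))
            self.trees.append(tree.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Majority vote over the ensemble.
        votes = np.stack([t.predict(X) for t in self.trees])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
```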
Simultaneous Optical Flow and Segmentation (SOFAS) using Dynamic Vision Sensor
Title | Simultaneous Optical Flow and Segmentation (SOFAS) using Dynamic Vision Sensor |
Authors | Timo Stoffregen, Lindsay Kleeman |
Abstract | We present an algorithm (SOFAS) to estimate the optical flow of events generated by a dynamic vision sensor (DVS). Where traditional cameras produce frames at a fixed rate, DVSs produce asynchronous events in response to intensity changes with a high temporal resolution. Our algorithm uses the fact that events are generated by edges in the scene to not only estimate the optical flow but also to simultaneously segment the image into objects which are travelling at the same velocity. This way, it is able to avoid the aperture problem which affects other implementations such as Lucas-Kanade. Finally, we show that SOFAS produces more accurate results than traditional optical flow algorithms. |
Tasks | Optical Flow Estimation |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12326v1 |
http://arxiv.org/pdf/1805.12326v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-optical-flow-and-segmentation |
Repo | |
Framework | |
Knowledge Amalgam: Generating Jokes and Quotes Together
Title | Knowledge Amalgam: Generating Jokes and Quotes Together |
Authors | Bhargav Chippada, Shubajit Saha |
Abstract | Generating humor and quotes are very challenging problems in the field of computational linguistics and are often tackled separately. In this paper, we present a controlled Long Short-Term Memory (LSTM) architecture which is trained with categorical data like jokes and quotes together, by passing the category as an input along with the sequence of words. The idea is that a single neural net will learn the structure of both jokes and quotes to generate them on demand according to the input category. Importantly, we believe the neural net has more knowledge as it’s trained on different datasets, which will enable it to generate more creative jokes or quotes from the mixture of information. May the network generate a funny inspirational joke! |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04387v2 |
http://arxiv.org/pdf/1806.04387v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-amalgam-generating-jokes-and-quotes |
Repo | |
Framework | |
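A hedged sketch of the category-conditioned ("controlled") LSTM described in the entry above: the category id is embedded and concatenated with every token embedding, so a single network can generate either jokes or quotes on demand. Layer sizes and names are illustrative, not the authors'.

```python
import torch
import torch.nn as nn

class ControlledLSTM(nn.Module):
    def __init__(self, vocab_size, n_categories, emb_dim=128, cat_dim=16, hidden=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.cat_emb = nn.Embedding(n_categories, cat_dim)
        self.lstm = nn.LSTM(emb_dim + cat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, category):
        # tokens: (batch, seq_len) token ids; category: (batch,) category ids.
        tok = self.tok_emb(tokens)                        # (B, T, emb_dim)
        cat = self.cat_emb(category).unsqueeze(1)         # (B, 1, cat_dim)
        cat = cat.expand(-1, tokens.size(1), -1)          # repeat for every step
        h, _ = self.lstm(torch.cat([tok, cat], dim=-1))   # (B, T, hidden)
        return self.out(h)                                # next-token logits

# Toy forward pass: two sequences, hypothetical categories 0 = joke, 1 = quote.
model = ControlledLSTM(vocab_size=1000, n_categories=2)
logits = model(torch.randint(0, 1000, (2, 12)), torch.tensor([0, 1]))
print(logits.shape)  # torch.Size([2, 12, 1000])
```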
Multi-agent Deep Reinforcement Learning for Zero Energy Communities
Title | Multi-agent Deep Reinforcement Learning for Zero Energy Communities |
Authors | Amit Prasad, Ivana Dusparic |
Abstract | Advances in renewable energy generation and the introduction of government targets to improve energy efficiency gave rise to the concept of a Zero Energy Building (ZEB). A ZEB is a building whose net energy usage over a year is zero, i.e., its energy use is not larger than its overall renewables generation. A collection of ZEBs forms a Zero Energy Community (ZEC). This paper addresses the problem of energy sharing in such a community. This is different from previously addressed energy sharing between buildings, as our focus is on the improvement of community energy status, while traditionally research has focused on reducing losses due to transmission and storage, or on achieving economic gains. We model this problem in a multi-agent environment and propose a Deep Reinforcement Learning (DRL) based solution. Each building is represented by an intelligent agent that learns over time the appropriate behaviour to share energy. We have evaluated the proposed solution in a multi-agent simulation built using osBrain. Results indicate that, over time, agents learn to collaborate and learn a policy comparable to the optimal policy, which in turn improves the ZEC’s energy status. Buildings with no renewables preferred to request energy from their neighbours rather than from the supply grid. |
Tasks | |
Published | 2018-10-08 |
URL | https://arxiv.org/abs/1810.03679v2 |
https://arxiv.org/pdf/1810.03679v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-deep-reinforcement-learning-for |
Repo | |
Framework | |
Contrastive Learning of Emoji-based Representations for Resource-Poor Languages
Title | Contrastive Learning of Emoji-based Representations for Resource-Poor Languages |
Authors | Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava |
Abstract | The introduction of emojis (or emoticons) on social media platforms has given users an increased potential for expression. We propose a novel method called Classification of Emojis using Siamese Network Architecture (CESNA) to learn emoji-based representations of resource-poor languages by jointly training them with resource-rich languages using a siamese network. The CESNA model consists of twin Bi-directional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM RNN) with shared parameters, joined by a contrastive loss function based on a similarity metric. The model learns the representations of resource-poor and resource-rich languages in a common emoji space by using a similarity metric based on the emojis present in sentences from both languages. The model, hence, projects sentences with similar emojis closer to each other and sentences with different emojis farther from one another. Experiments on large-scale Twitter datasets of resource-rich languages (English and Spanish) and resource-poor languages (Hindi and Telugu) reveal that CESNA outperforms the state-of-the-art emoji prediction approaches based on distributional semantics, semantic rules, lexicon lists and deep neural network representations without shared parameters. |
Tasks | |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.01855v1 |
http://arxiv.org/pdf/1804.01855v1.pdf | |
PWC | https://paperswithcode.com/paper/contrastive-learning-of-emoji-based |
Repo | |
Framework | |
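The contrastive objective referred to in the entry above can be sketched as below. The siamese Bi-LSTM encoders and the emoji-based similarity labels are not reproduced; this only shows the standard contrastive loss over twin-encoder outputs, with y = 1 for pairs treated as similar.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, y, margin=1.0):
    """emb_a, emb_b: (batch, dim) outputs of the shared encoder.
    y: (batch,) with 1 for similar (e.g. shared-emoji) pairs and 0 otherwise."""
    d = F.pairwise_distance(emb_a, emb_b)         # Euclidean distance per pair
    pos = y * d.pow(2)                            # pull similar pairs together
    neg = (1 - y) * F.relu(margin - d).pow(2)     # push dissimilar pairs apart
    return (pos + neg).mean()

# Toy usage with random embeddings standing in for Bi-LSTM sentence encodings.
a, b = torch.randn(4, 64), torch.randn(4, 64)
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
print(contrastive_loss(a, b, labels))
```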
Improving Transferability of Deep Neural Networks
Title | Improving Transferability of Deep Neural Networks |
Authors | Parijat Dube, Bishwaranjan Bhattacharjee, Elisabeth Petit-Bois, Matthew Hill |
Abstract | Learning from small amounts of labeled data is a challenge in the area of deep learning. This is currently addressed by Transfer Learning, where one learns the small data set as a transfer task from a larger source dataset. Transfer Learning can deliver higher accuracy if the hyperparameters and source dataset are chosen well. One of the important parameters is the learning rate for the layers of the neural network. We show through experiments on the ImageNet22k and Oxford Flowers datasets that improvements in accuracy in the range of 127% can be obtained by a proper choice of learning rates. We also show that the images/label parameter for a dataset can potentially be used to determine optimal learning rates for the layers to get the best overall accuracy. We additionally validate this method on a sample of real-world image classification tasks from a public visual recognition API. |
Tasks | Image Classification, Transfer Learning |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11459v1 |
http://arxiv.org/pdf/1807.11459v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-transferability-of-deep-neural |
Repo | |
Framework | |
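The general mechanism discussed above — different learning rates for different layers when fine-tuning — can be sketched with PyTorch parameter groups. The specific rates and the images/label heuristic from the paper are not shown; 1e-4 and 1e-2 below are placeholders.

```python
import torch
import torchvision

# weights=None keeps the example offline; in practice the backbone would be
# loaded with pretrained weights before fine-tuning.
model = torchvision.models.resnet18(weights=None)
num_classes = 102                                      # e.g. Oxford Flowers
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.SGD(
    [
        {"params": [p for n, p in model.named_parameters() if not n.startswith("fc")],
         "lr": 1e-4},                                   # pretrained backbone: small steps
        {"params": model.fc.parameters(), "lr": 1e-2},  # new head: larger steps
    ],
    momentum=0.9,
)
```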
Not just a matter of semantics: the relationship between visual similarity and semantic similarity
Title | Not just a matter of semantics: the relationship between visual similarity and semantic similarity |
Authors | Clemens-Alexander Brust, Joachim Denzler |
Abstract | Knowledge transfer, zero-shot learning and semantic image retrieval are methods that aim at improving accuracy by utilizing semantic information, e.g. from WordNet. It is assumed that this information can augment or replace missing visual data in the form of labeled training images because semantic similarity correlates with visual similarity. This assumption may seem trivial, but is crucial for the application of such semantic methods. Any violation can cause mispredictions. Thus, it is important to examine the visual-semantic relationship for a certain target problem. In this paper, we use five different semantic and visual similarity measures each to thoroughly analyze the relationship without relying too much on any single definition. We postulate and verify three highly consequential hypotheses on the relationship. Our results show that it indeed exists and that WordNet semantic similarity carries more information about visual similarity than just the knowledge of “different classes look different”. They suggest that classification is not the ideal application for semantic methods and that wrong semantic information is much worse than none. |
Tasks | Image Retrieval, Semantic Similarity, Semantic Textual Similarity, Transfer Learning, Zero-Shot Learning |
Published | 2018-11-17 |
URL | https://arxiv.org/abs/1811.07120v2 |
https://arxiv.org/pdf/1811.07120v2.pdf | |
PWC | https://paperswithcode.com/paper/not-just-a-matter-of-semantics-the |
Repo | |
Framework | |
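The kind of comparison performed in the paper above can be sketched with one semantic measure (WordNet path similarity via NLTK) and one visual measure (cosine similarity of image features, here random placeholders), correlated with Spearman's rho. The paper itself uses five measures of each kind.

```python
import numpy as np
from nltk.corpus import wordnet as wn      # requires: nltk.download("wordnet")
from scipy.stats import spearmanr

classes = ["dog.n.01", "cat.n.01", "car.n.01", "bicycle.n.01"]
synsets = [wn.synset(c) for c in classes]

# Placeholder "visual" class prototypes; in practice these would be averaged
# CNN features per class.
rng = np.random.default_rng(0)
feats = {c: rng.standard_normal(256) for c in classes}

sem, vis = [], []
for i in range(len(classes)):
    for j in range(i + 1, len(classes)):
        sem.append(synsets[i].path_similarity(synsets[j]))
        a, b = feats[classes[i]], feats[classes[j]]
        vis.append(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rho, p = spearmanr(sem, vis)               # rank correlation between the two views
print(rho, p)
```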
Video Inpainting by Jointly Learning Temporal Structure and Spatial Details
Title | Video Inpainting by Jointly Learning Temporal Structure and Spatial Details |
Authors | Chuan Wang, Haibin Huang, Xiaoguang Han, Jue Wang |
Abstract | We present a new data-driven video inpainting method for recovering missing regions of video frames. A novel deep learning architecture is proposed which contains two sub-networks: a temporal structure inference network and a spatial detail recovering network. The temporal structure inference network is built upon a 3D fully convolutional architecture: it only learns to complete a low-resolution video volume, given the expensive computational cost of 3D convolution. The low-resolution result provides temporal guidance to the spatial detail recovering network, which performs image-based inpainting with a 2D fully convolutional network to produce recovered video frames in their original resolution. Such a two-step network design ensures both the spatial quality of each frame and the temporal coherence across frames. Our method jointly trains both sub-networks in an end-to-end manner. We provide qualitative and quantitative evaluation on three datasets, demonstrating that our method outperforms previous learning-based video inpainting methods. |
Tasks | Video Inpainting |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08482v2 |
http://arxiv.org/pdf/1806.08482v2.pdf | |
PWC | https://paperswithcode.com/paper/video-inpainting-by-jointly-learning-temporal |
Repo | |
Framework | |
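A skeleton sketch of the two-stage design described above: a small 3D convolutional network completes a low-resolution volume, and its upsampled output guides a 2D network that inpaints each frame at full resolution. Channel counts and depths are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalStructureNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 3, kernel_size=3, padding=1))

    def forward(self, low_res_video):          # (B, 3, T, h, w)
        return self.net(low_res_video)

class SpatialDetailNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1))

    def forward(self, frame, guidance):        # both (B, 3, H, W)
        return self.net(torch.cat([frame, guidance], dim=1))

# Toy forward pass: complete a low-res clip, upsample frame 0's guidance,
# then recover that frame at full resolution.
video_lr = torch.randn(1, 3, 8, 32, 32)        # masked low-resolution clip
frame_hr = torch.randn(1, 3, 128, 128)         # masked full-resolution frame
guidance = TemporalStructureNet()(video_lr)[:, :, 0]            # (1, 3, 32, 32)
guidance = F.interpolate(guidance, size=(128, 128), mode="bilinear",
                         align_corners=False)
print(SpatialDetailNet()(frame_hr, guidance).shape)  # torch.Size([1, 3, 128, 128])
```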
Cross-Modulation Networks for Few-Shot Learning
Title | Cross-Modulation Networks for Few-Shot Learning |
Authors | Hugo Prol, Vincent Dumoulin, Luis Herranz |
Abstract | A family of recent successful approaches to few-shot learning relies on learning an embedding space in which predictions are made by computing similarities between examples. This corresponds to combining information between support and query examples at a very late stage of the prediction pipeline. Inspired by this observation, we hypothesize that there may be benefits to combining the information at various levels of abstraction along the pipeline. We present an architecture called Cross-Modulation Networks which allows support and query examples to interact throughout the feature extraction process via a feature-wise modulation mechanism. We adapt the Matching Networks architecture to take advantage of these interactions and show encouraging initial results on miniImageNet in the 5-way, 1-shot setting, where we close the gap with state-of-the-art. |
Tasks | Few-Shot Learning |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00273v1 |
http://arxiv.org/pdf/1812.00273v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-modulation-networks-for-few-shot |
Repo | |
Framework | |
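A hedged sketch of the feature-wise cross-modulation idea from the entry above, in the spirit of FiLM-style conditioning: one branch's pooled features produce a per-channel scale and shift applied to the other branch's feature maps. The actual Cross-Modulation Networks architecture may differ in detail.

```python
import torch
import torch.nn as nn

class CrossModulation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Predict gamma and beta for the other branch from pooled features.
        self.to_film = nn.Linear(channels, 2 * channels)

    def forward(self, target_feats, source_feats):
        # target_feats, source_feats: (B, C, H, W) feature maps of the two branches.
        pooled = source_feats.mean(dim=(2, 3))              # (B, C)
        gamma, beta = self.to_film(pooled).chunk(2, dim=1)  # (B, C) each
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return (1 + gamma) * target_feats + beta            # modulated target features

# Toy usage: modulate query features with information from support features.
mod = CrossModulation(channels=64)
support = torch.randn(5, 64, 10, 10)
query = torch.randn(5, 64, 10, 10)
print(mod(query, support).shape)  # torch.Size([5, 64, 10, 10])
```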
Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systems
Title | Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systems |
Authors | Indrasis Chakraborty, Rudrasis Chakraborty, Draguna Vrabie |
Abstract | The fault detection problem for closed-loop uncertain dynamical systems is investigated in this paper using different deep-learning-based methods. Traditional classifier-based methods do not perform well because of the inherent difficulty of detecting system-level faults in closed-loop dynamical systems. Specifically, the acting controller in any closed-loop dynamical system works to reduce the effect of system-level faults. A novel Generative Adversarial Network based deep Autoencoder is designed to classify datasets under normal and faulty operating conditions. The proposed network performs significantly better than available classifier-based methods and, moreover, does not require labeled fault-incorporated datasets for training. Finally, the network’s performance is tested on a high-complexity building energy system dataset. |
Tasks | Fault Detection |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1804.05320v2 |
http://arxiv.org/pdf/1804.05320v2.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-based |
Repo | |
Framework | |
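The detection step implied by the entry above can be sketched as follows: an autoencoder trained on normal-operation data flags a sample as faulty when its reconstruction error exceeds a threshold. The adversarial (GAN-based) training of the autoencoder described in the abstract is not reproduced here.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                 nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                 nn.Linear(32, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

@torch.no_grad()
def is_faulty(model, x, threshold):
    """Flag samples whose per-sample reconstruction MSE exceeds the threshold."""
    err = ((model(x) - x) ** 2).mean(dim=1)
    return err > threshold

# Toy usage on random data standing in for sensor measurements.
model = Autoencoder(n_features=20)
batch = torch.randn(16, 20)
print(is_faulty(model, batch, threshold=1.0))
```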
Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction
Title | Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction |
Authors | Gagan Choudhury, David Lynch, Gaurav Thakur, Simon Tse |
Abstract | We describe two applications of machine learning in the context of IP/Optical networks. The first one allows agile management of resources at a core IP/Optical network by using machine learning for short-term and long-term prediction of traffic flows and joint global optimization of IP and optical layers using colorless/directionless (CD) flexible ROADMs. Multilayer coordination allows for significant cost savings, flexible new services to meet dynamic capacity needs, and improved robustness by being able to proactively adapt to new traffic patterns and network conditions. The second application is important as we migrate our metro networks to Open ROADM networks, to allow physical routing without the need for detailed knowledge of optical parameters. We discuss a proof-of-concept study, where detailed performance data for wavelengths on a current flexible ROADM network is used for machine learning to predict the optical performance of each wavelength. Both applications can be efficiently implemented by using an SDN (Software Defined Network) controller. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07433v2 |
http://arxiv.org/pdf/1804.07433v2.pdf | |
PWC | https://paperswithcode.com/paper/two-use-cases-of-machine-learning-for-sdn |
Repo | |
Framework | |
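The short-term traffic-prediction use case above can be illustrated only generically: the abstract does not specify the model, so the lagged linear regression below is merely a stand-in for forecasting one traffic-matrix entry from its recent history.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def lagged_dataset(series, n_lags=12):
    # Build (history window -> next value) pairs from a single series.
    X = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic traffic series standing in for 5-minute byte counts of one src-dst pair.
rng = np.random.default_rng(0)
t = np.arange(2000)
series = 100 + 20 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 2, size=t.size)

X, y = lagged_dataset(series)
model = LinearRegression().fit(X[:-100], y[:-100])
print(model.score(X[-100:], y[-100:]))   # held-out R^2 for the last 100 steps
```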
An Operation Sequence Model for Explainable Neural Machine Translation
Title | An Operation Sequence Model for Explainable Neural Machine Translation |
Authors | Felix Stahlberg, Danielle Saunders, Bill Byrne |
Abstract | We propose to achieve explainable neural machine translation (NMT) by changing the output representation to explain itself. We present a novel approach to NMT which generates the target sentence by monotonically walking through the source sentence. Word reordering is modeled by operations which allow setting markers in the target sentence and moving a target-side write head between those markers. In contrast to many modern neural models, our system emits explicit word alignment information, which is often crucial to practical machine translation as it improves explainability. Our technique can outperform a plain text system in terms of BLEU score under the recent Transformer architecture on Japanese-English and Portuguese-English, and is within 0.5 BLEU difference on Spanish-English. |
Tasks | Machine Translation, Word Alignment |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09688v1 |
http://arxiv.org/pdf/1808.09688v1.pdf | |
PWC | https://paperswithcode.com/paper/an-operation-sequence-model-for-explainable |
Repo | |
Framework | |
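The idea of generating the target sentence through explicit reordering operations, as described in the entry above, can be illustrated with a toy interpreter for a simplified operation set (write a token, insert a gap, jump between gaps). This is not the paper's exact operation inventory, only a sketch of the concept.

```python
GAP, JMP_FWD, JMP_BWD = "<gap>", "<jmp_fwd>", "<jmp_bwd>"

def execute(ops):
    buf, head = [], 0                     # target buffer and write-head index
    for op in ops:
        if op == GAP:
            buf.insert(head, GAP)         # leave a placeholder at the head
            head += 1
        elif op == JMP_FWD:
            head = buf.index(GAP, head)   # move head to the next placeholder
        elif op == JMP_BWD:
            head = max(i for i in range(head) if buf[i] == GAP)
        else:
            if head < len(buf) and buf[head] == GAP:
                buf[head] = op            # fill the placeholder
            else:
                buf.insert(head, op)
            head += 1
    return [t for t in buf if t != GAP]

# Example: write "the", leave a gap, write "is here", then jump back and fill
# the gap with "answer" -- the output reads in order despite out-of-order generation.
print(execute(["the", GAP, "is", "here", JMP_BWD, "answer"]))
# ['the', 'answer', 'is', 'here']
```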