Paper Group ANR 779
Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification
Title | Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification |
Authors | Gautam Bhattacharya, Joao Monteiro, Jahangir Alam, Patrick Kenny |
Abstract | This article presents a novel approach for learning domain-invariant speaker embeddings using Generative Adversarial Networks. The main idea is to confuse a domain discriminator so that it can’t tell if embeddings are from the source or target domains. We train several GAN variants using our proposed framework and apply them to the speaker verification task. On the challenging NIST-SRE 2016 dataset, we are able to match the performance of a strong baseline x-vector system. In contrast to the baseline systems, which are dependent on dimensionality reduction (LDA) and an external classifier (PLDA), our proposed speaker embeddings can be scored using simple cosine distance. This is achieved by optimizing our models end-to-end, using an angular margin loss function. Furthermore, we are able to significantly boost verification performance by averaging our different GAN models at the score level, achieving a relative improvement of 7.2% over the baseline. |
Tasks | Dimensionality Reduction, Speaker Verification |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03063v1 |
http://arxiv.org/pdf/1811.03063v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-speaker-embedding |
Repo | |
Framework | |
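The scoring recipe in the abstract above — plain cosine distance between speaker embeddings plus averaging several GAN variants at the score level — can be sketched as follows. This is a minimal illustration using randomly generated stand-in embeddings, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): cosine scoring of speaker
# embeddings and score-level averaging over several embedding models.
import numpy as np

def cosine_score(enroll: np.ndarray, test: np.ndarray) -> float:
    """Cosine similarity between an enrollment and a test embedding."""
    return float(np.dot(enroll, test) /
                 (np.linalg.norm(enroll) * np.linalg.norm(test) + 1e-12))

def fused_score(enroll_embs, test_embs) -> float:
    """Average the cosine scores produced by several embedding extractors.

    enroll_embs / test_embs: lists with one embedding per model (e.g. one
    per GAN variant), each a 1-D numpy array.
    """
    scores = [cosine_score(e, t) for e, t in zip(enroll_embs, test_embs)]
    return float(np.mean(scores))

# Toy usage with random 512-dimensional embeddings from three hypothetical models.
rng = np.random.default_rng(0)
enroll = [rng.standard_normal(512) for _ in range(3)]
test = [rng.standard_normal(512) for _ in range(3)]
print(fused_score(enroll, test))
```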
Is it worth it? Budget-related evaluation metrics for model selection
Title | Is it worth it? Budget-related evaluation metrics for model selection |
Authors | Filip Klubička, Giancarlo D. Salton, John D. Kelleher |
Abstract | Creating a linguistic resource is often done by using a machine learning model that filters the content that goes through to a human annotator, before going into the final resource. However, budgets are often limited, and the amount of available data exceeds the amount of affordable annotation. In order to optimize the benefit from the invested human work, we argue that deciding on which model one should employ depends not only on generalized evaluation metrics such as F-score, but also on the gain metric. Because the model with the highest F-score may not necessarily have the best sequencing of predicted classes, this may lead to wasting funds on annotating false positives, yielding zero improvement of the linguistic resource. We exemplify our point with a case study, using real data from a task of building a verb-noun idiom dictionary. We show that, given the choice of three systems with varying F-scores, the system with the highest F-score does not yield the highest profits. In other words, in our case the cost-benefit trade-off is more favorable for a system with a lower F-score. |
Tasks | Model Selection |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06998v1 |
http://arxiv.org/pdf/1807.06998v1.pdf | |
PWC | https://paperswithcode.com/paper/is-it-worth-it-budget-related-evaluation |
Repo | |
Framework | |
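A minimal sketch of the budget argument above, under assumptions of our own: annotators verify candidates in the order the model ranks them, each verification has a fixed cost, and only true positives improve the resource. The paper's exact gain metric is not reproduced here.

```python
# Count how many true positives get annotated before the budget runs out.
def gain_within_budget(ranked_labels, cost_per_item, budget):
    """ranked_labels: gold labels (1 = true positive, 0 = false positive),
    ordered by the model's confidence, highest first."""
    affordable = int(budget // cost_per_item)
    return sum(ranked_labels[:affordable])

# Two hypothetical systems: B ranks true positives earlier than A, so it
# yields more gain for the same budget even if its overall F-score were lower.
system_a = [1, 0, 0, 0, 1, 1, 0, 1]
system_b = [1, 1, 1, 0, 0, 1, 0, 0]
print(gain_within_budget(system_a, cost_per_item=10, budget=40))  # 1
print(gain_within_budget(system_b, cost_per_item=10, budget=40))  # 3
```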
Categorization of Semantic Roles for Dictionary Definitions
Title | Categorization of Semantic Roles for Dictionary Definitions |
Authors | Vivian S. Silva, Siegfried Handschuh, André Freitas |
Abstract | Understanding the semantic relationships between terms is a fundamental task in natural language processing applications. While structured resources that can express those relationships in a formal way, such as ontologies, are still scarce, a large number of linguistic resources gathering dictionary definitions is becoming available; however, understanding the semantic structure of natural language definitions is fundamental to making them useful in semantic interpretation tasks. Based on an analysis of a subset of WordNet’s glosses, we propose a set of semantic roles that compose the semantic structure of a dictionary definition, and show how they are related to the definition’s syntactic configuration, identifying patterns that can be used in the development of information extraction frameworks and semantic models. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07711v1 |
http://arxiv.org/pdf/1806.07711v1.pdf | |
PWC | https://paperswithcode.com/paper/categorization-of-semantic-roles-for |
Repo | |
Framework | |
The Random Forest Classifier in WEKA: Discussion and New Developments for Imbalanced Data
Title | The Random Forest Classifier in WEKA: Discussion and New Developments for Imbalanced Data |
Authors | Mario Amrehn, Firas Mualla, Elli Angelopoulou, Stefan Steidl, Andreas Maier |
Abstract | Data analysis and machine learning have become an integral part of the modern scientific methodology, providing automated techniques to predict further information based on observations. One of these classification and regression techniques is the random forest approach. These decision-tree-based predictors are best known for their good computational performance and scalability. However, in the case of severely imbalanced training data, as often seen in medical studies’ data with large control groups, the training algorithm or the sampling process has to be altered in order to improve the prediction quality for minority classes. In this work, a balanced random forest approach for WEKA is proposed. Furthermore, the prediction quality of the unmodified random forest implementation and the new balanced random forest version for WEKA are evaluated against reference implementations in R. Two-class problems on balanced data sets and imbalanced medical studies’ data are investigated. A superior prediction quality using the proposed method for imbalanced data is shown compared to the other three techniques. |
Tasks | |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08102v2 |
http://arxiv.org/pdf/1812.08102v2.pdf | |
PWC | https://paperswithcode.com/paper/the-random-forest-classifier-in-weka |
Repo | |
Framework | |
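The balanced-bootstrap idea behind the entry above can be sketched in a few lines. The paper's implementation targets WEKA (Java); the `BalancedForest` class below is only a Python/scikit-learn illustration of the concept, not that code.

```python
# Balanced random forest sketch: each tree is fit on a bootstrap sample in
# which every class is drawn down to the minority-class size.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class BalancedForest:
    def __init__(self, n_trees=100, random_state=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(random_state)
        self.trees = []

    def fit(self, X, y):
        # Assumes non-negative integer class labels.
        X, y = np.asarray(X), np.asarray(y)
        classes, counts = np.unique(y, return_counts=True)
        n_min = counts.min()
        self.trees = []
        for _ in range(self.n_trees):
            # Balanced bootstrap: draw n_min samples (with replacement) per class.
            idx = np.concatenate([
                self.rng.choice(np.flatnonzero(y == c), size=n_min, replace=True)
                for c in classes
            ])
            tree = DecisionTreeClassifier(
                max_features="sqrt",
                random_state=int(self.rng.integers(1 << 31)))
            self.trees.append(tree.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Majority vote over the ensemble.
        votes = np.stack([t.predict(X) for t in self.trees])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
```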
Simultaneous Optical Flow and Segmentation (SOFAS) using Dynamic Vision Sensor
Title | Simultaneous Optical Flow and Segmentation (SOFAS) using Dynamic Vision Sensor |
Authors | Timo Stoffregen, Lindsay Kleeman |
Abstract | We present an algorithm (SOFAS) to estimate the optical flow of events generated by a dynamic vision sensor (DVS). Where traditional cameras produce frames at a fixed rate, DVSs produce asynchronous events in response to intensity changes with a high temporal resolution. Our algorithm uses the fact that events are generated by edges in the scene to not only estimate the optical flow but also to simultaneously segment the image into objects which are travelling at the same velocity. This way, it is able to avoid the aperture problem which affects other implementations such as Lucas-Kanade. Finally, we show that SOFAS produces more accurate results than traditional optical flow algorithms. |
Tasks | Optical Flow Estimation |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12326v1 |
http://arxiv.org/pdf/1805.12326v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-optical-flow-and-segmentation |
Repo | |
Framework | |
Knowledge Amalgam: Generating Jokes and Quotes Together
Title | Knowledge Amalgam: Generating Jokes and Quotes Together |
Authors | Bhargav Chippada, Shubajit Saha |
Abstract | Generating humor and quotes are very challenging problems in the field of computational linguistics and are often tackled separately. In this paper, we present a controlled Long Short-Term Memory (LSTM) architecture which is trained with categorical data like jokes and quotes together, by passing the category as an input along with the sequence of words. The idea is that a single neural net will learn the structure of both jokes and quotes to generate them on demand according to the input category. Importantly, we believe the neural net has more knowledge as it’s trained on different datasets, which will enable it to generate more creative jokes or quotes from the mixture of information. May the network generate a funny inspirational joke! |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04387v2 |
http://arxiv.org/pdf/1806.04387v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-amalgam-generating-jokes-and-quotes |
Repo | |
Framework | |
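A hedged sketch of the category-conditioned ("controlled") LSTM described in the entry above: the category id is embedded and concatenated with every token embedding, so a single network can generate either jokes or quotes on demand. Layer sizes and names are illustrative, not the authors'.

```python
import torch
import torch.nn as nn

class ControlledLSTM(nn.Module):
    def __init__(self, vocab_size, n_categories, emb_dim=128, cat_dim=16, hidden=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.cat_emb = nn.Embedding(n_categories, cat_dim)
        self.lstm = nn.LSTM(emb_dim + cat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, category):
        # tokens: (batch, seq_len) token ids; category: (batch,) category ids.
        tok = self.tok_emb(tokens)                        # (B, T, emb_dim)
        cat = self.cat_emb(category).unsqueeze(1)         # (B, 1, cat_dim)
        cat = cat.expand(-1, tokens.size(1), -1)          # repeat for every step
        h, _ = self.lstm(torch.cat([tok, cat], dim=-1))   # (B, T, hidden)
        return self.out(h)                                # next-token logits

# Toy forward pass: two sequences, hypothetical categories 0 = joke, 1 = quote.
model = ControlledLSTM(vocab_size=1000, n_categories=2)
logits = model(torch.randint(0, 1000, (2, 12)), torch.tensor([0, 1]))
print(logits.shape)  # torch.Size([2, 12, 1000])
```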
Multi-agent Deep Reinforcement Learning for Zero Energy Communities
Title | Multi-agent Deep Reinforcement Learning for Zero Energy Communities |
Authors | Amit Prasad, Ivana Dusparic |
Abstract | Advances in renewable energy generation and the introduction of government targets to improve energy efficiency gave rise to the concept of a Zero Energy Building (ZEB). A ZEB is a building whose net energy usage over a year is zero, i.e., its energy use is not larger than its overall renewables generation. A collection of ZEBs forms a Zero Energy Community (ZEC). This paper addresses the problem of energy sharing in such a community. This is different from previously addressed energy sharing between buildings, as our focus is on the improvement of community energy status, while traditionally research has focused on reducing losses due to transmission and storage, or on achieving economic gains. We model this problem in a multi-agent environment and propose a Deep Reinforcement Learning (DRL) based solution. Each building is represented by an intelligent agent that learns over time the appropriate behaviour to share energy. We have evaluated the proposed solution in a multi-agent simulation built using osBrain. Results indicate that, over time, agents learn to collaborate and learn a policy comparable to the optimal policy, which in turn improves the ZEC’s energy status. Buildings with no renewables preferred to request energy from their neighbours rather than from the supply grid. |
Tasks | |
Published | 2018-10-08 |
URL | https://arxiv.org/abs/1810.03679v2 |
https://arxiv.org/pdf/1810.03679v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-deep-reinforcement-learning-for |
Repo | |
Framework | |
Contrastive Learning of Emoji-based Representations for Resource-Poor Languages
Title | Contrastive Learning of Emoji-based Representations for Resource-Poor Languages |
Authors | Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava |
Abstract | The introduction of emojis (or emoticons) on social media platforms has given users an increased potential for expression. We propose a novel method called Classification of Emojis using Siamese Network Architecture (CESNA) to learn emoji-based representations of resource-poor languages by jointly training them with resource-rich languages using a siamese network. The CESNA model consists of twin Bi-directional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM RNN) with shared parameters, joined by a contrastive loss function based on a similarity metric. The model learns the representations of resource-poor and resource-rich languages in a common emoji space by using a similarity metric based on the emojis present in sentences from both languages. The model, hence, projects sentences with similar emojis closer to each other and sentences with different emojis farther from one another. Experiments on large-scale Twitter datasets of resource-rich languages (English and Spanish) and resource-poor languages (Hindi and Telugu) reveal that CESNA outperforms the state-of-the-art emoji prediction approaches based on distributional semantics, semantic rules, lexicon lists and deep neural network representations without shared parameters. |
Tasks | |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.01855v1 |
http://arxiv.org/pdf/1804.01855v1.pdf | |
PWC | https://paperswithcode.com/paper/contrastive-learning-of-emoji-based |
Repo | |
Framework | |
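The contrastive objective referred to in the entry above can be sketched as below. The siamese Bi-LSTM encoders and the emoji-based similarity labels are not reproduced; this only shows the standard contrastive loss over twin-encoder outputs, with y = 1 for pairs treated as similar.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, y, margin=1.0):
    """emb_a, emb_b: (batch, dim) outputs of the shared encoder.
    y: (batch,) with 1 for similar (e.g. shared-emoji) pairs and 0 otherwise."""
    d = F.pairwise_distance(emb_a, emb_b)         # Euclidean distance per pair
    pos = y * d.pow(2)                            # pull similar pairs together
    neg = (1 - y) * F.relu(margin - d).pow(2)     # push dissimilar pairs apart
    return (pos + neg).mean()

# Toy usage with random embeddings standing in for Bi-LSTM sentence encodings.
a, b = torch.randn(4, 64), torch.randn(4, 64)
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
print(contrastive_loss(a, b, labels))
```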
Improving Transferability of Deep Neural Networks
Title | Improving Transferability of Deep Neural Networks |
Authors | Parijat Dube, Bishwaranjan Bhattacharjee, Elisabeth Petit-Bois, Matthew Hill |
Abstract | Learning from small amounts of labeled data is a challenge in the area of deep learning. This is currently addressed by Transfer Learning, where one learns the small data set as a transfer task from a larger source dataset. Transfer Learning can deliver higher accuracy if the hyperparameters and source dataset are chosen well. One of the important parameters is the learning rate for the layers of the neural network. We show through experiments on the ImageNet22k and Oxford Flowers datasets that improvements in accuracy in the range of 127% can be obtained by a proper choice of learning rates. We also show that the images/label parameter for a dataset can potentially be used to determine optimal learning rates for the layers to get the best overall accuracy. We additionally validate this method on a sample of real-world image classification tasks from a public visual recognition API. |
Tasks | Image Classification, Transfer Learning |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11459v1 |
http://arxiv.org/pdf/1807.11459v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-transferability-of-deep-neural |
Repo | |
Framework | |
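The general mechanism discussed above — different learning rates for different layers when fine-tuning — can be sketched with PyTorch parameter groups. The specific rates and the images/label heuristic from the paper are not shown; 1e-4 and 1e-2 below are placeholders.

```python
import torch
import torchvision

# weights=None keeps the example offline; in practice the backbone would be
# loaded with pretrained weights before fine-tuning.
model = torchvision.models.resnet18(weights=None)
num_classes = 102                                      # e.g. Oxford Flowers
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.SGD(
    [
        {"params": [p for n, p in model.named_parameters() if not n.startswith("fc")],
         "lr": 1e-4},                                   # pretrained backbone: small steps
        {"params": model.fc.parameters(), "lr": 1e-2},  # new head: larger steps
    ],
    momentum=0.9,
)
```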
Not just a matter of semantics: the relationship between visual similarity and semantic similarity
Title | Not just a matter of semantics: the relationship between visual similarity and semantic similarity |
Authors | Clemens-Alexander Brust, Joachim Denzler |
Abstract | Knowledge transfer, zero-shot learning and semantic image retrieval are methods that aim at improving accuracy by utilizing semantic information, e.g. from WordNet. It is assumed that this information can augment or replace missing visual data in the form of labeled training images because semantic similarity correlates with visual similarity. This assumption may seem trivial, but is crucial for the application of such semantic methods. Any violation can cause mispredictions. Thus, it is important to examine the visual-semantic relationship for a certain target problem. In this paper, we use five different semantic and visual similarity measures each to thoroughly analyze the relationship without relying too much on any single definition. We postulate and verify three highly consequential hypotheses on the relationship. Our results show that it indeed exists and that WordNet semantic similarity carries more information about visual similarity than just the knowledge of “different classes look different”. They suggest that classification is not the ideal application for semantic methods and that wrong semantic information is much worse than none. |
Tasks | Image Retrieval, Semantic Similarity, Semantic Textual Similarity, Transfer Learning, Zero-Shot Learning |
Published | 2018-11-17 |
URL | https://arxiv.org/abs/1811.07120v2 |
https://arxiv.org/pdf/1811.07120v2.pdf | |
PWC | https://paperswithcode.com/paper/not-just-a-matter-of-semantics-the |
Repo | |
Framework | |
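The kind of comparison performed in the paper above can be sketched with one semantic measure (WordNet path similarity via NLTK) and one visual measure (cosine similarity of image features, here random placeholders), correlated with Spearman's rho. The paper itself uses five measures of each kind.

```python
import numpy as np
from nltk.corpus import wordnet as wn      # requires: nltk.download("wordnet")
from scipy.stats import spearmanr

classes = ["dog.n.01", "cat.n.01", "car.n.01", "bicycle.n.01"]
synsets = [wn.synset(c) for c in classes]

# Placeholder "visual" class prototypes; in practice these would be averaged
# CNN features per class.
rng = np.random.default_rng(0)
feats = {c: rng.standard_normal(256) for c in classes}

sem, vis = [], []
for i in range(len(classes)):
    for j in range(i + 1, len(classes)):
        sem.append(synsets[i].path_similarity(synsets[j]))
        a, b = feats[classes[i]], feats[classes[j]]
        vis.append(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rho, p = spearmanr(sem, vis)               # rank correlation between the two views
print(rho, p)
```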
Video Inpainting by Jointly Learning Temporal Structure and Spatial Details
Title | Video Inpainting by Jointly Learning Temporal Structure and Spatial Details |
Authors | Chuan Wang, Haibin Huang, Xiaoguang Han, Jue Wang |
Abstract | We present a new data-driven video inpainting method for recovering missing regions of video frames. A novel deep learning architecture is proposed which contains two sub-networks: a temporal structure inference network and a spatial detail recovering network. The temporal structure inference network is built upon a 3D fully convolutional architecture: it only learns to complete a low-resolution video volume, given the expensive computational cost of 3D convolution. The low-resolution result provides temporal guidance to the spatial detail recovering network, which performs image-based inpainting with a 2D fully convolutional network to produce recovered video frames in their original resolution. Such a two-step network design ensures both the spatial quality of each frame and the temporal coherence across frames. Our method jointly trains both sub-networks in an end-to-end manner. We provide qualitative and quantitative evaluation on three datasets, demonstrating that our method outperforms previous learning-based video inpainting methods. |
Tasks | Video Inpainting |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08482v2 |
http://arxiv.org/pdf/1806.08482v2.pdf | |
PWC | https://paperswithcode.com/paper/video-inpainting-by-jointly-learning-temporal |
Repo | |
Framework | |
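A skeleton sketch of the two-stage design described above: a small 3D convolutional network completes a low-resolution volume, and its upsampled output guides a 2D network that inpaints each frame at full resolution. Channel counts and depths are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalStructureNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 3, kernel_size=3, padding=1))

    def forward(self, low_res_video):          # (B, 3, T, h, w)
        return self.net(low_res_video)

class SpatialDetailNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1))

    def forward(self, frame, guidance):        # both (B, 3, H, W)
        return self.net(torch.cat([frame, guidance], dim=1))

# Toy forward pass: complete a low-res clip, upsample frame 0's guidance,
# then recover that frame at full resolution.
video_lr = torch.randn(1, 3, 8, 32, 32)        # masked low-resolution clip
frame_hr = torch.randn(1, 3, 128, 128)         # masked full-resolution frame
guidance = TemporalStructureNet()(video_lr)[:, :, 0]            # (1, 3, 32, 32)
guidance = F.interpolate(guidance, size=(128, 128), mode="bilinear",
                         align_corners=False)
print(SpatialDetailNet()(frame_hr, guidance).shape)  # torch.Size([1, 3, 128, 128])
```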
Cross-Modulation Networks for Few-Shot Learning
Title | Cross-Modulation Networks for Few-Shot Learning |
Authors | Hugo Prol, Vincent Dumoulin, Luis Herranz |
Abstract | A family of recent successful approaches to few-shot learning relies on learning an embedding space in which predictions are made by computing similarities between examples. This corresponds to combining information between support and query examples at a very late stage of the prediction pipeline. Inspired by this observation, we hypothesize that there may be benefits to combining the information at various levels of abstraction along the pipeline. We present an architecture called Cross-Modulation Networks which allows support and query examples to interact throughout the feature extraction process via a feature-wise modulation mechanism. We adapt the Matching Networks architecture to take advantage of these interactions and show encouraging initial results on miniImageNet in the 5-way, 1-shot setting, where we close the gap with state-of-the-art. |
Tasks | Few-Shot Learning |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00273v1 |
http://arxiv.org/pdf/1812.00273v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-modulation-networks-for-few-shot |
Repo | |
Framework | |
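A hedged sketch of the feature-wise cross-modulation idea from the entry above, in the spirit of FiLM-style conditioning: one branch's pooled features produce a per-channel scale and shift applied to the other branch's feature maps. The actual Cross-Modulation Networks architecture may differ in detail.

```python
import torch
import torch.nn as nn

class CrossModulation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Predict gamma and beta for the other branch from pooled features.
        self.to_film = nn.Linear(channels, 2 * channels)

    def forward(self, target_feats, source_feats):
        # target_feats, source_feats: (B, C, H, W) feature maps of the two branches.
        pooled = source_feats.mean(dim=(2, 3))              # (B, C)
        gamma, beta = self.to_film(pooled).chunk(2, dim=1)  # (B, C) each
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return (1 + gamma) * target_feats + beta            # modulated target features

# Toy usage: modulate query features with information from support features.
mod = CrossModulation(channels=64)
support = torch.randn(5, 64, 10, 10)
query = torch.randn(5, 64, 10, 10)
print(mod(query, support).shape)  # torch.Size([5, 64, 10, 10])
```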
Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systems
Title | Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systems |
Authors | Indrasis Chakraborty, Rudrasis Chakraborty, Draguna Vrabie |
Abstract | The fault detection problem for closed-loop uncertain dynamical systems is investigated in this paper using different deep-learning-based methods. Traditional classifier-based methods do not perform well because of the inherent difficulty of detecting system-level faults in closed-loop dynamical systems. Specifically, the acting controller in any closed-loop dynamical system works to reduce the effect of system-level faults. A novel Generative Adversarial Network based deep Autoencoder is designed to classify datasets under normal and faulty operating conditions. The proposed network performs significantly better than available classifier-based methods and, moreover, does not require labeled fault-incorporated datasets for training. Finally, the network’s performance is tested on a high-complexity building energy system dataset. |
Tasks | Fault Detection |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1804.05320v2 |
http://arxiv.org/pdf/1804.05320v2.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-based |
Repo | |
Framework | |
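The detection step implied by the entry above can be sketched as follows: an autoencoder trained on normal-operation data flags a sample as faulty when its reconstruction error exceeds a threshold. The adversarial (GAN-based) training of the autoencoder described in the abstract is not reproduced here.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                 nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                 nn.Linear(32, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

@torch.no_grad()
def is_faulty(model, x, threshold):
    """Flag samples whose per-sample reconstruction MSE exceeds the threshold."""
    err = ((model(x) - x) ** 2).mean(dim=1)
    return err > threshold

# Toy usage on random data standing in for sensor measurements.
model = Autoencoder(n_features=20)
batch = torch.randn(16, 20)
print(is_faulty(model, batch, threshold=1.0))
```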
Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction
Title | Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction |
Authors | Gagan Choudhury, David Lynch, Gaurav Thakur, Simon Tse |
Abstract | We describe two applications of machine learning in the context of IP/Optical networks. The first one allows agile management of resources at a core IP/Optical network by using machine learning for short-term and long-term prediction of traffic flows and joint global optimization of IP and optical layers using colorless/directionless (CD) flexible ROADMs. Multilayer coordination allows for significant cost savings, flexible new services to meet dynamic capacity needs, and improved robustness by being able to proactively adapt to new traffic patterns and network conditions. The second application is important as we migrate our metro networks to Open ROADM networks, to allow physical routing without the need for detailed knowledge of optical parameters. We discuss a proof-of-concept study, where detailed performance data for wavelengths on a current flexible ROADM network is used for machine learning to predict the optical performance of each wavelength. Both applications can be efficiently implemented by using an SDN (Software Defined Network) controller. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07433v2 |
http://arxiv.org/pdf/1804.07433v2.pdf | |
PWC | https://paperswithcode.com/paper/two-use-cases-of-machine-learning-for-sdn |
Repo | |
Framework | |
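The short-term traffic-prediction use case above can be illustrated only generically: the abstract does not specify the model, so the lagged linear regression below is merely a stand-in for forecasting one traffic-matrix entry from its recent history.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def lagged_dataset(series, n_lags=12):
    # Build (history window -> next value) pairs from a single series.
    X = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic traffic series standing in for 5-minute byte counts of one src-dst pair.
rng = np.random.default_rng(0)
t = np.arange(2000)
series = 100 + 20 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 2, size=t.size)

X, y = lagged_dataset(series)
model = LinearRegression().fit(X[:-100], y[:-100])
print(model.score(X[-100:], y[-100:]))   # held-out R^2 for the last 100 steps
```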
An Operation Sequence Model for Explainable Neural Machine Translation
Title | An Operation Sequence Model for Explainable Neural Machine Translation |
Authors | Felix Stahlberg, Danielle Saunders, Bill Byrne |
Abstract | We propose to achieve explainable neural machine translation (NMT) by changing the output representation to explain itself. We present a novel approach to NMT which generates the target sentence by monotonically walking through the source sentence. Word reordering is modeled by operations which allow setting markers in the target sentence and moving a target-side write head between those markers. In contrast to many modern neural models, our system emits explicit word alignment information, which is often crucial to practical machine translation as it improves explainability. Our technique can outperform a plain text system in terms of BLEU score under the recent Transformer architecture on Japanese-English and Portuguese-English, and is within 0.5 BLEU difference on Spanish-English. |
Tasks | Machine Translation, Word Alignment |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09688v1 |
http://arxiv.org/pdf/1808.09688v1.pdf | |
PWC | https://paperswithcode.com/paper/an-operation-sequence-model-for-explainable |
Repo | |
Framework | |
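The idea of generating the target sentence through explicit reordering operations, as described in the entry above, can be illustrated with a toy interpreter for a simplified operation set (write a token, insert a gap, jump between gaps). This is not the paper's exact operation inventory, only a sketch of the concept.

```python
GAP, JMP_FWD, JMP_BWD = "<gap>", "<jmp_fwd>", "<jmp_bwd>"

def execute(ops):
    buf, head = [], 0                     # target buffer and write-head index
    for op in ops:
        if op == GAP:
            buf.insert(head, GAP)         # leave a placeholder at the head
            head += 1
        elif op == JMP_FWD:
            head = buf.index(GAP, head)   # move head to the next placeholder
        elif op == JMP_BWD:
            head = max(i for i in range(head) if buf[i] == GAP)
        else:
            if head < len(buf) and buf[head] == GAP:
                buf[head] = op            # fill the placeholder
            else:
                buf.insert(head, op)
            head += 1
    return [t for t in buf if t != GAP]

# Example: write "the", leave a gap, write "is here", then jump back and fill
# the gap with "answer" -- the output reads in order despite out-of-order generation.
print(execute(["the", GAP, "is", "here", JMP_BWD, "answer"]))
# ['the', 'answer', 'is', 'here']
```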