Paper Group AWR 128
Neural Speed Reading with Structural-Jump-LSTM
Title | Neural Speed Reading with Structural-Jump-LSTM |
Authors | Christian Hansen, Casper Hansen, Stephen Alstrup, Jakob Grue Simonsen, Christina Lioma |
Abstract | Recurrent neural networks (RNNs) can model natural language by sequentially ‘reading’ input tokens and outputting a distributed representation of each token. Due to the sequential nature of RNNs, inference time is linearly dependent on the input length, and all inputs are read regardless of their importance. Efforts to speed up this inference, known as ‘neural speed reading’, either ignore or skim over part of the input. We present Structural-Jump-LSTM: the first neural speed reading model to both skip and jump text during inference. The model consists of a standard LSTM and two agents: one capable of skipping single words when reading, and one capable of exploiting punctuation structure (sub-sentence separators (,:), sentence end symbols (.!?), or end of text markers) to jump ahead after reading a word. A comprehensive experimental evaluation of our model against all five state-of-the-art neural reading models shows that Structural-Jump-LSTM achieves the best overall floating point operations (FLOP) reduction (hence is faster), while keeping the same accuracy or even improving it compared to a vanilla LSTM that reads the whole text. |
Tasks | |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1904.00761v2 |
PDF | http://arxiv.org/pdf/1904.00761v2.pdf |
PWC | https://paperswithcode.com/paper/neural-speed-reading-with-structural-jump-1 |
Repo | https://github.com/Varyn/Neural-Speed-Reading-with-Structural-Jump-LSTM |
Framework | tf |
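The abstract outlines the skip/jump reading loop but gives no pseudocode. Below is a minimal Python sketch of that control flow under stated assumptions: a toy recurrent update stands in for the LSTM, and the skip agent, jump agent, and word embeddings are random placeholders rather than the authors' trained Structural-Jump-LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                    # hidden/embedding size (arbitrary for the sketch)
Wx, Wh = rng.normal(size=(d, d)), rng.normal(size=(d, d))
skip_w, jump_w = rng.normal(size=d), rng.normal(size=(4, d))

def recurrent_step(h, x):
    # Stand-in update; a real implementation would use a full LSTM cell.
    return np.tanh(Wx @ x + Wh @ h)

def read(tokens, embed, sub_sep=(",", ";", ":"), sent_end=(".", "!", "?")):
    """Skip/jump reading: a skip agent may drop single words, and after each read
    a jump agent chooses to continue, jump to the next sub-sentence separator,
    jump to the next sentence end, or stop reading."""
    h, i, n_read = np.zeros(d), 0, 0
    while i < len(tokens):
        x = embed(tokens[i])
        if skip_w @ x > 0:                # skip agent: drop this word, state unchanged
            i += 1
            continue
        h = recurrent_step(h, x)
        n_read += 1
        action = int(np.argmax(jump_w @ h))   # 0: next word, 1: sub-sep, 2: sentence end, 3: stop
        if action == 0:
            i += 1
        elif action == 1:
            i = next((j for j in range(i + 1, len(tokens)) if tokens[j] in sub_sep), len(tokens)) + 1
        elif action == 2:
            i = next((j for j in range(i + 1, len(tokens)) if tokens[j] in sent_end), len(tokens)) + 1
        else:
            break
    return h, n_read

tokens = "the movie was long , but the ending was great .".split()
h, n_read = read(tokens, embed=lambda w: rng.normal(size=d))   # random stand-in embeddings
print(f"read {n_read} of {len(tokens)} tokens")
```

In the paper both agents are trained with reinforcement learning to trade accuracy against FLOPs; the sketch only shows where their decisions enter the reading loop.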
Warfarin dose estimation on multiple datasets with automated hyperparameter optimisation and a novel software framework
Title | Warfarin dose estimation on multiple datasets with automated hyperparameter optimisation and a novel software framework |
Authors | Gianluca Truda, Patrick Marais |
Abstract | Warfarin is an effective preventative treatment for arterial and venous thromboembolism, but requires individualised dosing due to its narrow therapeutic range and high individual variation. Many statistical and machine learning techniques have been demonstrated in this domain. This study evaluated the accuracy of the most promising algorithms on the International Warfarin Pharmacogenetics Consortium dataset and a novel clinical dataset of South African patients. Support vectors and linear regression were consistently amongst the top performers in both datasets and performed comparably to recent ensemble approaches. We also evaluated the use of genetic programming to design and optimise models without human guidance. Remarkably, these were found to match the performance of the best models hand-crafted by human experts. Finally, we present a novel software framework (Warfit-learn) for standardising future research by leveraging the most successful techniques in preprocessing, imputation, and evaluation, with the goal of making results more reproducible in this domain. |
Tasks | Imputation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05363v2 |
PDF | https://arxiv.org/pdf/1907.05363v2.pdf |
PWC | https://paperswithcode.com/paper/warfarin-dose-estimation-on-multiple-datasets |
Repo | https://github.com/gianlucatruda/warfit-learn |
Framework | none |
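The abstract does not expose the Warfit-learn API, so rather than guess at it, here is a hedged scikit-learn sketch of the kind of pipeline the study evaluates: mean imputation, scaling, then linear regression or support vector regression scored by cross-validated mean absolute error. The synthetic matrix stands in for IWPC-style covariates (age, weight, genotype indicators, etc.) and is not real patient data.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a warfarin dosing table; real studies use the IWPC dataset.
X_full = rng.normal(size=(500, 6))
y = 5 + X_full[:, :3] @ np.array([0.8, 1.2, -0.5]) + rng.normal(scale=1.0, size=500)
X = X_full.copy()
X[rng.random(X.shape) < 0.1] = np.nan           # simulate missing clinical values

models = {"linear": LinearRegression(), "svr": SVR(kernel="rbf", C=10.0)}
for name, model in models.items():
    pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(), model)
    mae = -cross_val_score(pipe, X, y, scoring="neg_mean_absolute_error", cv=5)
    print(f"{name}: MAE = {mae.mean():.2f} ± {mae.std():.2f}")
```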
Real-time Deep Learning at the Edge for Scalable Reliability Modeling of Si-MOSFET Power Electronics Converters
Title | Real-time Deep Learning at the Edge for Scalable Reliability Modeling of Si-MOSFET Power Electronics Converters |
Authors | Mohammadreza Baharani, Mehrdad Biglarbegian, Babak Parkhideh, Hamed Tabkhi |
Abstract | With the significant growth of advanced high-frequency power converters, on-line monitoring and active reliability assessment of power electronic devices are extremely crucial. This article presents a transformative approach, named Deep Learning Reliability Awareness of Converters at the Edge (Deep RACE), for real-time reliability modeling and prediction of high-frequency MOSFET power electronic converters. Deep RACE offers a holistic solution which comprises algorithmic advances and full system integration (from the cloud down to the edge node) to create near real-time reliability awareness. On the algorithm side, this paper proposes a deep learning solution based on a stacked LSTM for collective reliability training and inference across MOSFET converters based on device resistance changes. Deep RACE also proposes an integrative edge-to-cloud solution to offer scalable, decentralized, device-specific reliability monitoring, awareness, and modeling. The MOSFET converters are IoT devices empowered with real-time deep learning processing capabilities at the edge. The proposed Deep RACE solution has been prototyped and implemented through learning from a MOSFET data set provided by NASA. Our experimental results show an average misprediction rate of 8.9% over five different devices, which is substantially more accurate than well-known classical approaches (Kalman filter and particle filter). Deep RACE requires only 26 ms of processing time and 1.87 W of computing power on the edge IoT device. |
Tasks | |
Published | 2019-08-03 |
URL | https://arxiv.org/abs/1908.01244v1 |
PDF | https://arxiv.org/pdf/1908.01244v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-deep-learning-at-the-edge-for |
Repo | https://github.com/TeCSAR-UNCC/Deep_RACE |
Framework | tf |
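A hedged Keras sketch of the stacked-LSTM regressor the abstract describes, trained to predict the next value of a device's drifting on-resistance from a sliding window. The synthetic degradation trace, window length, and layer widths are assumptions for illustration; this is not the Deep RACE code or the NASA data.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for a MOSFET ageing trace: a noisy, slowly drifting on-resistance.
rng = np.random.default_rng(0)
r_ds = 1.0 + 0.002 * np.arange(2000) + 0.01 * rng.normal(size=2000)

window = 50
X = np.stack([r_ds[i:i + window] for i in range(len(r_ds) - window)])[..., None]
y = r_ds[window:]                                  # predict the next resistance value

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32),                      # stacked LSTM, as in the abstract
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)
print("one-step prediction:", model.predict(X[-1:], verbose=0).ravel())
```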
Fairness and Missing Values
Title | Fairness and Missing Values |
Authors | Fernando Martínez-Plumed, Cèsar Ferri, David Nieves, José Hernández-Orallo |
Abstract | The causes underlying unfair decision making are complex, being internalised in different ways by decision makers, other actors dealing with data and models, and ultimately by the individuals being affected by these decisions. One frequent manifestation of all these latent causes arises in the form of missing values: protected groups are more reluctant to give information that could be used against them, delicate information for some groups can be erased by human operators, or data acquisition may simply be less complete and systematic for minority groups. As a result, missing values and bias in data are two phenomena that are tightly coupled. However, most recent techniques, libraries and experimental results dealing with fairness in machine learning have simply ignored missing data. In this paper, we claim that fairness research should not miss the opportunity to deal properly with missing data. To support this claim, (1) we analyse the sources of missing data and bias, and we map the common causes, (2) we find that rows containing missing values are usually fairer than the rest and should not be treated as the uncomfortable, ugly data that techniques and libraries discard at the first opportunity, and (3) we study the trade-off between performance and fairness when the rows with missing values are used (either because the technique deals with them directly or by imputation methods). We end the paper with a series of recommended procedures about what to do with missing data when aiming for fair decision making. |
Tasks | Decision Making, Imputation |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12728v1 |
PDF | https://arxiv.org/pdf/1905.12728v1.pdf |
PWC | https://paperswithcode.com/paper/fairness-and-missing-values |
Repo | https://github.com/nandomp/missingFairness |
Framework | none |
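Point (2) of the abstract compares fairness between rows with and without missing values. The pandas sketch below shows one way to make that comparison with a simple demographic-parity difference; the columns, missingness pattern, and data are invented, so the numbers carry no meaning beyond illustrating the measurement.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "protected": rng.integers(0, 2, n),           # hypothetical protected attribute
    "income": rng.normal(50, 15, n),
    "outcome": rng.integers(0, 2, n),             # favourable decision = 1
})
df.loc[rng.random(n) < 0.15, "income"] = np.nan   # inject missing values

def demographic_parity_diff(frame):
    rates = frame.groupby("protected")["outcome"].mean()
    return abs(rates.loc[1] - rates.loc[0])

has_missing = df["income"].isna()
print("DP difference, rows with missing values:   ",
      round(demographic_parity_diff(df[has_missing]), 3))
print("DP difference, rows without missing values:",
      round(demographic_parity_diff(df[~has_missing]), 3))
```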
Stein Point Markov Chain Monte Carlo
Title | Stein Point Markov Chain Monte Carlo |
Authors | Wilson Ye Chen, Alessandro Barp, François-Xavier Briol, Jackson Gorham, Mark Girolami, Lester Mackey, Chris. J. Oates |
Abstract | An important task in machine learning and statistics is the approximation of a probability measure by an empirical measure supported on a discrete point set. Stein Points are a class of algorithms for this task, which proceed by sequentially minimising a Stein discrepancy between the empirical measure and the target and, hence, require the solution of a non-convex optimisation problem to obtain each new point. This paper removes the need to solve this optimisation problem by, instead, selecting each new point based on a Markov chain sample path. This significantly reduces the computational cost of Stein Points and leads to a suite of algorithms that are straightforward to implement. The new algorithms are illustrated on a set of challenging Bayesian inference problems, and rigorous theoretical guarantees of consistency are established. |
Tasks | Bayesian Inference |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03673v1 |
PDF | https://arxiv.org/pdf/1905.03673v1.pdf |
PWC | https://paperswithcode.com/paper/190503673 |
Repo | https://github.com/wilson-ye-chen/sp-mcmc |
Framework | none |
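A minimal 1-D sketch of the idea, under stated assumptions: the target is a standard normal (score function −x), the base kernel is an RBF with unit bandwidth, and each new point is chosen greedily from a short random-walk Metropolis sample path by minimising the usual kernel Stein discrepancy criterion. This illustrates the mechanism only, not the authors' implementation or their recommended kernels.

```python
import numpy as np

rng = np.random.default_rng(0)
score = lambda x: -x                    # d/dx log p(x) for a standard normal target
h = 1.0                                 # RBF bandwidth (arbitrary choice for the sketch)

def stein_kernel(x, y):
    """Langevin-Stein kernel built from the RBF base kernel k(x,y) = exp(-(x-y)^2 / (2 h^2))."""
    d = x - y
    k = np.exp(-d * d / (2 * h * h))
    dkx = -d / h**2 * k                 # dk/dx
    dky = d / h**2 * k                  # dk/dy
    dkxy = (1.0 / h**2 - d * d / h**4) * k
    return dkxy + score(x) * dky + score(y) * dkx + score(x) * score(y) * k

def sp_mcmc(n_points, chain_len=200):
    points, x = [], 0.0
    for _ in range(n_points):
        # Short random-walk Metropolis run targeting N(0,1); its path supplies the candidates.
        path = []
        for _ in range(chain_len):
            prop = x + rng.normal(scale=0.8)
            if np.log(rng.random()) < 0.5 * (x * x - prop * prop):
                x = prop
            path.append(x)
        # Greedy Stein-point criterion: minimise k0(c,c)/2 + sum_j k0(c, x_j).
        obj = [0.5 * stein_kernel(c, c) + sum(stein_kernel(c, q) for q in points) for c in path]
        points.append(path[int(np.argmin(obj))])
    return np.array(points)

print("selected points:", np.round(np.sort(sp_mcmc(20)), 2))
```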
Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation
Title | Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation |
Authors | Ke Wang, Hang Hua, Xiaojun Wan |
Abstract | Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g. sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content. The dominant approaches try to model the content-independent attribute separately, e.g., learning different attributes’ representations or using multiple attribute-specific decoders. However, this may lead to inflexibility when controlling the degree of transfer or transferring over multiple aspects at the same time. To address the above problems, we propose a more flexible unsupervised text attribute transfer framework which replaces the process of modeling the attribute with minimal editing of latent representations based on an attribute classifier. Specifically, we first propose a Transformer-based autoencoder to learn an entangled latent representation for a discrete text, then we transform the attribute transfer task into an optimization problem and propose the Fast-Gradient-Iterative-Modification algorithm to edit the latent representation until it conforms to the target attribute. Extensive experimental results demonstrate that our model achieves very competitive performance on three public data sets. Furthermore, we also show that our model can not only control the degree of transfer freely but also transfer over multiple aspects at the same time. |
Tasks | Text Attribute Transfer |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12926v2 |
PDF | https://arxiv.org/pdf/1905.12926v2.pdf |
PWC | https://paperswithcode.com/paper/controllable-unsupervised-text-attribute |
Repo | https://github.com/mrzjy/controllable-text-attribute-transfer |
Framework | tf |
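A hedged PyTorch sketch of the latent-editing step (the Fast-Gradient-Iterative-Modification idea from the abstract): push an encoded latent vector along the attribute classifier's gradient until the target attribute is predicted with high confidence. The classifier, latent size, thresholds, and step weights here are toy stand-ins; in the full system the latent comes from the Transformer-based autoencoder and the edited vector is decoded back to text.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim = 64

# Toy attribute classifier over the latent space (a trained one would be used in practice).
classifier = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))

def fgim_edit(z0, weights=(1.0, 2.0, 4.0), steps=30, threshold=0.9):
    """Edit z0 until the classifier predicts the target attribute (class 1) confidently.
    Several step weights are tried in turn, mirroring the multi-weight search."""
    target = torch.ones(1, 1)
    for w in weights:
        z = z0.clone().detach().requires_grad_(True)
        for _ in range(steps):
            prob = torch.sigmoid(classifier(z))
            if prob.item() > threshold:
                return z.detach()
            loss = nn.functional.binary_cross_entropy(prob, target)
            grad, = torch.autograd.grad(loss, z)
            z = (z - w * grad).detach().requires_grad_(True)   # step against the loss
    return z.detach()

z_src = torch.randn(1, latent_dim)        # stands in for the encoder output of a sentence
z_edit = fgim_edit(z_src)
print("edit norm:", (z_edit - z_src).norm().item())
```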
EmbraceNet: A robust deep learning architecture for multimodal classification
Title | EmbraceNet: A robust deep learning architecture for multimodal classification |
Authors | Jun-Ho Choi, Jong-Seok Lee |
Abstract | Classification using multimodal data arises in many machine learning applications. It is crucial not only to model cross-modal relationships effectively but also to ensure robustness against the loss of part of the data or modalities. In this paper, we propose a novel deep learning-based multimodal fusion architecture for classification tasks, which guarantees compatibility with any kind of learning model, deals with cross-modal information carefully, and prevents performance degradation due to the partial absence of data. We employ two datasets for multimodal classification tasks, build models based on our architecture and other state-of-the-art models, and analyze their performance in various situations. The results show that our architecture outperforms the other multimodal fusion architectures when some parts of the data are not available. |
Tasks | |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09078v1 |
PDF | http://arxiv.org/pdf/1904.09078v1.pdf |
PWC | https://paperswithcode.com/paper/embracenet-a-robust-deep-learning |
Repo | https://github.com/idearibosome/embracenet |
Framework | tf |
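The core of the architecture is the "embracement" step: each modality is docked to a common size, then every output dimension is filled from exactly one modality chosen by multinomial sampling, which is what lets missing modalities be handled by zeroing their selection probability. The numpy sketch below illustrates that operation only; the docking layers, sizes, and probabilities are illustrative rather than the released implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
embrace_dim = 8

def dock(x, out_dim):
    """Toy docking layer: project one modality's vector to the shared embracement size."""
    W = rng.normal(size=(out_dim, x.shape[0])) / np.sqrt(x.shape[0])
    return np.tanh(W @ x)

def embrace(docked, available):
    """For each output index, pick one available modality at random and copy its feature."""
    probs = np.asarray(available, dtype=float)
    probs = probs / probs.sum()                       # unavailable modalities get probability 0
    choice = rng.choice(len(docked), size=embrace_dim, p=probs)
    return np.array([docked[m][j] for j, m in enumerate(choice)])

audio, image = rng.normal(size=20), rng.normal(size=50)       # two toy modality vectors
docked = [dock(audio, embrace_dim), dock(image, embrace_dim)]

print("both modalities:", embrace(docked, available=[1, 1]).round(2))
print("image missing:  ", embrace(docked, available=[1, 0]).round(2))
```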
A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking
Title | A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking |
Authors | Andreas Hanselowski, Christian Stab, Claudia Schulz, Zile Li, Iryna Gurevych |
Abstract | Automated fact-checking based on machine learning is a promising approach to identify false information distributed on the web. In order to achieve satisfactory performance, machine learning methods require a large corpus with reliable annotations for the different tasks in the fact-checking process. Having analyzed existing fact-checking corpora, we found that none of them meets these criteria in full. They are either too small in size, do not provide detailed annotations, or are limited to a single domain. Motivated by this gap, we present a new substantially sized mixed-domain corpus with annotations of good quality for the core fact-checking tasks: document retrieval, evidence extraction, stance detection, and claim validation. To aid future corpus construction, we describe our methodology for corpus creation and annotation, and demonstrate that it results in substantial inter-annotator agreement. As baselines for future research, we perform experiments on our corpus with a number of model architectures that reach high performance in similar problem settings. Finally, to support the development of future models, we provide a detailed error analysis for each of the tasks. Our results show that the realistic, multi-domain setting defined by our data poses new challenges for the existing models, providing opportunities for considerable improvement by future systems. |
Tasks | Stance Detection |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1911.01214v1 |
PDF | https://arxiv.org/pdf/1911.01214v1.pdf |
PWC | https://paperswithcode.com/paper/a-richly-annotated-corpus-for-different-tasks-1 |
Repo | https://github.com/UKPLab/conll2019-snopes-crawling |
Framework | none |
A Fine-grained Sentiment Dataset for Norwegian
Title | A Fine-grained Sentiment Dataset for Norwegian |
Authors | Lilja Øvrelid, Petter Mæhlum, Jeremy Barnes, Erik Velldal |
Abstract | We here introduce NoReC_fine, a dataset for fine-grained sentiment analysis in Norwegian, annotated with respect to polar expressions, targets and holders of opinion. The underlying texts are taken from a corpus of professionally authored reviews from multiple news sources and across a wide variety of domains, including literature, games, music, products, movies and more. We here present a detailed description of this annotation effort. We provide an overview of the developed annotation guidelines, illustrated with examples, and present an analysis of inter-annotator agreement. We also report the first experimental results on the dataset, intended as a preliminary benchmark for further experiments. |
Tasks | Sentiment Analysis |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12722v1 |
PDF | https://arxiv.org/pdf/1911.12722v1.pdf |
PWC | https://paperswithcode.com/paper/a-fine-grained-sentiment-dataset-for |
Repo | https://github.com/ltgoslo/norec_fine |
Framework | none |
A PCB Dataset for Defects Detection and Classification
Title | A PCB Dataset for Defects Detection and Classification |
Authors | Weibo Huang, Peng Wei |
Abstract | To cope with the difficulties in the inspection and classification of defects in Printed Circuit Boards (PCBs), researchers have proposed many methods. However, few of them have published their datasets, which has hindered the introduction and comparison of new methods. In this paper, we publish a synthesized PCB dataset containing 1386 images with 6 kinds of defects for use in detection, classification and registration tasks. Besides, we propose a reference-based method to inspect the defects and train an end-to-end convolutional neural network to classify them. Unlike conventional approaches that require pixel-by-pixel processing, our method first locates the defects and then classifies them with a neural network, which shows superior performance on our dataset. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08204v1 |
PDF | http://arxiv.org/pdf/1901.08204v1.pdf |
PWC | https://paperswithcode.com/paper/a-pcb-dataset-for-defects-detection-and |
Repo | https://github.com/Ironbrotherstyle/PCB-DATASET |
Framework | none |
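A numpy/scipy sketch of the reference-based pipeline the abstract describes: difference the test image against a defect-free template, threshold, and return connected regions as defect candidates, which a trained CNN would then classify. The synthetic images, threshold, and minimum area are assumptions; the classification stage is left as a stub.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

# Synthetic stand-ins: a defect-free template and a test image with two injected "defects".
template = rng.random((128, 128))
test = template.copy()
test[30:36, 40:46] += 0.9          # simulated spur / short
test[90:94, 100:108] -= 0.9        # simulated open / missing copper

def locate_defects(test_img, template_img, thresh=0.4, min_area=4):
    """Reference-based localisation: threshold the absolute difference and return
    bounding boxes of connected regions with at least min_area changed pixels."""
    diff = np.abs(test_img - template_img) > thresh
    labels, _ = ndimage.label(diff)
    boxes = []
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        if (labels[sl] == i).sum() >= min_area:
            boxes.append((sl[0].start, sl[0].stop, sl[1].start, sl[1].stop))
    return boxes

for box in locate_defects(test, template):
    # A trained CNN would classify the cropped patch into one of the six defect types here.
    print("defect candidate at rows %d-%d, cols %d-%d" % box)
```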
Bayesian Optimization in Variational Latent Spaces with Dynamic Compression
Title | Bayesian Optimization in Variational Latent Spaces with Dynamic Compression |
Authors | Rika Antonova, Akshara Rai, Tianyu Li, Danica Kragic |
Abstract | Data-efficiency is crucial for autonomous robots to adapt to new tasks and environments. In this work we focus on robotics problems with a budget of only 10-20 trials. This is a very challenging setting even for data-efficient approaches like Bayesian optimization (BO), especially when optimizing higher-dimensional controllers. Simulated trajectories can be used to construct informed kernels for BO. However, previous work employed supervised ways of extracting low-dimensional features for these. We propose a model and architecture for a sequential variational autoencoder that embeds the space of simulated trajectories into a lower-dimensional space of latent paths in an unsupervised way. We further compress the search space for BO by reducing exploration in parts of the state space that are undesirable, without requiring explicit constraints on controller parameters. We validate our approach with hardware experiments on a Daisy hexapod robot and an ABB Yumi manipulator. We also present simulation experiments with further comparisons to several baselines on Daisy and two manipulators. Our experiments indicate the proposed trajectory-based kernel with dynamic compression can offer ultra data-efficient optimization. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04796v1 |
PDF | https://arxiv.org/pdf/1907.04796v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-optimization-in-variational-latent |
Repo | https://github.com/contactrika/bo-svae-dc |
Framework | pytorch |
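A compact numpy sketch of the overall loop: Bayesian optimisation of controller parameters where the GP kernel compares embeddings of simulated trajectories rather than the raw parameters. The toy simulator, the summary-statistic embedding (a crude stand-in for the paper's sequential-VAE latent paths), and the lower-confidence-bound acquisition are all assumptions, and the dynamic-compression step that prunes undesirable regions is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, T=30):
    """Toy 'robot': a damped 1-D system driven by a two-parameter feedback controller."""
    x, v, traj = 1.0, 0.0, []
    for _ in range(T):
        u = -theta[0] * x - theta[1] * v
        v += 0.1 * (u - 0.5 * v)
        x += 0.1 * v
        traj.append(x)
    return np.array(traj)

def embed(theta):
    # Stand-in for the learned latent path: cheap trajectory summary statistics.
    tr = simulate(theta)
    return np.array([tr.mean(), tr.std(), np.abs(tr[-5:]).mean()])

def cost(theta):
    return np.abs(simulate(theta)).mean()            # drive the state to zero

def rbf(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

thetas = [rng.uniform(0, 2, size=2) for _ in range(3)]          # initial random trials
for _ in range(15):                                             # small BO trial budget
    E = np.stack([embed(t) for t in thetas])
    y = np.array([cost(t) for t in thetas])
    cand = rng.uniform(0, 2, size=(256, 2))
    Ec = np.stack([embed(c) for c in cand])
    K = rbf(E, E) + 1e-6 * np.eye(len(thetas))
    Ks = rbf(Ec, E)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    lcb = mu - np.sqrt(np.clip(var, 1e-9, None))                # lower confidence bound
    thetas.append(cand[int(np.argmin(lcb))])

best = min(thetas, key=cost)
print("best controller:", best.round(2), "cost:", round(cost(best), 3))
```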
SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences
Title | SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences |
Authors | Jens Behley, Martin Garbade, Andres Milioto, Jan Quenzel, Sven Behnke, Cyrill Stachniss, Juergen Gall |
Abstract | Semantic scene understanding is important for various applications. In particular, self-driving cars need a fine-grained understanding of the surfaces and objects in their vicinity. Light detection and ranging (LiDAR) provides precise geometric information about the environment and is thus a part of the sensor suites of almost all self-driving cars. Despite the relevance of semantic scene understanding for this application, there is a lack of a large dataset for this task which is based on an automotive LiDAR. In this paper, we introduce a large dataset to propel research on laser-based semantic segmentation. We annotated all sequences of the KITTI Vision Odometry Benchmark and provide dense point-wise annotations for the complete $360^{\circ}$ field-of-view of the employed automotive LiDAR. We propose three benchmark tasks based on this dataset: (i) semantic segmentation of point clouds using a single scan, (ii) semantic segmentation using multiple past scans, and (iii) semantic scene completion, which requires anticipating the semantic scene in the future. We provide baseline experiments and show that there is a need for more sophisticated models to efficiently tackle these tasks. Our dataset opens the door for the development of more advanced methods, but also provides plentiful data to investigate new research directions. |
Tasks | 3D Semantic Segmentation, Scene Understanding, Self-Driving Cars, Semantic Segmentation |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01416v3 |
PDF | https://arxiv.org/pdf/1904.01416v3.pdf |
PWC | https://paperswithcode.com/paper/a-dataset-for-semantic-segmentation-of-point |
Repo | https://github.com/PRBonn/semantic-kitti-api |
Framework | none |
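For readers who just want to load the data: the linked semantic-kitti-api repository documents per-point labels as one 32-bit integer each, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits, alongside the standard KITTI float32 point format. The snippet below follows that documented layout; the file paths are placeholders.

```python
import numpy as np

# Placeholder paths; the layout follows the KITTI odometry structure plus a "labels" folder.
scan_file = "sequences/00/velodyne/000000.bin"
label_file = "sequences/00/labels/000000.label"

# Each point is four float32 values: x, y, z, remission.
points = np.fromfile(scan_file, dtype=np.float32).reshape(-1, 4)

# Each label is one uint32: lower 16 bits = semantic class, upper 16 bits = instance id.
labels = np.fromfile(label_file, dtype=np.uint32)
semantic = labels & 0xFFFF
instance = labels >> 16

assert len(labels) == len(points)
print("points:", points.shape, "classes present:", np.unique(semantic))
```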
k-hop Graph Neural Networks
Title | k-hop Graph Neural Networks |
Authors | Giannis Nikolentzos, George Dasoulas, Michalis Vazirgiannis |
Abstract | Graph neural networks (GNNs) have emerged recently as a powerful architecture for learning node and graph representations. Standard GNNs have the same expressive power as the Weisfeiler-Leman test of graph isomorphism in terms of distinguishing non-isomorphic graphs. However, it was recently shown that this test cannot identify fundamental graph properties such as connectivity and triangle freeness. We show that GNNs also suffer from the same limitation. To address this limitation, we propose a more expressive architecture, k-hop GNNs, which updates a node’s representation by aggregating information not only from its direct neighbors, but from its k-hop neighborhood. We show that the proposed architecture can identify fundamental graph properties. We evaluate the proposed architecture on standard node classification and graph classification datasets. Our experimental evaluation confirms our theoretical findings, since the proposed model achieves performance better than or comparable to that of standard GNNs and state-of-the-art algorithms. |
Tasks | Graph Classification, Node Classification |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06051v1 |
PDF | https://arxiv.org/pdf/1907.06051v1.pdf |
PWC | https://paperswithcode.com/paper/k-hop-graph-neural-networks |
Repo | https://github.com/giannisnik/k-hop-gnns |
Framework | pytorch |
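A numpy sketch of the central idea: update each node from its k-hop neighbourhood rather than only its direct neighbours. Reachability is computed from powers of the adjacency matrix and each node's features are concatenated with the mean over that neighbourhood. The weights, nonlinearity, and this simple mean aggregation are illustrative; the paper's model runs a more elaborate hierarchical update over the induced k-hop subgraph.

```python
import numpy as np

rng = np.random.default_rng(0)

def khop_neighbourhood(A, k):
    """Nodes reachable within k hops (excluding the node itself), as a 0/1 matrix."""
    reach = np.zeros_like(A)
    power = np.eye(A.shape[0])
    for _ in range(k):
        power = power @ A
        reach += power
    reach = (reach > 0).astype(float)
    np.fill_diagonal(reach, 0.0)
    return reach

def khop_layer(A, H, W, k=2):
    """One layer: concatenate each node's features with the mean over its k-hop
    neighbourhood, then apply a shared linear map and a ReLU."""
    reach = khop_neighbourhood(A, k)
    deg = reach.sum(1, keepdims=True).clip(min=1)
    agg = (reach @ H) / deg
    return np.maximum(np.concatenate([H, agg], axis=1) @ W, 0.0)

# A 6-node cycle graph with random 4-dimensional node features.
n, d = 6, 4
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
H = rng.normal(size=(n, d))
W = rng.normal(size=(2 * d, 8)) / np.sqrt(2 * d)

print(khop_layer(A, H, W, k=2).shape)      # (6, 8): new node representations
```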
The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification
Title | The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification |
Authors | Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer |
Abstract | Convolutional Neural Networks (CNNs) have had great success in many machine vision as well as machine audition tasks. Many image recognition network architectures have consequently been adapted for audio processing tasks. However, despite some successes, the performance of many of these did not translate from the image to the audio domain. For example, very deep architectures such as ResNet and DenseNet, which significantly outperform VGG in image recognition, do not perform better in audio processing tasks such as Acoustic Scene Classification (ASC). In this paper, we investigate the reasons why such powerful architectures perform worse in ASC compared to simpler models (e.g., VGG). To this end, we analyse the receptive field (RF) of these CNNs and demonstrate the importance of the RF to the generalization capability of the models. Using our receptive field analysis, we adapt both ResNet and DenseNet, achieving state-of-the-art performance and eventually outperforming the VGG-based models. We introduce systematic ways of adapting the RF in CNNs, and present results on three data sets that show how changing the RF over the time and frequency dimensions affects a model’s performance. Our experimental results show that very small or very large RFs can cause performance degradation, but deep models can be made to generalize well by carefully choosing an appropriate RF size within a certain range. |
Tasks | Acoustic Scene Classification, Scene Classification |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01803v1 |
PDF | https://arxiv.org/pdf/1907.01803v1.pdf |
PWC | https://paperswithcode.com/paper/the-receptive-field-as-a-regularizer-in-deep |
Repo | https://github.com/kkoutini/cpjku_dcase19 |
Framework | pytorch |
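The quantity the paper adapts is the receptive field of the convolutional stack. The helper below computes it along one axis with the standard recursion RF_l = RF_{l-1} + (k_l - 1) * jump_{l-1}, where the jump multiplies by each layer's stride; the two example stacks are illustrative, not the paper's architectures.

```python
def receptive_field(layers):
    """Receptive field of a 1-D stack of conv/pool layers.

    layers: list of (kernel_size, stride) tuples in input-to-output order.
    'jump' is the spacing of adjacent output positions, measured in input samples.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Illustrative VGG-like and deeper ResNet-like stacks along one axis (e.g. frequency).
vgg_like = [(3, 1), (3, 1), (2, 2)] * 3
resnet_like = [(7, 2), (3, 2)] + [(3, 1)] * 8

print("VGG-like receptive field:   ", receptive_field(vgg_like))     # 36
print("ResNet-like receptive field:", receptive_field(resnet_like))  # 75
```

The paper's argument is that the much larger receptive fields of deep ResNet/DenseNet variants hurt generalisation on ASC spectrograms, and that constraining the RF (via kernel sizes and strides, which a helper like this lets you check) recovers and eventually surpasses VGG-level performance.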
Unsupervised Adversarial Domain Adaptation Based On The Wasserstein Distance For Acoustic Scene Classification
Title | Unsupervised Adversarial Domain Adaptation Based On The Wasserstein Distance For Acoustic Scene Classification |
Authors | Konstantinos Drossos, Paul Magron, Tuomas Virtanen |
Abstract | A challenging problem in the field of deep learning-based machine listening is the degradation of performance when using data from unseen conditions. In this paper we focus on the acoustic scene classification (ASC) task and propose an adversarial deep learning method that allows adapting an acoustic scene classification system to deal with a new acoustic channel resulting from data captured with a different recording device. We build upon the theoretical model of the $H\Delta H$-distance and a previous adversarial discriminative deep learning method for ASC unsupervised domain adaptation, and we present an adversarial training-based method using the Wasserstein distance. We improve the state-of-the-art mean accuracy on the data from the unseen conditions from 32% to 45%, using the TUT Acoustic Scenes dataset. |
Tasks | Acoustic Scene Classification, Domain Adaptation, Scene Classification, Unsupervised Domain Adaptation |
Published | 2019-04-24 |
URL | https://arxiv.org/abs/1904.10678v2 |
PDF | https://arxiv.org/pdf/1904.10678v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-adversarial-domain-adaptation-1 |
Repo | https://github.com/dr-costas/undaw |
Framework | pytorch |
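A hedged PyTorch sketch of the adversarial adaptation step: a critic estimates the Wasserstein distance between source-encoded and target-encoded features, and the target feature extractor is updated to shrink it. Weight clipping is used here as one common way to keep the critic roughly 1-Lipschitz; the encoders, sizes, data, and the omitted classification branch are toy assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
in_dim, feat_dim = 128, 32

source_encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())   # kept fixed (pre-trained on source)
target_encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())   # adapted to the new device
critic = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))
for p in source_encoder.parameters():
    p.requires_grad_(False)

opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
opt_t = torch.optim.RMSprop(target_encoder.parameters(), lr=5e-5)

def batch(n=64):
    return torch.randn(n, in_dim)        # stand-in for log-mel spectrogram features

for step in range(200):
    xs, xt = batch(), batch()
    # 1) Critic: maximise E[critic(source)] - E[critic(target)], the Wasserstein estimate.
    for _ in range(5):
        w_dist = critic(source_encoder(xs)).mean() - critic(target_encoder(xt).detach()).mean()
        opt_c.zero_grad()
        (-w_dist).backward()
        opt_c.step()
        for p in critic.parameters():    # weight clipping keeps the critic roughly 1-Lipschitz
            p.data.clamp_(-0.01, 0.01)
    # 2) Target encoder: minimise the estimated distance so target features match source ones.
    w_dist = critic(source_encoder(xs)).mean() - critic(target_encoder(xt)).mean()
    opt_t.zero_grad()
    w_dist.backward()
    opt_t.step()

print("final Wasserstein estimate:", round(w_dist.item(), 4))
```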