January 29, 2020

Paper Group ANR 706

Constructing the Matrix Multilayer Perceptron and its Application to the VAE

Title Constructing the Matrix Multilayer Perceptron and its Application to the VAE
Authors Jalil Taghia, Maria Bånkestad, Fredrik Lindsten, Thomas B. Schön
Abstract Like most learning algorithms, the multilayer perceptron (MLP) is designed to learn a vector of parameters from data. However, in certain scenarios we are interested in learning structured parameters (predictions) in the form of symmetric positive definite matrices. Here, we introduce a variant of the MLP, referred to as the matrix MLP, that is specialized for learning symmetric positive definite matrices. We also present an application of the model within the context of the variational autoencoder (VAE). Our formulation of the VAE extends the vanilla formulation to cases where the recognition and the generative networks can be from the parametric family of distributions with dense covariance matrices. Two specific examples are discussed in more detail: the dense covariance Gaussian and its generalization, the power exponential distribution. Our new developments are illustrated using both synthetic and real data.
Tasks
Published 2019-02-04
URL http://arxiv.org/abs/1902.01182v1
PDF http://arxiv.org/pdf/1902.01182v1.pdf
PWC https://paperswithcode.com/paper/constructing-the-matrix-multilayer-perceptron
Repo
Framework
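
To make the idea of a network output that is guaranteed to be symmetric positive definite (SPD) concrete, here is a minimal sketch. It is not the matrix MLP construction from the paper; it uses a common alternative, a Cholesky-style parameterization, in which an unconstrained vector fills a lower-triangular factor whose diagonal is made positive. The function name `vector_to_spd` and the parameter `eps` are assumptions introduced for illustration.

```python
import numpy as np

def vector_to_spd(params, dim, eps=1e-6):
    """Map an unconstrained vector of length dim*(dim+1)/2 to an SPD matrix.

    Illustrative Cholesky-style parameterization (an assumption, not the
    paper's method): fill a lower-triangular factor L, force its diagonal
    to be positive with a softplus, and return L @ L.T.
    """
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = params             # fill the lower triangle row by row
    diag = np.diag_indices(dim)
    L[diag] = np.log1p(np.exp(L[diag])) + eps    # softplus keeps the diagonal > 0
    return L @ L.T                               # SPD because L is invertible

raw = np.array([0.3, -1.2, 0.5, 2.0, 0.1, -0.7])  # e.g. the raw output of an ordinary MLP
cov = vector_to_spd(raw, dim=3)
```

The resulting `cov` can serve as a dense covariance matrix. The abstract's point is that a dedicated architecture can learn such matrices directly, rather than through a post-hoc mapping like this one.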

Scalable Deep Unsupervised Clustering with Concrete GMVAEs

Title Scalable Deep Unsupervised Clustering with Concrete GMVAEs
Authors Mark Collier, Hector Urdiales
Abstract Discrete random variables are natural components of probabilistic clustering models. A number of VAE variants with discrete latent variables have been developed. Training such methods requires marginalizing over the discrete latent variables, causing training time complexity to be linear in the number of clusters. By applying a continuous relaxation to the discrete variables in these methods, we can reduce the training time complexity to be constant in the number of clusters used. We demonstrate that in practice for one such method, the Gaussian Mixture VAE, the use of a continuous relaxation has no negative effect on the quality of the clustering but provides a substantial reduction in training time, reducing training time on CIFAR-100 with 20 clusters from 47 hours to less than 6 hours.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08994v1
PDF https://arxiv.org/pdf/1909.08994v1.pdf
PWC https://paperswithcode.com/paper/scalable-deep-unsupervised-clustering-with
Repo
Framework
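
The continuous relaxation mentioned in the abstract can be sketched as follows. Instead of summing a loss over every cluster, one draws a single relaxed one-hot sample from a Concrete (Gumbel-Softmax) distribution and evaluates the loss once. This is a minimal NumPy sketch of the sampling step only, not the authors' training code; `sample_concrete` and its arguments are hypothetical names.

```python
import numpy as np

def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot sample from a Concrete (Gumbel-Softmax) distribution."""
    # Gumbel(0, 1) noise; the lower bound keeps log(u) finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Tempered softmax over perturbed logits, computed stably.
    z = (logits + gumbel) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = np.log(np.array([0.7, 0.2, 0.1]))   # cluster probabilities for one data point
relaxed = sample_concrete(logits, temperature=0.5, rng=rng)
```

Lower temperatures push `relaxed` closer to a one-hot vector. The cost of one sample does not depend on how many clusters would otherwise have to be marginalized over, which is the source of the constant-time claim.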

A Test Suite and Manual Evaluation of Document-Level NMT at WMT19

Title A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
Authors Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar
Abstract As the quality of machine translation rises and neural machine translation (NMT) is moving from sentence to document level translations, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing discourse phenomena of MT systems participating in the News Translation Task. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation.
Tasks Machine Translation
Published 2019-08-08
URL https://arxiv.org/abs/1908.03043v1
PDF https://arxiv.org/pdf/1908.03043v1.pdf
PWC https://paperswithcode.com/paper/a-test-suite-and-manual-evaluation-of
Repo
Framework

Teaching Responsible Data Science: Charting New Pedagogical Territory

Title Teaching Responsible Data Science: Charting New Pedagogical Territory
Authors Julia Stoyanovich, Armanda Lewis
Abstract Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the “real” material. To develop instructional materials and methodologies that are thoughtful and engaging, we must strive for balance: between texts and coding, between critique and solution, and between cutting-edge research and practical applicability. Finding such balance is particularly difficult in the nascent field of responsible data science (RDS), where we are only starting to understand how to interface between the intrinsically different methodologies of engineering and social sciences. In this paper we recount a recent experience in developing and teaching an RDS course to graduate and advanced undergraduate students in data science. We then dive into an area that is critically important to RDS – transparency and interpretability of machine-assisted decision-making, and tie this area to the needs of emerging RDS curricula. Recounting our own experience, and leveraging literature on pedagogical methods in data science and beyond, we propose the notion of an “object-to-interpret-with”. We link this notion to “nutritional labels” – a family of interpretability tools that are gaining popularity in RDS research and practice. With this work we aim to contribute to the nascent area of RDS education, and to inspire others in the community to come together to develop a deeper theoretical understanding of the pedagogical needs of RDS, and contribute concrete educational materials and methodologies that others can use. All course materials are publicly available at https://dataresponsibly.github.io/courses.
Tasks Decision Making
Published 2019-12-23
URL https://arxiv.org/abs/1912.10564v1
PDF https://arxiv.org/pdf/1912.10564v1.pdf
PWC https://paperswithcode.com/paper/teaching-responsible-data-science-charting
Repo
Framework

Using Neural Networks for Relation Extraction from Biomedical Literature

Title Using Neural Networks for Relation Extraction from Biomedical Literature
Authors Diana Sousa, Andre Lamurias, Francisco M. Couto
Abstract Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, namely using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead us to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to enhance previous state-of-the-art results.
Tasks Relation Extraction
Published 2019-05-27
URL https://arxiv.org/abs/1905.11391v1
PDF https://arxiv.org/pdf/1905.11391v1.pdf
PWC https://paperswithcode.com/paper/using-neural-networks-for-relation-extraction
Repo
Framework

Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA

Title Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
Authors Yichen Jiang, Mohit Bansal
Abstract Multi-hop question answering requires a model to connect multiple pieces of evidence scattered in a long context to answer the question. In this paper, we show that in the multi-hop HotpotQA (Yang et al., 2018) dataset, the examples often contain reasoning shortcuts through which models can directly locate the answer by word-matching the question with a sentence in the context. We demonstrate this issue by constructing adversarial documents that create contradicting answers to the shortcut but do not affect the validity of the original answer. The performance of strong baseline models drops significantly on our adversarial evaluation, indicating that they are indeed exploiting the shortcuts rather than performing multi-hop reasoning. After adversarial training, the baseline’s performance improves but is still limited on the adversarial evaluation. Hence, we use a control unit that dynamically attends to the question at different reasoning hops to guide the model’s multi-hop reasoning. We show that this 2-hop model trained on the regular data is more robust to the adversaries than the baseline model. After adversarial training, this 2-hop model not only achieves improvements over its counterpart trained on regular data, but also outperforms the adversarially-trained 1-hop baseline. We hope that these insights and initial improvements will motivate the development of new models that combine explicit compositional reasoning with adversarial training.
Tasks Question Answering
Published 2019-06-17
URL https://arxiv.org/abs/1906.07132v1
PDF https://arxiv.org/pdf/1906.07132v1.pdf
PWC https://paperswithcode.com/paper/avoiding-reasoning-shortcuts-adversarial
Repo
Framework

Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games

Title Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games
Authors Andrea Censi, Saverio Bolognani, Julian G. Zilly, Shima Sadat Mousavi, Emilio Frazzoli
Abstract We present a new type of coordination mechanism among multiple agents for the allocation of a finite resource, such as the allocation of time slots for passing an intersection. We consider the setting where we associate one counter to each agent, which we call karma value, and where there is an established mechanism to decide resource allocation based on agents exchanging karma. The idea is that agents might be inclined to pass on using resources today, in exchange for karma, which will make it easier for them to claim the resource use in the future. To understand whether such a system might work robustly, we only design the protocol and not the agents’ policies. We take a game-theoretic perspective and compute policies corresponding to Nash equilibria for the game. We find, surprisingly, that the Nash equilibria for a society of self-interested agents are very close in social welfare to a centralized cooperative solution. These results suggest that many resource allocation problems can have a simple, elegant, and robust solution, assuming the availability of a karma accounting mechanism.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09198v1
PDF https://arxiv.org/pdf/1907.09198v1.pdf
PWC https://paperswithcode.com/paper/today-me-tomorrow-thee-efficient-resource
Repo
Framework

Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans

Title Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans
Authors Felix Denzinger, Michael Wels, Katharina Breininger, Anika Reidelshöfer, Joachim Eckert, Michael Sühling, Axel Schmermund, Andreas Maier
Abstract Analysing coronary artery plaque segments with respect to their functional significance, and therefore their influence on patient management, in a non-invasive setup is an important subject of current research. In this work we compare and improve three deep learning algorithms for this task: a 3D recurrent convolutional neural network (RCNN), a 2D multi-view ensemble approach based on texture analysis, and a newly proposed 2.5D approach. Current state-of-the-art methods utilising fluid-dynamics-based fractional flow reserve (FFR) simulation reach an AUC of up to 0.93 for the task of predicting an abnormal invasive FFR value. For the comparable task of predicting the revascularisation decision, we are able to improve the performance in terms of AUC of both existing approaches with the proposed modifications, specifically from 0.80 to 0.90 for the 3D-RCNN, and from 0.85 to 0.90 for the multi-view texture-based ensemble. The newly proposed 2.5D approach achieves comparable results with an AUC of 0.90.
Tasks Texture Classification
Published 2019-12-13
URL https://arxiv.org/abs/1912.06417v1
PDF https://arxiv.org/pdf/1912.06417v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-algorithms-for-coronary-artery
Repo
Framework

Generalization in Deep Networks: The Role of Distance from Initialization

Title Generalization in Deep Networks: The Role of Distance from Initialization
Authors Vaishnavh Nagarajan, J. Zico Kolter
Abstract Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network? To answer this question, we advocate a notion of effective model capacity that is dependent on a given random initialization of the network and not just the training algorithm and the data distribution. We provide empirical evidence demonstrating that the model capacity of SGD-trained deep networks is in fact restricted through implicit regularization of the ℓ2 distance from the initialization. We also provide theoretical arguments that further highlight the need for initialization-dependent notions of model capacity. We leave as open questions how and why distance from initialization is regularized, and whether it is sufficient to explain generalization.
Tasks
Published 2019-01-07
URL http://arxiv.org/abs/1901.01672v2
PDF http://arxiv.org/pdf/1901.01672v2.pdf
PWC https://paperswithcode.com/paper/generalization-in-deep-networks-the-role-of
Repo
Framework
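
The quantity at the centre of this abstract, the ℓ2 distance between the trained parameters and their random initialization, is simple to compute. The sketch below assumes the parameters are available as lists of NumPy arrays, one per layer; the function name and the toy values are illustrative and not taken from the paper.

```python
import numpy as np

def l2_distance_from_init(init_params, trained_params):
    """Return the l2 norm of (trained - init), taken over all layers jointly."""
    squared = sum(
        float(np.sum((w - w0) ** 2))
        for w0, w in zip(init_params, trained_params)
    )
    return float(np.sqrt(squared))

# Toy example: one 2x2 weight matrix and one bias vector.
init = [np.zeros((2, 2)), np.zeros(2)]
trained = [np.array([[3.0, 0.0], [0.0, 0.0]]), np.array([4.0, 0.0])]
distance = l2_distance_from_init(init, trained)   # sqrt(3**2 + 4**2) = 5.0
```

Tracking this value during training, for networks of different widths, is one way to examine the kind of implicit regularization the abstract describes.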

Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge

Title Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge
Authors Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan “Honza” Černocký
Abstract In this paper, we present the system description of the joint efforts of Brno University of Technology (BUT) and Omilia – Conversational Intelligence for the ASVSpoof2019 Spoofing and Countermeasures Challenge. The primary submission for Physical access (PA) is a fusion of two VGG networks, trained on single- and two-channel features. For Logical access (LA), our primary system is a fusion of VGG and the recently introduced SincNet architecture. The results on PA show that the proposed networks yield very competitive performance in all conditions and achieve an 86% relative improvement compared to the official baseline. On the other hand, the results on LA show that although the proposed architecture and training strategy perform very well on certain spoofing attacks, they fail to generalize to certain attacks that are unseen during training.
Tasks
Published 2019-07-13
URL https://arxiv.org/abs/1907.12908v1
PDF https://arxiv.org/pdf/1907.12908v1.pdf
PWC https://paperswithcode.com/paper/detecting-spoofing-attacks-using-vgg-and
Repo
Framework

Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform

Title Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform
Authors Matthias Mehlhose, Daniyal Amir Awany, Renato L. G. Cavalcante, Martin Kurras, Slawomir Stanczak
Abstract Conventional multiuser detection techniques either require a large number of antennas at the receiver for a desired performance, or they are too complex for practical implementation. Moreover, many of these techniques, such as successive interference cancellation (SIC), suffer from errors in parameter estimation (user channels, covariance matrix, noise variance, etc.) that is performed before detection of user data symbols. As an alternative to conventional methods, this paper proposes and demonstrates a low-complexity practical Machine Learning (ML) based receiver that achieves similar (and at times better) performance to the SIC receiver. The proposed receiver does not require parameter estimation; instead it uses supervised learning to detect the user modulation symbols directly. We perform comparisons with minimum mean square error (MMSE) and SIC receivers in terms of symbol error rate (SER) and complexity.
Tasks
Published 2019-11-11
URL https://arxiv.org/abs/1911.04291v1
PDF https://arxiv.org/pdf/1911.04291v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-based-adaptive-receive
Repo
Framework

Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation

Title Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation
Authors Jonathan Pan
Abstract Surveillance cameras, which are a form of Cyber Physical System, are deployed extensively to provide visual surveillance monitoring of activities of interest or anomalies. However, these cameras are at risk of physical security attacks against their physical attributes or configuration, such as tampering with their recording coverage, camera positions, or recording configurations like focus and zoom factors. Such adversarial alteration of physical configuration could also be invoked through cyber security attacks against the camera’s software vulnerabilities to administratively change the camera’s physical configuration settings. When such Cyber Physical attacks occur, they affect the integrity of the targeted cameras, which in turn renders these cameras ineffective in fulfilling their intended security functions. There is a significant body of research on mechanisms for detecting cyber-attacks against these Cyber Physical devices; however, mechanisms for detecting integrity attacks on physical configuration remain understudied. This research proposes the novel use of deep learning algorithms to detect such physical attacks originating from cyber or physical spaces. Additionally, we propose the novel use of deep learning-based video frame interpolation for such detection, which performs comparatively better than other anomaly detectors in spatiotemporal environments.
Tasks Video Frame Interpolation
Published 2019-06-15
URL https://arxiv.org/abs/1906.06475v1
PDF https://arxiv.org/pdf/1906.06475v1.pdf
PWC https://paperswithcode.com/paper/physical-integrity-attack-detection-of
Repo
Framework

High dimensional regression for regenerative time-series: an application to road traffic modeling

Title High dimensional regression for regenerative time-series: an application to road traffic modeling
Authors Mohammed Bouchouia, François Portier
Abstract This paper investigates statistical models for road traffic modeling. The proposed methodology considers road traffic as (i) a high-dimensional time-series for which (ii) regeneration occurs at the end of each day. Because of (ii), prediction is based on a daily modeling of the road traffic using a vector autoregressive model that linearly combines the past observations of the day. Because of (i), the learning algorithm follows from an l1-penalization of the regression coefficients. Excess risk bounds are established under the high-dimensional framework in which the number of road sections goes to infinity with the number of observed days. Considering floating car data observed in an urban area, the approach is compared to state-of-the-art methods including neural networks. In addition to being very competitive in terms of prediction, it makes it possible to identify the most determinant sections of the road network.
Tasks Time Series
Published 2019-10-24
URL https://arxiv.org/abs/1910.11095v3
PDF https://arxiv.org/pdf/1910.11095v3.pdf
PWC https://paperswithcode.com/paper/high-dimensional-regression-for-regenerative
Repo
Framework
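
The l1-penalized regression mentioned in the abstract can be illustrated with a generic proximal-gradient (ISTA) solver for a multi-output lasso. This is a sketch under stated assumptions, not the authors' estimator: the data are synthetic, rows of `X` stand in for past observations and columns of `Y` for the road sections being predicted, and all names (`lasso_ista`, `lam`, `A_true`) are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (elementwise shrinkage toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Minimise (1/2n) * ||Y - X A||_F^2 + lam * ||A||_1 by proximal gradient."""
    n = X.shape[0]
    step = n / (np.linalg.norm(X, 2) ** 2)    # 1 / Lipschitz constant of the gradient
    A = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ A - Y) / n          # gradient of the squared-error term
        A = soft_threshold(A - step * grad, step * lam)
    return A

# Synthetic stand-in for traffic data: 200 samples, 5 inputs, 2 outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
A_true = np.zeros((5, 2))
A_true[0, 0] = 2.0                            # only two inputs actually matter
A_true[3, 1] = -1.5
Y = X @ A_true + 0.01 * rng.normal(size=(200, 2))
A_hat = lasso_ista(X, Y, lam=0.1)
```

The penalty shrinks coefficients toward zero, so the large entries of `A_hat` indicate which inputs drive each output. That property is what allows the approach in the abstract to single out the most determinant road sections.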

Active Generative Adversarial Network for Image Classification

Title Active Generative Adversarial Network for Image Classification
Authors Quan Kong, Bin Tong, Martin Klinkigt, Yuki Watanabe, Naoto Akira, Tomokazu Murakami
Abstract Sufficient supervised information is crucial for any machine learning model to boost performance. However, labeled data is expensive and sometimes difficult to obtain. Active learning is an approach to acquire annotations for data from a human oracle by selecting informative samples with a high probability to enhance performance. In recent emerging studies, a generative adversarial network (GAN) has been integrated with active learning to generate good candidates to be presented to the oracle. In this paper, we propose a novel model that is able to obtain labels for data in a cheaper manner without the need to query an oracle. In the model, a novel reward for each sample is devised to measure the degree of uncertainty, which is obtained from a classifier trained with existing labeled data. This reward is used to guide a conditional GAN to generate informative samples with a higher probability for a certain label. With extensive evaluations, we have confirmed the effectiveness of the model, showing that the generated samples are capable of improving the classification performance in popular image classification tasks.
Tasks Active Learning, Image Classification
Published 2019-06-17
URL https://arxiv.org/abs/1906.07133v1
PDF https://arxiv.org/pdf/1906.07133v1.pdf
PWC https://paperswithcode.com/paper/active-generative-adversarial-network-for
Repo
Framework

Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models

Title Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models
Authors Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber
Abstract To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM modeling approaches using simulation experiments. These approaches extend previously proposed frameworks, including constraints and informed prior-based methods. Users should have a sense of control in HLTM systems, so we propose a control metric to measure whether refinement operations’ results match users’ expectations. Informed prior-based methods provide better control than constraints, but constraints yield higher quality topics.
Tasks Topic Models
Published 2019-05-23
URL https://arxiv.org/abs/1905.09864v2
PDF https://arxiv.org/pdf/1905.09864v2.pdf
PWC https://paperswithcode.com/paper/why-didnt-you-listen-to-me-comparing-user
Repo
Framework