January 29, 2020

Paper Group ANR 706

Constructing the Matrix Multilayer Perceptron and its Application to the VAE. Scalable Deep Unsupervised Clustering with Concrete GMVAEs. A Test Suite and Manual Evaluation of Document-Level NMT at WMT19. Teaching Responsible Data Science: Charting New Pedagogical Territory. Using Neural Networks for Relation Extraction from Biomedical Literature. …

Constructing the Matrix Multilayer Perceptron and its Application to the VAE


Title	Constructing the Matrix Multilayer Perceptron and its Application to the VAE
Authors	Jalil Taghia, Maria Bånkestad, Fredrik Lindsten, Thomas B. Schön
Abstract	Like most learning algorithms, the multilayer perceptron (MLP) is designed to learn a vector of parameters from data. However, in certain scenarios we are interested in learning structured parameters (predictions) in the form of symmetric positive definite matrices. Here, we introduce a variant of the MLP, referred to as the matrix MLP, that is specialized for learning symmetric positive definite matrices. We also present an application of the model within the context of the variational autoencoder (VAE). Our formulation of the VAE extends the vanilla formulation to the cases where the recognition and the generative networks can be from the parametric family of distributions with dense covariance matrices. Two specific examples are discussed in more detail: the dense covariance Gaussian and its generalization, the power exponential distribution. Our new developments are illustrated using both synthetic and real data.
Tasks
Published	2019-02-04
URL	http://arxiv.org/abs/1902.01182v1
PDF	http://arxiv.org/pdf/1902.01182v1.pdf
PWC	https://paperswithcode.com/paper/constructing-the-matrix-multilayer-perceptron
Repo
Framework
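
The abstract does not spell out how the matrix MLP produces its output, so the following is only a sketch of a different, commonly used way to obtain a symmetric positive definite (SPD) matrix from an unconstrained vector: a Cholesky-style parametrization. It is not the paper's construction. The function name `vector_to_spd` and the `eps` jitter are assumptions made for illustration.

```python
import numpy as np

def vector_to_spd(theta, d, eps=1e-6):
    """Map an unconstrained vector of length d*(d+1)//2 to a d x d SPD matrix.

    Illustrative Cholesky-style parametrization (an assumption, not the
    matrix MLP from the paper).
    """
    L = np.zeros((d, d))
    # Fill the lower triangle (including the diagonal) with the raw values.
    L[np.tril_indices(d)] = theta
    # Softplus keeps the diagonal strictly positive, so L is invertible.
    diag = np.diag_indices(d)
    L[diag] = np.log1p(np.exp(L[diag])) + eps
    # L @ L.T is symmetric positive definite whenever L is invertible.
    return L @ L.T

rng = np.random.default_rng(0)
d = 3
theta = rng.normal(size=d * (d + 1) // 2)  # e.g. the raw output of a network
sigma = vector_to_spd(theta, d)
```

A dense covariance of this kind could be used wherever a diagonal covariance would otherwise appear, for example in a Gaussian recognition network.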

Scalable Deep Unsupervised Clustering with Concrete GMVAEs


Title	Scalable Deep Unsupervised Clustering with Concrete GMVAEs
Authors	Mark Collier, Hector Urdiales
Abstract	Discrete random variables are natural components of probabilistic clustering models. A number of VAE variants with discrete latent variables have been developed. Training such methods requires marginalizing over the discrete latent variables, causing training time complexity to be linear in the number of clusters. By applying a continuous relaxation to the discrete variables in these methods, we can reduce the training time complexity to be constant in the number of clusters used. We demonstrate that in practice for one such method, the Gaussian Mixture VAE, the use of a continuous relaxation has no negative effect on the quality of the clustering but provides a substantial reduction in training time, reducing training time on CIFAR-100 with 20 clusters from 47 hours to less than 6 hours.
Tasks
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08994v1
PDF	https://arxiv.org/pdf/1909.08994v1.pdf
PWC	https://paperswithcode.com/paper/scalable-deep-unsupervised-clustering-with
Repo
Framework
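
To make the idea of a continuous relaxation concrete, here is a minimal sketch of sampling from a Concrete (Gumbel-softmax) distribution over cluster assignments. Instead of evaluating the model once per cluster to marginalize, a single relaxed sample is drawn per data point. The function name `sample_concrete`, the temperature value, and the toy logits are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot sample per row of `logits`."""
    # Gumbel(0, 1) noise; the lower bound avoids log(0).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Tempered softmax of the perturbed logits (shifted for stability).
    y = (logits + gumbel) / temperature
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_points, n_clusters = 4, 20
logits = rng.normal(size=(n_points, n_clusters))  # hypothetical encoder output
z = sample_concrete(logits, temperature=0.5, rng=rng)
```

Each row of `z` lies on the probability simplex; lowering the temperature pushes the samples closer to one-hot vectors.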

A Test Suite and Manual Evaluation of Document-Level NMT at WMT19


Title	A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
Authors	Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar
Abstract	As the quality of machine translation rises and neural machine translation (NMT) is moving from sentence to document level translations, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing discourse phenomena of MT systems participating in the News Translation Task. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation.
Tasks	Machine Translation
Published	2019-08-08
URL	https://arxiv.org/abs/1908.03043v1
PDF	https://arxiv.org/pdf/1908.03043v1.pdf
PWC	https://paperswithcode.com/paper/a-test-suite-and-manual-evaluation-of
Repo
Framework

Teaching Responsible Data Science: Charting New Pedagogical Territory


Title	Teaching Responsible Data Science: Charting New Pedagogical Territory
Authors	Julia Stoyanovich, Armanda Lewis
Abstract	Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the “real” material. To develop instructional materials and methodologies that are thoughtful and engaging, we must strive for balance: between texts and coding, between critique and solution, and between cutting-edge research and practical applicability. Finding such balance is particularly difficult in the nascent field of responsible data science (RDS), where we are only starting to understand how to interface between the intrinsically different methodologies of engineering and social sciences. In this paper we recount a recent experience in developing and teaching an RDS course to graduate and advanced undergraduate students in data science. We then dive into an area that is critically important to RDS – transparency and interpretability of machine-assisted decision-making, and tie this area to the needs of emerging RDS curricula. Recounting our own experience, and leveraging literature on pedagogical methods in data science and beyond, we propose the notion of an “object-to-interpret-with”. We link this notion to “nutritional labels” – a family of interpretability tools that are gaining popularity in RDS research and practice. With this work we aim to contribute to the nascent area of RDS education, and to inspire others in the community to come together to develop a deeper theoretical understanding of the pedagogical needs of RDS, and contribute concrete educational materials and methodologies that others can use. All course materials are publicly available at https://dataresponsibly.github.io/courses.
Tasks	Decision Making
Published	2019-12-23
URL	https://arxiv.org/abs/1912.10564v1
PDF	https://arxiv.org/pdf/1912.10564v1.pdf
PWC	https://paperswithcode.com/paper/teaching-responsible-data-science-charting
Repo
Framework

Using Neural Networks for Relation Extraction from Biomedical Literature


Title	Using Neural Networks for Relation Extraction from Biomedical Literature
Authors	Diana Sousa, Andre Lamurias, Francisco M. Couto
Abstract	Using different sources of information to support automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, notably those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead us to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to improve on previous state-of-the-art results.
Tasks	Relation Extraction
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11391v1
PDF	https://arxiv.org/pdf/1905.11391v1.pdf
PWC	https://paperswithcode.com/paper/using-neural-networks-for-relation-extraction
Repo
Framework

Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA


Title	Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
Authors	Yichen Jiang, Mohit Bansal
Abstract	Multi-hop question answering requires a model to connect multiple pieces of evidence scattered in a long context to answer the question. In this paper, we show that in the multi-hop HotpotQA (Yang et al., 2018) dataset, the examples often contain reasoning shortcuts through which models can directly locate the answer by word-matching the question with a sentence in the context. We demonstrate this issue by constructing adversarial documents that create contradicting answers to the shortcut but do not affect the validity of the original answer. The performance of strong baseline models drops significantly on our adversarial evaluation, indicating that they are indeed exploiting the shortcuts rather than performing multi-hop reasoning. After adversarial training, the baseline’s performance improves but is still limited on the adversarial evaluation. Hence, we use a control unit that dynamically attends to the question at different reasoning hops to guide the model’s multi-hop reasoning. We show that this 2-hop model trained on the regular data is more robust to the adversaries than the baseline model. After adversarial training, this 2-hop model not only achieves improvements over its counterpart trained on regular data, but also outperforms the adversarially-trained 1-hop baseline. We hope that these insights and initial improvements will motivate the development of new models that combine explicit compositional reasoning with adversarial training.
Tasks	Question Answering
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07132v1
PDF	https://arxiv.org/pdf/1906.07132v1.pdf
PWC	https://paperswithcode.com/paper/avoiding-reasoning-shortcuts-adversarial
Repo
Framework

Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games


Title	Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games
Authors	Andrea Censi, Saverio Bolognani, Julian G. Zilly, Shima Sadat Mousavi, Emilio Frazzoli
Abstract	We present a new type of coordination mechanism among multiple agents for the allocation of a finite resource, such as the allocation of time slots for passing an intersection. We consider the setting where we associate one counter to each agent, which we call karma value, and where there is an established mechanism to decide resource allocation based on agents exchanging karma. The idea is that agents might be inclined to pass on using resources today, in exchange for karma, which will make it easier for them to claim the resource use in the future. To understand whether such a system might work robustly, we only design the protocol and not the agents’ policies. We take a game-theoretic perspective and compute policies corresponding to Nash equilibria for the game. We find, surprisingly, that the Nash equilibria for a society of self-interested agents are very close in social welfare to a centralized cooperative solution. These results suggest that many resource allocation problems can have a simple, elegant, and robust solution, assuming the availability of a karma accounting mechanism.
Tasks
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09198v1
PDF	https://arxiv.org/pdf/1907.09198v1.pdf
PWC	https://paperswithcode.com/paper/today-me-tomorrow-thee-efficient-resource
Repo
Framework

Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans


Title	Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans
Authors	Felix Denzinger, Michael Wels, Katharina Breininger, Anika Reidelshöfer, Joachim Eckert, Michael Sühling, Axel Schmermund, Andreas Maier
Abstract	Analysing coronary artery plaque segments with respect to their functional significance, and therefore their influence on patient management in a non-invasive setup, is an important subject of current research. In this work we compare and improve three deep learning algorithms for this task: a 3D recurrent convolutional neural network (RCNN), a 2D multi-view ensemble approach based on texture analysis, and a newly proposed 2.5D approach. Current state-of-the-art methods utilising fluid dynamics based fractional flow reserve (FFR) simulation reach an AUC of up to 0.93 for the task of predicting an abnormal invasive FFR value. For the comparable task of predicting revascularisation decision, we are able to improve the performance in terms of AUC of both existing approaches with the proposed modifications, specifically from 0.80 to 0.90 for the 3D-RCNN, and from 0.85 to 0.90 for the multi-view texture-based ensemble. The newly proposed 2.5D approach achieves comparable results with an AUC of 0.90.
Tasks	Texture Classification
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06417v1
PDF	https://arxiv.org/pdf/1912.06417v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-algorithms-for-coronary-artery
Repo
Framework

Generalization in Deep Networks: The Role of Distance from Initialization


Title	Generalization in Deep Networks: The Role of Distance from Initialization
Authors	Vaishnavh Nagarajan, J. Zico Kolter
Abstract	Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network? To answer this question, we advocate a notion of effective model capacity that is dependent on a given random initialization of the network and not just the training algorithm and the data distribution. We provide empirical evidence demonstrating that the model capacity of SGD-trained deep networks is in fact restricted through implicit regularization of the $\ell_2$ distance from the initialization. We also provide theoretical arguments that further highlight the need for initialization-dependent notions of model capacity. We leave as open questions how and why distance from initialization is regularized, and whether it is sufficient to explain generalization.
Tasks
Published	2019-01-07
URL	http://arxiv.org/abs/1901.01672v2
PDF	http://arxiv.org/pdf/1901.01672v2.pdf
PWC	https://paperswithcode.com/paper/generalization-in-deep-networks-the-role-of
Repo
Framework
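
The quantity at the centre of this abstract, the $\ell_2$ distance between the trained parameters and their initial values, is simple to compute. The sketch below assumes the parameters are available as lists of NumPy arrays (one array per layer); the function name and the toy values are illustrative assumptions.

```python
import numpy as np

def l2_distance_from_init(params, init_params):
    """l2 norm of the difference between current and initial parameters,
    taken over all layers jointly."""
    sq = sum(np.sum((p - p0) ** 2) for p, p0 in zip(params, init_params))
    return float(np.sqrt(sq))

# Toy example: two "layers" whose total displacement has norm 5.
init_params = [np.zeros((2, 2)), np.zeros(2)]
params = [np.array([[3.0, 0.0], [0.0, 0.0]]), np.array([4.0, 0.0])]
dist = l2_distance_from_init(params, init_params)
```

Tracking this value over the course of training, for networks of different widths, is one way to examine the kind of implicit regularization the abstract describes.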

Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge


Title	Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge
Authors	Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan “Honza” Černocký
Abstract	In this paper, we present the system description of the joint efforts of Brno University of Technology (BUT) and Omilia – Conversational Intelligence for the ASVspoof 2019 Spoofing and Countermeasures Challenge. The primary submission for Physical access (PA) is a fusion of two VGG networks, trained on single- and two-channel features. For Logical access (LA), our primary system is a fusion of VGG and the recently introduced SincNet architecture. The results on PA show that the proposed networks yield very competitive performance in all conditions and achieve 86% relative improvement compared to the official baseline. On the other hand, the results on LA show that although the proposed architecture and training strategy perform very well on certain spoofing attacks, they fail to generalize to certain attacks that are unseen during training.
Tasks
Published	2019-07-13
URL	https://arxiv.org/abs/1907.12908v1
PDF	https://arxiv.org/pdf/1907.12908v1.pdf
PWC	https://paperswithcode.com/paper/detecting-spoofing-attacks-using-vgg-and
Repo
Framework

Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform


Title	Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform
Authors	Matthias Mehlhose, Daniyal Amir Awany, Renato L. G. Cavalcante, Martin Kurras, Slawomir Stanczak
Abstract	Conventional multiuser detection techniques either require a large number of antennas at the receiver for a desired performance, or they are too complex for practical implementation. Moreover, many of these techniques, such as successive interference cancellation (SIC), suffer from errors in parameter estimation (user channels, covariance matrix, noise variance, etc.) that is performed before detection of user data symbols. As an alternative to conventional methods, this paper proposes and demonstrates a low-complexity practical Machine Learning (ML) based receiver that achieves similar (and at times better) performance to the SIC receiver. The proposed receiver does not require parameter estimation; instead it uses supervised learning to detect the user modulation symbols directly. We perform comparisons with minimum mean square error (MMSE) and SIC receivers in terms of symbol error rate (SER) and complexity.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04291v1
PDF	https://arxiv.org/pdf/1911.04291v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-based-adaptive-receive
Repo
Framework
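
For context on the baselines named in the abstract, the following sketch shows a standard linear MMSE detector for unit-power symbols, x_hat = (H^H H + sigma^2 I)^{-1} H^H y. It is the conventional receiver the paper compares against, not the proposed ML receiver. It also relies on a known channel matrix and noise variance, which is exactly the parameter estimation the proposed receiver avoids. The dimensions, QPSK symbols, and noise level are assumptions chosen for the example.

```python
import numpy as np

def mmse_detect(H, y, noise_var):
    """Linear MMSE estimate of unit-power symbols from y = H x + noise."""
    n_users = H.shape[1]
    gram = H.conj().T @ H + noise_var * np.eye(n_users)
    return np.linalg.solve(gram, H.conj().T @ y)

rng = np.random.default_rng(1)
n_rx, n_users = 8, 3  # hypothetical antenna and user counts
H = (rng.normal(size=(n_rx, n_users))
     + 1j * rng.normal(size=(n_rx, n_users))) / np.sqrt(2)
# Random QPSK symbols with unit average power.
symbols = (rng.choice([-1.0, 1.0], size=n_users)
           + 1j * rng.choice([-1.0, 1.0], size=n_users)) / np.sqrt(2)
noise_var = 1e-4
noise = np.sqrt(noise_var / 2) * (rng.normal(size=n_rx)
                                  + 1j * rng.normal(size=n_rx))
y = H @ symbols + noise

x_hat = mmse_detect(H, y, noise_var)
# Hard decision: map each estimate to the nearest QPSK point.
detected = (np.sign(x_hat.real) + 1j * np.sign(x_hat.imag)) / np.sqrt(2)
```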

Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation


Title	Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation
Authors	Jonathan Pan
Abstract	Surveillance cameras, which are a form of Cyber Physical System, are deployed extensively to provide visual surveillance monitoring of activities of interest or anomalies. However, these cameras are at risk of physical security attacks against their physical attributes or configuration, such as tampering with their recording coverage, camera positions, or recording configurations like focus and zoom factors. Such adversarial alteration of physical configuration could also be invoked through cyber security attacks against the camera’s software vulnerabilities to administratively change the camera’s physical configuration settings. When such Cyber Physical attacks occur, they affect the integrity of the targeted cameras, which in turn renders these cameras ineffective in fulfilling their intended security functions. There is a significant body of research on mechanisms for detecting cyber-attacks against these Cyber Physical devices; however, detection of integrity attacks on physical configuration is an understudied area. This research proposes the novel use of deep learning algorithms to detect such physical attacks originating from cyber or physical spaces. Additionally, we propose the novel use of deep learning-based video frame interpolation for such detection, which performs comparatively better than other anomaly detectors in spatiotemporal environments.
Tasks	Video Frame Interpolation
Published	2019-06-15
URL	https://arxiv.org/abs/1906.06475v1
PDF	https://arxiv.org/pdf/1906.06475v1.pdf
PWC	https://paperswithcode.com/paper/physical-integrity-attack-detection-of
Repo
Framework

High dimensional regression for regenerative time-series: an application to road traffic modeling


Title	High dimensional regression for regenerative time-series: an application to road traffic modeling
Authors	Mohammed Bouchouia, François Portier
Abstract	This paper investigates statistical models for road traffic modeling. The proposed methodology considers road traffic as (i) a high-dimensional time-series for which (ii) regeneration occurs at the end of each day. Because of (ii), prediction is based on a daily modeling of the road traffic using a vector autoregressive model that linearly combines the past observations of the day. Considering (i), the learning algorithm follows from an l1-penalization of the regression coefficients. Excess risk bounds are established under the high-dimensional framework in which the number of road sections goes to infinity with the number of observed days. Considering floating car data observed in an urban area, the approach is compared to state-of-the-art methods including neural networks. In addition to being very competitive in terms of prediction, it makes it possible to identify the most influential sections of the road network.
Tasks	Time Series
Published	2019-10-24
URL	https://arxiv.org/abs/1910.11095v3
PDF	https://arxiv.org/pdf/1910.11095v3.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-regression-for-regenerative
Repo
Framework
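
As a rough illustration of an l1-penalized vector autoregressive fit, the sketch below estimates a sparse VAR(1) transition matrix with proximal gradient descent (ISTA). It is a simplified stand-in: it ignores the daily regeneration structure described in the abstract, and the simulated data, the penalty `lam`, and the function names are all assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Minimize (1/2n) * ||Y - X B||_F^2 + lam * ||B||_1 with ISTA."""
    n = X.shape[0]
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n
        B = soft_threshold(B - step * grad, step * lam)
    return B

# Simulate a sparse VAR(1): x[t] = A_true @ x[t-1] + noise.
rng = np.random.default_rng(0)
p, T = 4, 2000  # hypothetical number of road sections and time steps
A_true = 0.5 * np.eye(p)
A_true[0, 1] = 0.3
x = np.zeros((T, p))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + rng.normal(size=p)

# Regress each observation on the previous one; rows of X are x[t-1]^T,
# so the fitted coefficient matrix is the transpose of A.
X, Y = x[:-1], x[1:]
A_hat = lasso_ista(X, Y, lam=0.02).T
```

The nonzero pattern of `A_hat` indicates which sections carry predictive weight for the others, which is the sense in which an l1 penalty can help single out influential sections of a network.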

Active Generative Adversarial Network for Image Classification


Title	Active Generative Adversarial Network for Image Classification
Authors	Quan Kong, Bin Tong, Martin Klinkigt, Yuki Watanabe, Naoto Akira, Tomokazu Murakami
Abstract	Sufficient supervised information is crucial for any machine learning model to boost performance. However, labeled data is expensive and sometimes difficult to obtain. Active learning is an approach to acquire annotations for data from a human oracle by selecting informative samples with a high probability to enhance performance. In recent studies, a generative adversarial network (GAN) has been integrated with active learning to generate good candidates to be presented to the oracle. In this paper, we propose a novel model that is able to obtain labels for data in a cheaper manner without the need to query an oracle. In the model, a novel reward for each sample is devised to measure the degree of uncertainty, which is obtained from a classifier trained with existing labeled data. This reward is used to guide a conditional GAN to generate informative samples with a higher probability for a certain label. With extensive evaluations, we have confirmed the effectiveness of the model, showing that the generated samples are capable of improving the classification performance in popular image classification tasks.
Tasks	Active Learning, Image Classification
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07133v1
PDF	https://arxiv.org/pdf/1906.07133v1.pdf
PWC	https://paperswithcode.com/paper/active-generative-adversarial-network-for
Repo
Framework

Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models


Title	Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models
Authors	Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber
Abstract	To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM modeling approaches using simulation experiments. These approaches extend previously proposed frameworks, including constraints and informed prior-based methods. Users should have a sense of control in HLTM systems, so we propose a control metric to measure whether refinement operations’ results match users’ expectations. Informed prior-based methods provide better control than constraints, but constraints yield higher quality topics.
Tasks	Topic Models
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09864v2
PDF	https://arxiv.org/pdf/1905.09864v2.pdf
PWC	https://paperswithcode.com/paper/why-didnt-you-listen-to-me-comparing-user
Repo
Framework