Paper Group ANR 706
Constructing the Matrix Multilayer Perceptron and its Application to the VAE
Title | Constructing the Matrix Multilayer Perceptron and its Application to the VAE |
Authors | Jalil Taghia, Maria Bånkestad, Fredrik Lindsten, Thomas B. Schön |
Abstract | Like most learning algorithms, the multilayer perceptron (MLP) is designed to learn a vector of parameters from data. However, in certain scenarios we are interested in learning structured parameters (predictions) in the form of symmetric positive definite matrices. Here, we introduce a variant of the MLP, referred to as the matrix MLP, that is specialized for learning symmetric positive definite matrices. We also present an application of the model within the context of the variational autoencoder (VAE). Our formulation of the VAE extends the vanilla formulation to cases where the recognition and generative networks can be from the parametric family of distributions with dense covariance matrices. Two specific examples are discussed in more detail: the dense covariance Gaussian and its generalization, the power exponential distribution. Our new developments are illustrated using both synthetic and real data. |
Tasks | |
Published | 2019-02-04 |
URL | http://arxiv.org/abs/1902.01182v1 |
http://arxiv.org/pdf/1902.01182v1.pdf | |
PWC | https://paperswithcode.com/paper/constructing-the-matrix-multilayer-perceptron |
Repo | |
Framework | |
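The abstract above turns on producing symmetric positive definite (SPD) matrices from a network. As a minimal sketch of what that constraint means in code, the snippet below maps unconstrained numbers to an SPD matrix through a Cholesky-style factor with a softplus diagonal. This is a common generic parameterization chosen for illustration and is not the matrix MLP construction from the paper; the function names and the input values are hypothetical.

```python
import math


def softplus(x):
    # Smooth map from the reals to the positive reals (stable for large |x|).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def spd_from_unconstrained(params, dim):
    """Map dim*(dim+1)/2 unconstrained numbers to an SPD matrix A = L L^T."""
    assert len(params) == dim * (dim + 1) // 2
    lower = [[0.0] * dim for _ in range(dim)]
    k = 0
    for i in range(dim):
        for j in range(i + 1):
            value = params[k]
            k += 1
            # A strictly positive diagonal makes L invertible,
            # so L L^T is positive definite, not just semidefinite.
            lower[i][j] = softplus(value) if i == j else value
    return [
        [sum(lower[i][m] * lower[j][m] for m in range(dim)) for j in range(dim)]
        for i in range(dim)
    ]


# Hypothetical raw network outputs for a 3 x 3 covariance matrix.
raw_outputs = [0.3, -1.2, 0.8, 2.0, 0.1, -0.5]
cov = spd_from_unconstrained(raw_outputs, dim=3)
```

Any real-valued input vector of the right length yields a valid dense covariance matrix here, which is the property a dense-covariance Gaussian in a VAE needs from its recognition or generative network.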
Scalable Deep Unsupervised Clustering with Concrete GMVAEs
Title | Scalable Deep Unsupervised Clustering with Concrete GMVAEs |
Authors | Mark Collier, Hector Urdiales |
Abstract | Discrete random variables are natural components of probabilistic clustering models. A number of VAE variants with discrete latent variables have been developed. Training such methods requires marginalizing over the discrete latent variables, causing training time complexity to be linear in the number of clusters. By applying a continuous relaxation to the discrete variables in these methods we can reduce the training time complexity to be constant in the number of clusters used. We demonstrate that in practice for one such method, the Gaussian Mixture VAE, the use of a continuous relaxation has no negative effect on the quality of the clustering but provides a substantial reduction in training time, reducing training time on CIFAR-100 with 20 clusters from 47 hours to less than 6 hours. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08994v1 |
https://arxiv.org/pdf/1909.08994v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-deep-unsupervised-clustering-with |
Repo | |
Framework | |
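The continuous relaxation mentioned in the abstract above can be illustrated with the Concrete (Gumbel-softmax) distribution. The following is a minimal standard-library sketch of drawing one relaxed cluster assignment; the logits, temperature, and function name are hypothetical, and the paper's actual model and training code are not reproduced here.

```python
import math
import random


def sample_concrete(logits, temperature, rng):
    """Draw one relaxed one-hot sample from a Concrete (Gumbel-softmax) distribution."""
    noisy = []
    for logit in logits:
        u = rng.random()
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
        # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)).
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
cluster_logits = [0.5, 1.5, -0.3, 0.0]  # hypothetical scores for 4 clusters
soft_assignment = sample_concrete(cluster_logits, temperature=0.5, rng=rng)
```

The result is a point on the probability simplex rather than a hard cluster index. A training objective can then be evaluated on this single differentiable sample instead of being summed over every cluster, which is the source of the constant cost in the number of clusters described in the abstract.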
A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
Title | A Test Suite and Manual Evaluation of Document-Level NMT at WMT19 |
Authors | Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar |
Abstract | As the quality of machine translation rises and neural machine translation (NMT) is moving from sentence to document level translations, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing discourse phenomena of MT systems participating in the News Translation Task. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation. |
Tasks | Machine Translation |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03043v1 |
https://arxiv.org/pdf/1908.03043v1.pdf | |
PWC | https://paperswithcode.com/paper/a-test-suite-and-manual-evaluation-of |
Repo | |
Framework | |
Teaching Responsible Data Science: Charting New Pedagogical Territory
Title | Teaching Responsible Data Science: Charting New Pedagogical Territory |
Authors | Julia Stoyanovich, Armanda Lewis |
Abstract | Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the “real” material. To develop instructional materials and methodologies that are thoughtful and engaging, we must strive for balance: between texts and coding, between critique and solution, and between cutting-edge research and practical applicability. Finding such balance is particularly difficult in the nascent field of responsible data science (RDS), where we are only starting to understand how to interface between the intrinsically different methodologies of engineering and social sciences. In this paper we recount a recent experience in developing and teaching an RDS course to graduate and advanced undergraduate students in data science. We then dive into an area that is critically important to RDS – transparency and interpretability of machine-assisted decision-making, and tie this area to the needs of emerging RDS curricula. Recounting our own experience, and leveraging literature on pedagogical methods in data science and beyond, we propose the notion of an “object-to-interpret-with”. We link this notion to “nutritional labels” – a family of interpretability tools that are gaining popularity in RDS research and practice. With this work we aim to contribute to the nascent area of RDS education, and to inspire others in the community to come together to develop a deeper theoretical understanding of the pedagogical needs of RDS, and contribute concrete educational materials and methodologies that others can use. All course materials are publicly available at https://dataresponsibly.github.io/courses. |
Tasks | Decision Making |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10564v1 |
https://arxiv.org/pdf/1912.10564v1.pdf | |
PWC | https://paperswithcode.com/paper/teaching-responsible-data-science-charting |
Repo | |
Framework | |
Using Neural Networks for Relation Extraction from Biomedical Literature
Title | Using Neural Networks for Relation Extraction from Biomedical Literature |
Authors | Diana Sousa, Andre Lamurias, Francisco M. Couto |
Abstract | Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, namely those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead us to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to enhance previous state-of-the-art results. |
Tasks | Relation Extraction |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11391v1 |
https://arxiv.org/pdf/1905.11391v1.pdf | |
PWC | https://paperswithcode.com/paper/using-neural-networks-for-relation-extraction |
Repo | |
Framework | |
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
Title | Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA |
Authors | Yichen Jiang, Mohit Bansal |
Abstract | Multi-hop question answering requires a model to connect multiple pieces of evidence scattered in a long context to answer the question. In this paper, we show that in the multi-hop HotpotQA (Yang et al., 2018) dataset, the examples often contain reasoning shortcuts through which models can directly locate the answer by word-matching the question with a sentence in the context. We demonstrate this issue by constructing adversarial documents that create contradicting answers to the shortcut but do not affect the validity of the original answer. The performance of strong baseline models drops significantly on our adversarial evaluation, indicating that they are indeed exploiting the shortcuts rather than performing multi-hop reasoning. After adversarial training, the baseline’s performance improves but is still limited on the adversarial evaluation. Hence, we use a control unit that dynamically attends to the question at different reasoning hops to guide the model’s multi-hop reasoning. We show that this 2-hop model trained on the regular data is more robust to the adversaries than the baseline model. After adversarial training, this 2-hop model not only achieves improvements over its counterpart trained on regular data, but also outperforms the adversarially-trained 1-hop baseline. We hope that these insights and initial improvements will motivate the development of new models that combine explicit compositional reasoning with adversarial training. |
Tasks | Question Answering |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.07132v1 |
https://arxiv.org/pdf/1906.07132v1.pdf | |
PWC | https://paperswithcode.com/paper/avoiding-reasoning-shortcuts-adversarial |
Repo | |
Framework | |
Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games
Title | Today Me, Tomorrow Thee: Efficient Resource Allocation in Competitive Settings using Karma Games |
Authors | Andrea Censi, Saverio Bolognani, Julian G. Zilly, Shima Sadat Mousavi, Emilio Frazzoli |
Abstract | We present a new type of coordination mechanism among multiple agents for the allocation of a finite resource, such as the allocation of time slots for passing an intersection. We consider the setting where we associate one counter to each agent, which we call karma value, and where there is an established mechanism to decide resource allocation based on agents exchanging karma. The idea is that agents might be inclined to pass on using resources today, in exchange for karma, which will make it easier for them to claim the resource use in the future. To understand whether such a system might work robustly, we only design the protocol and not the agents’ policies. We take a game-theoretic perspective and compute policies corresponding to Nash equilibria for the game. We find, surprisingly, that the Nash equilibria for a society of self-interested agents are very close in social welfare to a centralized cooperative solution. These results suggest that many resource allocation problems can have a simple, elegant, and robust solution, assuming the availability of a karma accounting mechanism. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09198v1 |
https://arxiv.org/pdf/1907.09198v1.pdf | |
PWC | https://paperswithcode.com/paper/today-me-tomorrow-thee-efficient-resource |
Repo | |
Framework | |
Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans
Title | Deep Learning Algorithms for Coronary Artery Plaque Characterisation from CCTA Scans |
Authors | Felix Denzinger, Michael Wels, Katharina Breininger, Anika Reidelshöfer, Joachim Eckert, Michael Sühling, Axel Schmermund, Andreas Maier |
Abstract | Analysing coronary artery plaque segments with respect to their functional significance, and therefore their influence on patient management, in a non-invasive setup is an important subject of current research. In this work we compare and improve three deep learning algorithms for this task: a 3D recurrent convolutional neural network (RCNN), a 2D multi-view ensemble approach based on texture analysis, and a newly proposed 2.5D approach. Current state-of-the-art methods utilising fluid dynamics based fractional flow reserve (FFR) simulation reach an AUC of up to 0.93 for the task of predicting an abnormal invasive FFR value. For the comparable task of predicting revascularisation decision, we are able to improve the performance in terms of AUC of both existing approaches with the proposed modifications, specifically from 0.80 to 0.90 for the 3D-RCNN, and from 0.85 to 0.90 for the multi-view texture-based ensemble. The newly proposed 2.5D approach achieves comparable results with an AUC of 0.90. |
Tasks | Texture Classification |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06417v1 |
https://arxiv.org/pdf/1912.06417v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-algorithms-for-coronary-artery |
Repo | |
Framework | |
Generalization in Deep Networks: The Role of Distance from Initialization
Title | Generalization in Deep Networks: The Role of Distance from Initialization |
Authors | Vaishnavh Nagarajan, J. Zico Kolter |
Abstract | Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network? To answer this question, we advocate a notion of effective model capacity that is dependent on a given random initialization of the network and not just the training algorithm and the data distribution. We provide empirical evidence demonstrating that the model capacity of SGD-trained deep networks is in fact restricted through implicit regularization of the $\ell_2$ distance from the initialization. We also provide theoretical arguments that further highlight the need for initialization-dependent notions of model capacity. We leave as open questions how and why distance from initialization is regularized, and whether it is sufficient to explain generalization. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.01672v2 |
http://arxiv.org/pdf/1901.01672v2.pdf | |
PWC | https://paperswithcode.com/paper/generalization-in-deep-networks-the-role-of |
Repo | |
Framework | |
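The quantity at the centre of the abstract above, the $\ell_2$ distance of the trained parameters from their initialization, is straightforward to compute. The sketch below assumes each layer is stored as a flat list of floats; the function name and the toy weights are hypothetical and only illustrate the definition.

```python
import math


def l2_distance_from_init(initial_weights, trained_weights):
    """Euclidean distance between two parameter sets given as lists of flat layers."""
    squared = 0.0
    for layer_init, layer_trained in zip(initial_weights, trained_weights):
        assert len(layer_init) == len(layer_trained)
        for w0, w in zip(layer_init, layer_trained):
            squared += (w - w0) ** 2
    return math.sqrt(squared)


# Hypothetical two-layer network, each layer flattened to a list of floats.
w_init = [[0.0, 0.0, 0.0], [1.0, 1.0]]
w_trained = [[3.0, 0.0, 0.0], [1.0, 5.0]]
distance = l2_distance_from_init(w_init, w_trained)  # sqrt(3**2 + 4**2) = 5.0
```

Tracking this value across training runs of different widths is one simple way to probe the initialization-dependent notion of capacity that the abstract argues for.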
Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge
Title | Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge |
Authors | Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan “Honza” Černocký |
Abstract | In this paper, we present the system description of the joint efforts of Brno University of Technology (BUT) and Omilia – Conversational Intelligence for the ASVspoof 2019 Spoofing and Countermeasures Challenge. The primary submission for Physical access (PA) is a fusion of two VGG networks, trained on single- and two-channel features. For Logical access (LA), our primary system is a fusion of VGG and the recently introduced SincNet architecture. The results on PA show that the proposed networks yield very competitive performance in all conditions and achieved 86% relative improvement compared to the official baseline. On the other hand, the results on LA showed that although the proposed architecture and training strategy perform very well on certain spoofing attacks, they fail to generalize to certain attacks that are unseen during training. |
Tasks | |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.12908v1 |
https://arxiv.org/pdf/1907.12908v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-spoofing-attacks-using-vgg-and |
Repo | |
Framework | |
Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform
Title | Machine Learning-Based Adaptive Receive Filtering: Proof-of-Concept on an SDR Platform |
Authors | Matthias Mehlhose, Daniyal Amir Awan, Renato L. G. Cavalcante, Martin Kurras, Slawomir Stanczak |
Abstract | Conventional multiuser detection techniques either require a large number of antennas at the receiver for a desired performance, or they are too complex for practical implementation. Moreover, many of these techniques, such as successive interference cancellation (SIC), suffer from errors in parameter estimation (user channels, covariance matrix, noise variance, etc.) that is performed before detection of user data symbols. As an alternative to conventional methods, this paper proposes and demonstrates a low-complexity practical Machine Learning (ML) based receiver that achieves similar (and at times better) performance to the SIC receiver. The proposed receiver does not require parameter estimation; instead it uses supervised learning to detect the user modulation symbols directly. We perform comparisons with minimum mean square error (MMSE) and SIC receivers in terms of symbol error rate (SER) and complexity. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04291v1 |
https://arxiv.org/pdf/1911.04291v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-adaptive-receive |
Repo | |
Framework | |
Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation
Title | Physical Integrity Attack Detection of Surveillance Camera with Deep Learning Based Video Frame Interpolation |
Authors | Jonathan Pan |
Abstract | Surveillance cameras, which are a form of Cyber Physical System, are deployed extensively to provide visual surveillance monitoring of activities of interest or anomalies. However, these cameras are at risk of physical security attacks against their physical attributes or configuration, such as tampering with their recording coverage, camera positions, or recording configurations like focus and zoom factors. Such adversarial alteration of physical configuration could also be invoked through cyber security attacks against the camera’s software vulnerabilities to administratively change the camera’s physical configuration settings. When such Cyber Physical attacks occur, they affect the integrity of the targeted cameras, which in turn renders these cameras ineffective in fulfilling their intended security functions. There is a significant body of research on mechanisms for detecting cyber-attacks against these Cyber Physical devices; however, mechanisms against integrity attacks on physical configuration remain understudied. This research proposes the novel use of deep learning algorithms to detect such physical attacks originating from cyber or physical spaces. Additionally, we propose the novel use of deep learning-based video frame interpolation for such detection, which performs comparatively better than other anomaly detectors in spatiotemporal environments. |
Tasks | Video Frame Interpolation |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06475v1 |
https://arxiv.org/pdf/1906.06475v1.pdf | |
PWC | https://paperswithcode.com/paper/physical-integrity-attack-detection-of |
Repo | |
Framework | |
High dimensional regression for regenerative time-series: an application to road traffic modeling
Title | High dimensional regression for regenerative time-series: an application to road traffic modeling |
Authors | Mohammed Bouchouia, François Portier |
Abstract | This paper investigates statistical models for road traffic modeling. The proposed methodology considers road traffic as (i) a high-dimensional time series for which (ii) regeneration occurs at the end of each day. Because of (ii), prediction is based on a daily modeling of the road traffic using a vector autoregressive model that linearly combines the past observations of the day. Because of (i), the learning algorithm follows from an l1-penalization of the regression coefficients. Excess risk bounds are established under the high-dimensional framework in which the number of road sections goes to infinity with the number of observed days. Using floating car data observed in an urban area, the approach is compared to state-of-the-art methods including neural networks. In addition to being very competitive in terms of prediction, it makes it possible to identify the most determinant sections of the road network. |
Tasks | Time Series |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11095v3 |
https://arxiv.org/pdf/1910.11095v3.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-regression-for-regenerative |
Repo | |
Framework | |
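The l1-penalization of regression coefficients described in the abstract above is what drives irrelevant coefficients to exactly zero. The following minimal sketch shows the effect on a toy least-squares problem solved by proximal gradient descent (ISTA). The solver choice, the function names, and the data are assumptions made for illustration; the paper's vector autoregressive setup and its estimation procedure are not reproduced.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x|: shrink toward zero, clip small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(rows, targets, penalty, step, iterations):
    """Minimize (1/2n) * ||y - Xb||^2 + penalty * ||b||_1 by proximal gradient descent."""
    n, p = len(rows), len(rows[0])
    coef = [0.0] * p
    for _ in range(iterations):
        # Residuals X b - y for the current coefficients.
        residuals = [sum(x * b for x, b in zip(row, coef)) - y
                     for row, y in zip(rows, targets)]
        # Gradient of the smooth least-squares term.
        grad = [sum(r * row[j] for r, row in zip(residuals, rows)) / n
                for j in range(p)]
        # Gradient step followed by the l1 proximal step.
        coef = [soft_threshold(b - step * g, step * penalty)
                for b, g in zip(coef, grad)]
    return coef


# Toy data: the target depends only on the first of two predictors.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0, 2.0, -2.0, -2.0]
coef = lasso_ista(X, y, penalty=0.5, step=0.5, iterations=100)
```

On this orthogonal toy design the first coefficient is shrunk from 2 toward 1.5 and the second stays at zero. The same mechanism is what lets an l1-penalized model single out a small set of influential road sections.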
Active Generative Adversarial Network for Image Classification
Title | Active Generative Adversarial Network for Image Classification |
Authors | Quan Kong, Bin Tong, Martin Klinkigt, Yuki Watanabe, Naoto Akira, Tomokazu Murakami |
Abstract | Sufficient supervised information is crucial for any machine learning model to boost performance. However, labeled data is expensive and sometimes difficult to obtain. Active learning is an approach to acquire annotations for data from a human oracle by selecting informative samples with a high probability to enhance performance. In recent studies, a generative adversarial network (GAN) has been integrated with active learning to generate good candidates to be presented to the oracle. In this paper, we propose a novel model that is able to obtain labels for data in a cheaper manner without the need to query an oracle. In the model, a novel reward for each sample is devised to measure the degree of uncertainty, which is obtained from a classifier trained with existing labeled data. This reward is used to guide a conditional GAN to generate informative samples with a higher probability for a certain label. With extensive evaluations, we have confirmed the effectiveness of the model, showing that the generated samples are capable of improving the classification performance in popular image classification tasks. |
Tasks | Active Learning, Image Classification |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.07133v1 |
https://arxiv.org/pdf/1906.07133v1.pdf | |
PWC | https://paperswithcode.com/paper/active-generative-adversarial-network-for |
Repo | |
Framework | |
Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models
Title | Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models |
Authors | Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber |
Abstract | To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM modeling approaches using simulation experiments. These approaches extend previously proposed frameworks, including constraints and informed prior-based methods. Users should have a sense of control in HLTM systems, so we propose a control metric to measure whether refinement operations’ results match users’ expectations. Informed prior-based methods provide better control than constraints, but constraints yield higher quality topics. |
Tasks | Topic Models |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09864v2 |
https://arxiv.org/pdf/1905.09864v2.pdf | |
PWC | https://paperswithcode.com/paper/why-didnt-you-listen-to-me-comparing-user |
Repo | |
Framework | |