Paper Group ANR 485
HEAD-QA: A Healthcare Dataset for Complex Reasoning
Title | HEAD-QA: A Healthcare Dataset for Complex Reasoning |
Authors | David Vilares, Carlos Gómez-Rodríguez |
Abstract | We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. The questions come from exams to access a specialized position in the Spanish healthcare system, and are challenging even for highly specialized humans. We then consider monolingual (Spanish) and cross-lingual (to English) experiments with information retrieval and neural techniques. We show that: (i) HEAD-QA challenges current methods, and (ii) the results lag well behind human performance, demonstrating its usefulness as a benchmark for future work. |
Tasks | Information Retrieval, Question Answering |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04701v1 |
https://arxiv.org/pdf/1906.04701v1.pdf | |
PWC | https://paperswithcode.com/paper/head-qa-a-healthcare-dataset-for-complex |
Repo | |
Framework | |
Statistical Agnostic Mapping: a Framework in Neuroimaging based on Concentration Inequalities
Title | Statistical Agnostic Mapping: a Framework in Neuroimaging based on Concentration Inequalities |
Authors | J M Gorriz, SiPBA Group, CAM neuroscience |
Abstract | In the 1970s a novel branch of statistics emerged, focusing its effort on selecting a function in the pattern recognition problem that fulfils a definite relationship between the quality of the approximation and its complexity. These data-driven approaches are mainly devoted to problems of estimating dependencies with limited sample sizes and comprise all the empirical out-of-sample generalization approaches, e.g. cross-validation (CV) approaches. Although the latter are not designed for testing competing hypotheses or comparing different models in neuroimaging, there are a number of theoretical developments within this theory which could be employed to derive a Statistical Agnostic (non-parametric) Mapping (SAM) at voxel or multi-voxel level. Moreover, SAMs could (i) relieve the problem of instability in limited sample sizes when estimating the actual risk via the CV approaches, e.g. large error bars, and (ii) provide an alternative to Family-wise-error (FWE) corrected p-value maps in inferential statistics for hypothesis testing. In this sense, we propose a novel framework in neuroimaging based on concentration inequalities, which results in (i) a rigorous development for model validation with a small sample/dimension ratio, and (ii) a less-conservative procedure than FWE p-value correction, to determine the brain significance maps from the inferences made using small upper bounds of the actual risk. |
Tasks | |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12274v1 |
https://arxiv.org/pdf/1912.12274v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-agnostic-mapping-a-framework-in |
Repo | |
Framework | |
Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks
Title | Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks |
Authors | Maolin Li, Arvid Fahlström Myrman, Tingting Mu, Sophia Ananiadou |
Abstract | When constructing models that learn from noisy labels produced by multiple annotators, it is important to accurately estimate the reliability of annotators. Annotators may provide labels of inconsistent quality due to their varying expertise and reliability in a domain. Previous studies have mostly focused on estimating each annotator’s overall reliability on the entire annotation task. However, in practice, the reliability of an annotator may depend on each specific instance. Only a limited number of studies have investigated modelling per-instance reliability and these only considered binary labels. In this paper, we propose an unsupervised model which can handle both binary and multi-class labels. It can automatically estimate the per-instance reliability of each annotator and the correct label for each instance. We specify our model as a probabilistic model which incorporates neural networks to model the dependency between latent variables and instances. For evaluation, the proposed method is applied to both synthetic and real data, including two labelling tasks: text classification and textual entailment. Experimental results demonstrate our novel method can not only accurately estimate the reliability of annotators across different instances, but also achieve superior performance in predicting the correct labels and detecting the least reliable annotators compared to state-of-the-art baselines. |
Tasks | Natural Language Inference, Text Classification |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04981v1 |
https://arxiv.org/pdf/1905.04981v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-instance-level-annotator |
Repo | |
Framework | |
SGD: Decentralized Byzantine Resilience
Title | SGD: Decentralized Byzantine Resilience |
Authors | El-Mahdi El-Mhamdi, Rachid Guerraoui, Arsany Guirguis, Sebastien Rouault |
Abstract | The size of the datasets available today leads to the distribution of Machine Learning (ML) tasks. An SGD-based optimization is for instance typically carried out by two categories of participants: parameter servers and workers. Some of these nodes can sometimes behave arbitrarily (called Byzantine and caused by corrupt/bogus data/machines), impacting the accuracy of the entire learning activity. Several approaches recently studied how to tolerate Byzantine workers, while assuming honest and trusted parameter servers. In order to achieve total ML robustness, we introduce GuanYu, the first algorithm (to the best of our knowledge) to handle Byzantine parameter servers as well as Byzantine workers. We prove that GuanYu ensures convergence against $\frac{1}{3}$ Byzantine parameter servers and $\frac{1}{3}$ Byzantine workers, which is optimal in asynchronous networks (GuanYu also tolerates unbounded communication delays, i.e. asynchrony). To prove the Byzantine resilience of GuanYu, we use a contraction argument, leveraging geometric properties of the median in high dimensional spaces to prevent (with probability 1) any drift on the models within each of the non-Byzantine servers. To convey its practicality, we implemented GuanYu using the low-level TensorFlow APIs and deployed it in a distributed setup using the CIFAR-10 dataset. The overhead of tolerating Byzantine participants, compared to a vanilla TensorFlow deployment that is vulnerable to a single Byzantine participant, is around 30% in terms of throughput (model updates per second), while maintaining the same convergence rate (model updates required to reach some accuracy). |
Tasks | |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.03853v1 |
https://arxiv.org/pdf/1905.03853v1.pdf | |
PWC | https://paperswithcode.com/paper/190503853 |
Repo | |
Framework | |
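The GuanYu abstract attributes its resilience to geometric properties of the median. The sketch below is only a generic illustration of why a median resists corrupted inputs where a mean does not; it is not GuanYu itself. The coordinate-wise variant, the function name `coordinate_wise_median`, and the toy gradients are all assumptions made for this example.

```python
from statistics import median


def coordinate_wise_median(gradients):
    """Aggregate equal-length gradient vectors by taking the median of each coordinate."""
    if not gradients:
        raise ValueError("need at least one gradient")
    dim = len(gradients[0])
    if any(len(g) != dim for g in gradients):
        raise ValueError("all gradients must have the same length")
    return [median(g[i] for g in gradients) for i in range(dim)]


# Hypothetical data: three honest workers report similar gradients,
# one Byzantine worker (fewer than a third of the four) reports garbage.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1]]
byzantine = [[1e9, -1e9]]
reports = honest + byzantine

aggregated = coordinate_wise_median(reports)

# For comparison: the plain mean is dominated by the single corrupted report.
mean = [sum(column) / len(reports) for column in zip(*reports)]
```

With these inputs the median stays within the range of the honest reports, while the mean is pulled far away by the one corrupted vector.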
EPNAS: Efficient Progressive Neural Architecture Search
Title | EPNAS: Efficient Progressive Neural Architecture Search |
Authors | Yanqi Zhou, Peng Wang, Sercan Arik, Haonan Yu, Syed Zawad, Feng Yan, Greg Diamos |
Abstract | In this paper, we propose Efficient Progressive Neural Architecture Search (EPNAS), a neural architecture search (NAS) that efficiently handles large search space through a novel progressive search policy with performance prediction based on REINFORCE. EPNAS is designed to search target networks in parallel, which is more scalable on parallel systems such as GPU/TPU clusters. More importantly, EPNAS can be generalized to architecture search with multiple resource constraints, e.g., model size, compute complexity or intensity, which is crucial for deployment in widespread platforms such as mobile and cloud. We compare EPNAS against other state-of-the-art (SoTA) network architectures (e.g., MobileNetV2) and efficient NAS algorithms (e.g., ENAS and PNAS) on image recognition tasks using CIFAR10 and ImageNet. On both datasets, EPNAS is superior with respect to architecture searching speed and recognition accuracy. |
Tasks | Neural Architecture Search |
Published | 2019-07-07 |
URL | https://arxiv.org/abs/1907.04648v1 |
https://arxiv.org/pdf/1907.04648v1.pdf | |
PWC | https://paperswithcode.com/paper/epnas-efficient-progressive-neural |
Repo | |
Framework | |
Sensor Fusion: Gated Recurrent Fusion to Learn Driving Behavior from Temporal Multimodal Data
Title | Sensor Fusion: Gated Recurrent Fusion to Learn Driving Behavior from Temporal Multimodal Data |
Authors | Athma Narayanan, Avinash Siravuru, Behzad Dariush |
Abstract | The Tactical Driver Behavior modeling problem requires understanding of driver actions in complicated urban scenarios from rich multimodal signals including video, LiDAR and CAN bus data streams. However, the majority of deep learning research is focused either on learning the vehicle/environment state (sensor fusion) or the driver policy (from temporal data), but not both. Learning both tasks end-to-end offers the richest distillation of knowledge, but presents challenges in formulation and successful training. In this work, we propose promising first steps in this direction. Inspired by the gating mechanisms in LSTM, we propose gated recurrent fusion units (GRFU) that learn fusion weighting and temporal weighting simultaneously. We demonstrate their superior performance over multimodal and temporal baselines in supervised regression and classification tasks, all in the realm of autonomous navigation. We note a 10% improvement in the mAP score over the state of the art for tactical driver behavior classification on the HDD dataset and a 20% drop in overall mean squared error for steering action regression on the TORCS dataset. |
Tasks | Autonomous Navigation, Sensor Fusion |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00628v2 |
https://arxiv.org/pdf/1910.00628v2.pdf | |
PWC | https://paperswithcode.com/paper/temporal-multimodal-fusion-for-driver |
Repo | |
Framework | |
Prediction stability as a criterion in active learning
Title | Prediction stability as a criterion in active learning |
Authors | Junyu Liu, Xiang Li, Jin Wang, Jiqiang Zhou, Jianxiong Shen |
Abstract | Recent breakthroughs made by deep learning rely heavily on a large number of annotated samples. To overcome this shortcoming, active learning is a possible solution. In contrast to previous active learning algorithms, which only use information available after training, we propose a new class of methods based on information gathered during training, named sequential-based methods. A specific active learning criterion called prediction stability is proposed to prove the feasibility of sequential-based methods. Experiments are conducted on CIFAR-10 and CIFAR-100, and the results indicate that prediction stability is effective and works well on fewer-labeled datasets. Prediction stability reaches the accuracy of traditional acquisition functions like entropy on CIFAR-10, and notably outperforms them on CIFAR-100. |
Tasks | Active Learning |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12246v1 |
https://arxiv.org/pdf/1910.12246v1.pdf | |
PWC | https://paperswithcode.com/paper/prediction-stability-as-a-criterion-in-active |
Repo | |
Framework | |
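The abstract for this record compares prediction stability against traditional acquisition functions such as entropy. As a minimal sketch of that baseline only (the prediction stability criterion is not defined in the abstract, so it is not reproduced here), the snippet below ranks unlabeled samples by predictive entropy. The function names and the toy probability vectors are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool_probs, k):
    """Return indices of the k unlabeled samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical softmax outputs for three unlabeled samples.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.60, 0.30, 0.10],  # moderately uncertain prediction
]

# Indices of the two samples an entropy-based learner would send for labeling.
chosen = select_by_entropy(pool, k=2)
```

Here the near-uniform sample is ranked first and the confident one is left unlabeled, which is the behavior an entropy-based acquisition function is meant to produce.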
A general method for regularizing tensor decomposition methods via pseudo-data
Title | A general method for regularizing tensor decomposition methods via pseudo-data |
Authors | Omer Gottesman, Weiwei Pan, Finale Doshi-Velez |
Abstract | Tensor decomposition methods allow us to learn the parameters of latent variable models through decomposition of low-order moments of data. A significant limitation of these algorithms is that there exists no general method to regularize them, and in the past regularization has mostly been performed using bespoke modifications to the algorithms, tailored for the particular form of the desired regularizer. We present a general method of regularizing tensor decomposition methods which can be used for any likelihood model that is learnable using tensor decomposition methods and any differentiable regularization function by supplementing the training data with pseudo-data. The pseudo-data is optimized to balance two terms: being as close as possible to the true data and enforcing the desired regularization. On synthetic, semi-synthetic and real data, we demonstrate that our method can improve inference accuracy and regularize for a broad range of goals including transfer learning, sparsity, interpretability, and orthogonality of the learned parameters. |
Tasks | Latent Variable Models, Transfer Learning |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10424v1 |
https://arxiv.org/pdf/1905.10424v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-method-for-regularizing-tensor |
Repo | |
Framework | |
The Truth and Nothing but the Truth: Multimodal Analysis for Deception Detection
Title | The Truth and Nothing but the Truth: Multimodal Analysis for Deception Detection |
Authors | Mimansa Jaiswal, Sairam Tabibu, Rajiv Bajpai |
Abstract | We propose a data-driven method for automatic deception detection in real-life trial data using visual and verbal cues. Using OpenFace with facial action unit recognition, we analyze the movement of the witness's facial features when posed with questions, and we analyze the acoustic patterns using OpenSmile. We then perform a lexical analysis on the spoken words, emphasizing the use of pauses and utterance breaks, and feed the result to a Support Vector Machine to test deceit or truth prediction. We then try out a method to incorporate utterance-based fusion of visual and lexical analysis, using string-based matching. |
Tasks | Deception Detection, Facial Action Unit Detection, Lexical Analysis |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04484v1 |
http://arxiv.org/pdf/1903.04484v1.pdf | |
PWC | https://paperswithcode.com/paper/the-truth-and-nothing-but-the-truth |
Repo | |
Framework | |
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings
Title | A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings |
Authors | Wei Yang, Wei Lu, Vincent W. Zheng |
Abstract | Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text. The genre of the text typically plays an important role in the effectiveness of the resulting embeddings. How to effectively train word embedding models using data from different domains remains a problem that is underexplored. In this paper, we present a simple yet effective method for learning word embeddings based on text from different domains. We demonstrate the effectiveness of our approach through extensive experiments on various down-stream NLP tasks. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00184v1 |
http://arxiv.org/pdf/1902.00184v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-regularization-based-algorithm-for |
Repo | |
Framework | |
Transformed Subspace Clustering
Title | Transformed Subspace Clustering |
Authors | Jyoti Maggu, Angshul Majumdar, Emilie Chouzenoux |
Abstract | Subspace clustering assumes that the data is separable into separate subspaces. Such a simple assumption does not always hold. We assume that, even if the raw data is not separable into subspaces, one can learn a representation (transform coefficients) such that the learnt representation is separable into subspaces. To achieve the intended goal, we embed subspace clustering techniques (locally linear manifold clustering, sparse subspace clustering and low rank representation) into transform learning. The entire formulation is jointly learnt, giving rise to a new class of methods called transformed subspace clustering (TSC). In order to account for non-linearity, kernelized extensions of TSC are also proposed. To test the performance of the proposed techniques, benchmarking is performed on image clustering and document clustering datasets. Comparison with state-of-the-art clustering techniques shows that our formulation improves upon them. |
Tasks | Image Clustering |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04734v1 |
https://arxiv.org/pdf/1912.04734v1.pdf | |
PWC | https://paperswithcode.com/paper/transformed-subspace-clustering |
Repo | |
Framework | |
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
Title | Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences |
Authors | Shizhe Chen, Bei Liu, Jianlong Fu, Ruihua Song, Qin Jin, Pingping Lin, Xiaoyu Qi, Chunting Wang, Jin Zhou |
Abstract | A storyboard is a sequence of images to illustrate a story containing multiple sentences, which has been a key process to create different story products. In this paper, we tackle a new multimedia task of automatic storyboard creation to facilitate this process and inspire human artists. Inspired by the fact that our understanding of languages is based on our past experience, we propose a novel inspire-and-create framework with a story-to-image retriever that selects relevant cinematic images for inspiration and a storyboard creator that further refines and renders images to improve the relevancy and visual consistency. The proposed retriever dynamically employs contextual information in the story with hierarchical attentions and applies dense visual-semantic matching to accurately retrieve and ground images. The creator then employs three rendering steps to increase the flexibility of retrieved images, which include erasing irrelevant regions, unifying styles of images and substituting consistent characters. We carry out extensive experiments on both in-domain and out-of-domain visual story datasets. The proposed model achieves better quantitative performance than the state-of-the-art baselines for storyboard creation. Qualitative visualizations and user studies further verify that our approach can create high-quality storyboards even for stories in the wild. |
Tasks | |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10460v1 |
https://arxiv.org/pdf/1911.10460v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-storyboard-artist-visualizing-stories |
Repo | |
Framework | |
Improving Supervised Phase Identification Through the Theory of Information Losses
Title | Improving Supervised Phase Identification Through the Theory of Information Losses |
Authors | Brandon Foggo, Nanpeng Yu |
Abstract | This paper considers the problem of Phase Identification in power distribution systems. In particular, it aims to improve supervised learning accuracy by exploiting some of the problem’s information-theoretic properties. This focus, along with recent advances in Information Theoretic Machine Learning (ITML), helps us to create two new techniques. The first transforms a bound on information losses into a data selection technique. This is important because phase identification data labels are difficult to obtain in practice. The second interprets the properties of distribution systems in the terms of ITML. This allows us to obtain an improvement in the representation learned by any classifier applied to the problem. We tested these two techniques experimentally on real datasets and have found that they yield phenomenal performance in every case. In the most extreme case, they improve phase identification accuracy from 51.7% to 97.3%. Furthermore, since many problems share the physical properties of phase identification exploited in this paper, the techniques can be applied to a wide range of similar problems. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01484v1 |
https://arxiv.org/pdf/1911.01484v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-supervised-phase-identification |
Repo | |
Framework | |
Revisiting Explicit Negation in Answer Set Programming
Title | Revisiting Explicit Negation in Answer Set Programming |
Authors | Felicidad Aguado, Pedro Cabalar, Jorge Fandinno, David Pearce, Gilberto Perez, Concepcion Vidal |
Abstract | A common feature in Answer Set Programming is the use of a second negation, stronger than default negation and sometimes called explicit, strong or classical negation. This explicit negation is normally used in front of atoms, rather than allowing its use as a regular operator. In this paper we consider the arbitrary combination of explicit negation with nested expressions, as those defined by Lifschitz, Tang and Turner. We extend the concept of reduct for this new syntax and then prove that it can be captured by an extension of Equilibrium Logic with this second negation. We study some properties of this variant and compare to the already known combination of Equilibrium Logic with Nelson’s strong negation. Under consideration for acceptance in TPLP. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11467v1 |
https://arxiv.org/pdf/1907.11467v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-explicit-negation-in-answer-set |
Repo | |
Framework | |
Personalized Treatment for Coronary Artery Disease Patients: A Machine Learning Approach
Title | Personalized Treatment for Coronary Artery Disease Patients: A Machine Learning Approach |
Authors | Dimitris Bertsimas, Agni Orfanoudaki, Rory B. Weiner |
Abstract | Current clinical practice guidelines for managing Coronary Artery Disease (CAD) account for general cardiovascular risk factors. However, they do not present a framework that considers personalized patient-specific characteristics. Using the electronic health records of 21,460 patients, we created data-driven models for personalized CAD management that significantly improve health outcomes relative to the standard of care. We develop binary classifiers to detect whether a patient will experience an adverse event due to CAD within a 10-year time frame. Combining the patients’ medical history and clinical examination results, we achieve 81.5% AUC. For each treatment, we also create a series of regression models that are based on different supervised machine learning algorithms. We are able to estimate with average R squared = 0.801 the time from diagnosis to a potential adverse event (TAE) and gain accurate approximations of the counterfactual treatment effects. Leveraging combinations of these models, we present ML4CAD, a novel personalized prescriptive algorithm. Considering the recommendations of multiple predictive models at once, ML4CAD identifies for every patient the therapy with the best expected outcome using a voting mechanism. We evaluate its performance by measuring the prescription effectiveness and robustness under alternative ground truths. We show that our methodology improves the expected TAE upon the current baseline by 24.11%, increasing it from 4.56 to 5.66 years. The algorithm performs particularly well for the male (24.3% improvement) and Hispanic (58.41% improvement) subpopulations. Finally, we create an interactive interface, providing physicians with an intuitive, accurate, readily implementable, and effective tool. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08483v1 |
https://arxiv.org/pdf/1910.08483v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-treatment-for-coronary-artery |
Repo | |
Framework | |
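The ML4CAD abstract reports that the expected time to adverse event (TAE) rises from 4.56 to 5.66 years, a 24.11% improvement. As a quick consistency check using only those two rounded endpoints, the relative gain works out as:

```latex
\[
\frac{5.66 - 4.56}{4.56} = \frac{1.10}{4.56} \approx 0.2412 \quad (\text{about } 24.1\%)
\]
```

This agrees with the reported figure to within about 0.01 percentage points. The small gap is presumably due to rounding of the endpoints, which is an assumption, since the abstract does not give unrounded values.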