Paper Group ANR 519
Joint Learning of Semantic Alignment and Object Landmark Detection. Analysing Russian Trolls via NLP tools. QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions. Complexity Results and Algorithms for Bipolar Argumentation. Unsupervised Object Segmentation with Explicit Localization Module. A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics. Approximate Cross-Validation in High Dimensions with Guarantees. DAWN: Dynamic Adversarial Watermarking of Neural Networks. Deep learning analysis of coronary arteries in cardiac CT angiography for detection of patients requiring invasive coronary angiography. Probabilistic Radiomics: Ambiguous Diagnosis with Controllable Shape Analysis. Autoregressive Models for Sequences of Graphs. Temporal-difference learning for nonlinear value function approximation in the lazy training regime. Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting. Merging versus Ensembling in Multi-Study Machine Learning: Theoretical Insight from Random Effects. Neural Bug Finding: A Study of Opportunities and Challenges.
Joint Learning of Semantic Alignment and Object Landmark Detection
Title | Joint Learning of Semantic Alignment and Object Landmark Detection |
Authors | Sangryul Jeon, Dongbo Min, Seungryong Kim, Kwanghoon Sohn |
Abstract | Convolutional neural network (CNN) based approaches for semantic alignment and object landmark detection have improved their performance significantly. Current efforts for the two tasks focus on addressing the lack of massive training data through weakly- or unsupervised learning frameworks. In this paper, we present a joint learning approach for obtaining dense correspondences and discovering object landmarks from semantically similar images. Based on the key insight that the two tasks can mutually provide supervision to each other, our networks accomplish this through a joint loss function that alternately imposes a consistency constraint between the two tasks, thereby boosting the performance and addressing the lack of training data in a principled manner. To the best of our knowledge, this is the first attempt to address the lack of training data for the two tasks through joint learning. To further improve the robustness of our framework, we introduce a probabilistic learning formulation that allows only reliable matches to be used in the joint learning process. With the proposed method, state-of-the-art performance is attained on several standard benchmarks for semantic matching and landmark detection, including a newly introduced dataset, JLAD, which contains a larger number of challenging image pairs than existing datasets. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.00754v1 |
PDF | https://arxiv.org/pdf/1910.00754v1.pdf |
PWC | https://paperswithcode.com/paper/joint-learning-of-semantic-alignment-and |
Repo | |
Framework | |
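The coupling between the two tasks is the core idea here. As a rough sketch of the consistency term, not the authors' implementation (all shapes, names, and the gating scheme are assumptions), landmarks warped by the estimated dense flow should agree with landmarks detected in the other image, with unreliable matches down-weighted:

```python
import torch

def joint_consistency_loss(flow_ab, kpts_a, kpts_b, conf):
    """flow_ab: (H, W, 2) dense displacement field from image A to image B.
    kpts_a, kpts_b: (K, 2) landmark coordinates (x, y) in each image.
    conf: (K,) reliability of each match in [0, 1]."""
    H, W, _ = flow_ab.shape
    x = kpts_a[:, 0].long().clamp(0, W - 1)
    y = kpts_a[:, 1].long().clamp(0, H - 1)
    warped_a = kpts_a + flow_ab[y, x]  # transport A's landmarks with the flow
    # Require transported landmarks to agree with B's detections,
    # letting only reliable matches drive the joint training signal.
    return (conf * (warped_a - kpts_b).pow(2).sum(dim=1)).mean()

loss = joint_consistency_loss(torch.randn(64, 64, 2),
                              torch.rand(10, 2) * 63,
                              torch.rand(10, 2) * 63,
                              torch.rand(10))
print(loss)
```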
Analysing Russian Trolls via NLP tools
Title | Analysing Russian Trolls via NLP tools |
Authors | Bokun Kong |
Abstract | The fifty-eighth American presidential election, held in 2016, still arouses fierce controversy at present. A portion of politicians, as well as media outlets and voters, believe that the Russian government interfered with the 2016 election by controlling malicious social media accounts on Twitter, such as troll and bot accounts, which broadcast fake news, derail conversations about the election, and mislead people. Therefore, this paper focuses on analysing a Twitter dataset about the 2016 election using NLP methods, looking for interesting patterns that indicate whether or not the Russian government interfered with the election. We apply a topic model to the given Twitter dataset to extract interesting topics and analyse their meaning, then implement a supervised topic model to retrieve the relationship between each topic and its category (left troll or right troll) and analyse the pattern. Additionally, we perform sentiment analysis to analyse the attitude of each tweet: after extracting typical tweets from an interesting topic, sentiment analysis tells us whether a tweet supports that topic or not. Based on comprehensive analysis and evaluation, we find interesting patterns in the dataset as well as some meaningful topics. |
Tasks | Sentiment Analysis |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.11067v1 |
PDF | https://arxiv.org/pdf/1911.11067v1.pdf |
PWC | https://paperswithcode.com/paper/analysing-russian-trolls-via-nlp-tools |
Repo | |
Framework | |
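A minimal sketch of the pipeline the abstract describes (topic extraction plus sentiment), using toy tweets and a crude lexicon in place of the study's troll-tweet corpus and supervised topic model:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "the election is rigged fake news everywhere",
    "great rally tonight vote for change",
    "media lies about the candidates again",
    "so proud of the voters today",
]

# Topic extraction with LDA on raw token counts.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(f"topic {k}:", [terms[i] for i in topic.argsort()[-3:]])

# Toy lexicon-based sentiment as a stand-in for a real sentiment model.
POS, NEG = {"great", "proud"}, {"rigged", "fake", "lies"}
for t in tweets:
    words = set(t.split())
    print(t, "->", len(words & POS) - len(words & NEG))
```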
QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions
Title | QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions |
Authors | Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark |
Abstract | We introduce the first open-domain dataset, called QuaRTz, for reasoning about textual qualitative relationships. QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer.", twinned with 3864 crowdsourced situated questions, e.g., "Billy is wearing sunscreen with a lower SPF than Lucy. Who will be best protected from the sun?", plus annotations of the properties being compared. Unlike previous datasets, the general knowledge is textual and not tied to a fixed set of relationships, and QuaRTz tests a system's ability to comprehend and apply textual qualitative knowledge in a novel setting. We find state-of-the-art results are substantially (20%) below human performance, presenting an open challenge to the NLP community. |
Tasks | |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03553v1 |
PDF | https://arxiv.org/pdf/1909.03553v1.pdf |
PWC | https://paperswithcode.com/paper/quartz-an-open-domain-dataset-of-qualitative |
Repo | |
Framework | |
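To make the task concrete, here is a deliberately naive illustration of the qualitative reasoning QuaRTz tests, hard-coded to the abstract's sunscreen example; a real QuaRTz system must handle open-domain knowledge rather than a fixed word list:

```python
HIGHER, LOWER = {"higher", "more", "greater"}, {"lower", "less", "smaller"}

def polarity(text):
    words = set(text.lower().replace(".", " ").replace("?", " ").split())
    return 1 if words & HIGHER else (-1 if words & LOWER else 0)

knowledge = "A sunscreen with a higher SPF protects the skin longer."
question = ("Billy is wearing sunscreen with a lower SPF than Lucy. "
            "Who will be best protected from the sun?")
# Knowledge: higher SPF -> more protection. Billy is on the "lower" side,
# so when the polarities disagree, the other person is best protected.
print("Lucy" if polarity(question) != polarity(knowledge) else "Billy")
```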
Complexity Results and Algorithms for Bipolar Argumentation
Title | Complexity Results and Algorithms for Bipolar Argumentation |
Authors | Amin Karamlou, Kristijonas Čyras, Francesca Toni |
Abstract | Bipolar Argumentation Frameworks (BAFs) admit several interpretations of the support relation and diverging definitions of semantics. Recently, several classes of BAFs have been captured as instances of bipolar Assumption-Based Argumentation, a class of Assumption-Based Argumentation (ABA). In this paper, we establish the complexity of bipolar ABA, and consequently of several classes of BAFs. In addition to the standard five complexity problems, we analyse the rarely-addressed extension enumeration problem too. We also advance backtracking-driven algorithms for enumerating extensions of bipolar ABA frameworks, and consequently of BAFs under several interpretations. We prove soundness and completeness of our algorithms, describe their implementation and provide a scalability evaluation. We thus contribute to the study of the as yet uninvestigated complexity problems of (variously interpreted) BAFs as well as of bipolar ABA, and provide the lacking implementations thereof. |
Tasks | |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.01964v1 |
PDF | http://arxiv.org/pdf/1903.01964v1.pdf |
PWC | https://paperswithcode.com/paper/complexity-results-and-algorithms-for-bipolar |
Repo | |
Framework | |
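The backtracking pattern behind the enumeration algorithms can be shown in a much simpler setting. The sketch below enumerates conflict-free sets of a plain argumentation framework by branching on each argument; the paper's algorithms operate on bipolar ABA frameworks with assumptions, contraries, and support, which this toy omits:

```python
def conflict_free(args, attacks):
    """args: list of arguments; attacks: set of (attacker, attacked) pairs."""
    out = []

    def backtrack(i, chosen):
        if i == len(args):
            out.append(frozenset(chosen))
            return
        a = args[i]
        # Branch 1: include args[i] if it neither attacks nor is attacked
        # by anything already chosen (including itself).
        if all((a, b) not in attacks and (b, a) not in attacks
               for b in chosen | {a}):
            backtrack(i + 1, chosen | {a})
        # Branch 2: exclude args[i].
        backtrack(i + 1, chosen)

    backtrack(0, set())
    return out

print(conflict_free(["a", "b", "c"], {("a", "b"), ("b", "c")}))
```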
Unsupervised Object Segmentation with Explicit Localization Module
Title | Unsupervised Object Segmentation with Explicit Localization Module |
Authors | Weitang Liu, Lifeng Wei, James Sharpnack, John D. Owens |
Abstract | In this paper, we propose a novel architecture that iteratively discovers and segments out the objects of a scene based on the image reconstruction quality. Different from other approaches, our model uses an explicit localization module that localizes objects of the scene based on the pixel-level reconstruction qualities at each iteration, where simpler objects tend to be reconstructed better at earlier iterations and thus are segmented out first. We show that our localization module improves the quality of the segmentation, especially on challenging backgrounds. |
Tasks | Image Reconstruction, Semantic Segmentation |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09228v1 |
PDF | https://arxiv.org/pdf/1911.09228v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-object-segmentation-with |
Repo | |
Framework | |
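A toy numpy rendering of the iterative localize-then-segment loop, with a faked per-pixel error map standing in for the learned reconstruction:

```python
import numpy as np

rng = np.random.default_rng(0)
# Pretend a model reconstructs a simple region well: low error on the
# left half (a "simple object"), high error elsewhere.
error = np.where(np.arange(32)[None, :] < 16,
                 0.05 * rng.random((32, 32)),
                 rng.random((32, 32)))

remaining = np.ones((32, 32), dtype=bool)
segments = []
for step in range(2):
    # Localize: threshold per-pixel error among not-yet-segmented pixels.
    err = np.where(remaining, error, np.inf)
    mask = err < np.quantile(err[remaining], 0.5)
    segments.append(mask)       # well-reconstructed region peels off first
    remaining &= ~mask
    print(f"iteration {step}: segmented {mask.sum()} pixels")
```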
A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics
Title | A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics |
Authors | Alberto Dalla Libera, Ruggero Carli |
Abstract | In this paper, we introduce a novel data-driven inverse dynamics estimator based on Gaussian Process Regression. Driven by the fact that the inverse dynamics can be described as a polynomial function on a suitable input space, we propose the use of a novel kernel, called the Geometrically Inspired Polynomial Kernel (GIP). The resulting estimator behaves similarly to model-based approaches in terms of data efficiency. Indeed, we prove that the GIP kernel defines a finite-dimensional Reproducing Kernel Hilbert Space that contains the inverse dynamics function computed through Rigid Body Dynamics. The proposed kernel is based on the recently introduced Multiplicative Polynomial Kernel, a redefinition of the classical polynomial kernel equipped with a set of parameters that allows for higher regularization. We tested the proposed approach in a simulated environment and in real experiments with a UR10 robot. The obtained results confirm that, compared to other data-driven estimators, the proposed approach is more data-efficient and exhibits better generalization properties. With respect to model-based estimators, our approach requires less prior information and is not affected by model bias. |
Tasks | |
Published | 2019-04-30 |
URL | https://arxiv.org/abs/1904.13317v4 |
PDF | https://arxiv.org/pdf/1904.13317v4.pdf |
PWC | https://paperswithcode.com/paper/a-novel-geometrically-inspired-polynomial |
Repo | |
Framework | |
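As a hedged sketch of the idea, the kernel below multiplies low-degree polynomial kernels over blocks of the input, in the spirit of the Multiplicative Polynomial Kernel, and plugs it into plain GP regression; block structure, degrees, and hyperparameters are illustrative choices, not the GIP construction itself:

```python
import numpy as np

def mpk(X1, X2, blocks, degree=2, c=1.0):
    """Product over input blocks of (c + <x1_b, x2_b>)^degree."""
    K = np.ones((X1.shape[0], X2.shape[0]))
    for b in blocks:
        K *= (c + X1[:, b] @ X2[:, b].T) ** degree
    return K

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))       # stand-in for joint positions/velocities
y = X[:, 0] * X[:, 1] + X[:, 2] ** 2   # a polynomial "inverse dynamics" target
blocks = [slice(0, 2), slice(2, 4)]

alpha = np.linalg.solve(mpk(X, X, blocks) + 1e-3 * np.eye(len(X)), y)
X_test = rng.standard_normal((5, 4))
y_pred = mpk(X_test, X, blocks) @ alpha  # GP posterior mean
print(np.round(y_pred - (X_test[:, 0] * X_test[:, 1] + X_test[:, 2] ** 2), 3))  # ~0
```

The product structure keeps the feature space finite-dimensional and polynomial, which is why a target of this form sits inside the induced RKHS.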
Approximate Cross-Validation in High Dimensions with Guarantees
Title | Approximate Cross-Validation in High Dimensions with Guarantees |
Authors | William Stephenson, Tamara Broderick |
Abstract | Leave-one-out cross validation (LOOCV) can be particularly accurate among cross validation (CV) variants for estimating out-of-sample error. But it is expensive to re-fit a model $N$ times for a dataset of size $N$. Previous work has shown that approximations to LOOCV can be both fast and accurate – when the unknown parameter is of small, fixed dimension. However, these approximations incur a running time roughly cubic in dimension – and we show that, even when computed perfectly, their accuracy dramatically deteriorates in high dimensions. Authors have suggested many potential and seemingly intuitive solutions, but these methods have not yet been systematically evaluated or compared. In our analysis, we find that all but one perform so poorly as to be unusable for approximating LOOCV. Crucially, though, we are able to show, both empirically and theoretically, that one approximation can perform well in high dimensions – in cases where the high-dimensional parameter exhibits sparsity. Under interpretable assumptions, our theory demonstrates that the problem can be reduced to working within an empirically recovered (small) support. The corresponding algorithm is straightforward to implement, and we prove that its running time and error depend on the (small) support size even when the full parameter dimension is large. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13657v3 |
PDF | https://arxiv.org/pdf/1905.13657v3.pdf |
PWC | https://paperswithcode.com/paper/sparse-approximate-cross-validation-for-high |
Repo | |
Framework | |
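For intuition on why LOOCV need not cost N refits, the classical exact shortcut for ridge regression obtains all N leave-one-out residuals from a single fit via the hat matrix; the paper studies approximations of this flavour for general high-dimensional models:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, lam = 40, 10, 1.0
X = rng.standard_normal((N, D))
y = X @ rng.standard_normal(D) + 0.1 * rng.standard_normal(N)

H = X @ np.linalg.solve(X.T @ X + lam * np.eye(D), X.T)  # hat matrix
residual = y - H @ y
loo_residual = residual / (1 - np.diag(H))               # one-fit LOOCV
print("fast LOOCV error:", np.mean(loo_residual ** 2))

# Brute-force check: actually refit N times.
errs = []
for i in range(N):
    m = np.ones(N, dtype=bool); m[i] = False
    beta = np.linalg.solve(X[m].T @ X[m] + lam * np.eye(D), X[m].T @ y[m])
    errs.append((y[i] - X[i] @ beta) ** 2)
print("brute-force LOOCV error:", np.mean(errs))  # matches the fast estimate
```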
DAWN: Dynamic Adversarial Watermarking of Neural Networks
Title | DAWN: Dynamic Adversarial Watermarking of Neural Networks |
Authors | Sebastian Szyller, Buse Gul Atli, Samuel Marchal, N. Asokan |
Abstract | Training machine learning (ML) models is expensive in terms of computational power, amounts of labeled data and human expertise. Thus, ML models constitute intellectual property (IP) and business value for their owners. Embedding digital watermarks during model training allows a model owner to later identify their models in case of theft or misuse. However, model functionality can also be stolen via model extraction, where an adversary trains a surrogate model using results returned from a prediction API of the original model. Recent work has shown that model extraction is a realistic threat. Existing watermarking schemes are ineffective against IP theft via model extraction since it is the adversary who trains the surrogate model. In this paper, we introduce DAWN (Dynamic Adversarial Watermarking of Neural Networks), the first approach to use watermarking to deter model extraction IP theft. Unlike prior watermarking schemes, DAWN does not impose changes to the training process but it operates at the prediction API of the protected model, by dynamically changing the responses for a small subset of queries (e.g., <0.5%) from API clients. This set is a watermark that will be embedded in case a client uses its queries to train a surrogate model. We show that DAWN is resilient against two state-of-the-art model extraction attacks, effectively watermarking all extracted surrogate models, allowing model owners to reliably demonstrate ownership (with confidence $>1- 2^{-64}$), incurring negligible loss of prediction accuracy (0.03-0.5%). |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00830v3 |
PDF | https://arxiv.org/pdf/1906.00830v3.pdf |
PWC | https://paperswithcode.com/paper/190600830 |
Repo | |
Framework | |
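A minimal sketch of the DAWN idea (not the authors' implementation): the prediction API deterministically alters its answers for a tiny, secret-keyed subset of inputs, so a surrogate trained on those answers inherits the altered points as a watermark:

```python
import hmac, hashlib

SECRET = b"owner-secret-key"
RATE = 0.005       # e.g. <0.5% of queries get watermarked
NUM_CLASSES = 10

def watermarked_predict(model_predict, x_bytes):
    digest = hmac.new(SECRET, x_bytes, hashlib.sha256).digest()
    r = int.from_bytes(digest[:8], "big") / 2 ** 64  # keyed, deterministic
    label = model_predict(x_bytes)
    if r < RATE:
        # A wrong-but-stable answer derived from the key, reproducible
        # later when demonstrating ownership of an extracted surrogate.
        return (label + 1 + digest[8] % (NUM_CLASSES - 1)) % NUM_CLASSES
    return label

fake_model = lambda x: len(x) % NUM_CLASSES
print(watermarked_predict(fake_model, b"some query input"))
```

Because the selection is a keyed hash of the input, the same query always gets the same response, so the adversary cannot detect the watermark by repeating queries.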
Deep learning analysis of coronary arteries in cardiac CT angiography for detection of patients requiring invasive coronary angiography
Title | Deep learning analysis of coronary arteries in cardiac CT angiography for detection of patients requiring invasive coronary angiography |
Authors | Majd Zreik, Robbert W. van Hamersvelt, Nadieh Khalili, Jelmer M. Wolterink, Michiel Voskuil, Max A. Viergever, Tim Leiner, Ivana Išgum |
Abstract | In patients with obstructive coronary artery disease, the functional significance of a coronary artery stenosis needs to be determined to guide treatment. This is typically established through fractional flow reserve (FFR) measurement, performed during invasive coronary angiography (ICA). We present a method for automatic and non-invasive detection of patients requiring ICA, employing deep unsupervised analysis of complete coronary arteries in cardiac CT angiography (CCTA) images. We retrospectively collected CCTA scans of 187 patients, 137 of whom underwent invasive FFR measurement in 192 different coronary arteries. These FFR measurements served as a reference standard for the functional significance of the coronary stenosis. The centerlines of the coronary arteries were extracted and used to reconstruct straightened multi-planar reformatted (MPR) volumes. To automatically identify arteries with functionally significant stenosis that require ICA, each MPR volume was encoded into a fixed number of encodings using two disjoint 3D and 1D convolutional autoencoders performing spatial and sequential encodings, respectively. Thereafter, these encodings were employed to classify arteries using a support vector machine classifier. The detection of coronary arteries requiring invasive evaluation, evaluated using repeated cross-validation experiments, resulted in an area under the receiver operating characteristic curve of $0.81 \pm 0.02$ at the artery level, and $0.87 \pm 0.02$ at the patient level. The results demonstrate the feasibility of automatic non-invasive detection of patients that require ICA and possibly subsequent coronary artery intervention. This could potentially reduce the number of patients that unnecessarily undergo ICA. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04419v2 |
PDF | https://arxiv.org/pdf/1906.04419v2.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-analysis-of-cardiac-ct |
Repo | |
Framework | |
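A schematic stand-in for the pipeline, with PCA on fake flattened "arteries" in place of the 3D/1D convolutional autoencoders, followed by an SVM evaluated by cross-validation (AUC is ~0.5 here by construction, since the data are random):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
arteries = rng.standard_normal((192, 500))   # flattened MPR volumes (fake)
labels = rng.integers(0, 2, size=192)        # FFR-derived significance (fake)

# Unsupervised encoding step; the paper uses convolutional autoencoders.
encodings = PCA(n_components=16).fit_transform(arteries)
scores = cross_val_score(SVC(kernel="rbf"), encodings, labels,
                         cv=5, scoring="roc_auc")
print("AUC:", scores.mean())
```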
Probabilistic Radiomics: Ambiguous Diagnosis with Controllable Shape Analysis
Title | Probabilistic Radiomics: Ambiguous Diagnosis with Controllable Shape Analysis |
Authors | Jiancheng Yang, Rongyao Fang, Bingbing Ni, Yamin Li, Yi Xu, Linguo Li |
Abstract | Radiomics analysis has achieved great success in recent years. However, conventional Radiomics analysis suffers from insufficiently expressive hand-crafted features. Recently, emerging deep learning techniques, e.g., convolutional neural networks (CNNs), have come to dominate research in Computer-Aided Diagnosis (CADx). Unfortunately, as black-box predictors, we argue that CNNs are “diagnosing” voxels (or pixels) rather than lesions; in other words, visual saliency from a trained CNN is not necessarily concentrated on the lesions. On the other hand, classification in clinical applications suffers from inherent ambiguities: radiologists may produce diverse diagnoses on challenging cases. To this end, we propose a controllable and explainable Probabilistic Radiomics framework, combining Radiomics analysis with probabilistic deep learning. In our framework, 3D CNN features are extracted from the lesion region only, then encoded into a lesion representation by a controllable Non-local Shape Analysis Module (NSAM) based on self-attention. Inspired by variational auto-encoders (VAEs), an Ambiguity PriorNet is used to approximate the ambiguity distribution over human experts. The final diagnosis is obtained by combining the ambiguity prior sample and the lesion representation, and the whole network, named DenseSharp+, is end-to-end trainable. We apply the proposed method to lung nodule diagnosis on the LIDC-IDRI database to validate its effectiveness. |
Tasks | |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.08878v1 |
PDF | https://arxiv.org/pdf/1910.08878v1.pdf |
PWC | https://paperswithcode.com/paper/probabilistic-radiomics-ambiguous-diagnosis |
Repo | |
Framework | |
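A conceptual sketch only: combining a lesion representation with samples from an ambiguity prior so that one input yields a distribution of diagnoses, mimicking disagreement among radiologists. Dimensions and the combination rule are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
lesion_repr = rng.standard_normal(8)       # stand-in for the NSAM lesion encoding
mu, log_sigma = np.zeros(4), np.zeros(4)   # stand-in for the learned ambiguity prior
W = rng.standard_normal((12, 2))           # diagnosis head (2 classes)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Each prior sample yields one plausible "expert" diagnosis, so a single
# lesion maps to a distribution over diagnoses rather than a point estimate.
for _ in range(3):
    z = mu + np.exp(log_sigma) * rng.standard_normal(4)
    print(np.round(softmax(np.concatenate([lesion_repr, z]) @ W), 3))
```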
Autoregressive Models for Sequences of Graphs
Title | Autoregressive Models for Sequences of Graphs |
Authors | Daniele Zambon, Daniele Grattarola, Lorenzo Livi, Cesare Alippi |
Abstract | This paper proposes an autoregressive (AR) model for sequences of graphs, which generalises traditional AR models. A first novelty consists in formalising the AR model for a very general family of graphs, characterised by a variable topology and attributes associated with nodes and edges. A graph neural network (GNN) is also proposed to learn the AR function associated with the graph-generating process (GGP), and subsequently predict the next graph in a sequence. The proposed method is compared with four baselines on synthetic GGPs, showing significantly better performance on all considered problems. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07299v1 |
PDF | http://arxiv.org/pdf/1903.07299v1.pdf |
PWC | https://paperswithcode.com/paper/autoregressive-models-for-sequences-of-graphs |
Repo | |
Framework | |
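A toy rendering of the "AR over graphs" framing: an order-1 autoregressive map fit by least squares on vectorised adjacency matrices. The paper instead learns the AR function with a GNN over attributed, variable-topology graphs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 60
graphs = [rng.random((n, n)) < 0.3]
for _ in range(T - 1):          # toy graph-generating process: slow edge flips
    g = graphs[-1].copy()
    flip = rng.random((n, n)) < 0.05
    graphs.append(np.where(flip, ~g, g))

X = np.stack([g.ravel().astype(float) for g in graphs])
A = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0]  # linear AR(1) map
pred = (X[-2] @ A) > 0.5                           # predict the last graph
print("edge accuracy:", (pred == X[-1].astype(bool)).mean())
```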
Temporal-difference learning for nonlinear value function approximation in the lazy training regime
Title | Temporal-difference learning for nonlinear value function approximation in the lazy training regime |
Authors | Andrea Agazzi, Jianfeng Lu |
Abstract | We discuss the approximation of the value function for infinite-horizon discounted Markov Decision Processes (MDPs) with nonlinear functions trained with the Temporal-Difference (TD) learning algorithm. We consider this problem under a certain scaling of the approximating function, leading to a regime called lazy training. In this regime the parameters of the model vary only slightly during the learning process, a feature that has recently been observed in the training of neural networks, where the scaling we study arises naturally, implicit in the initialization of their parameters. In both the under- and over-parametrized frameworks, we prove exponential convergence of the above algorithm in the lazy training regime to local and global minimizers, respectively. We then give examples of such convergence results in the case of models that diverge if trained with non-lazy TD learning, and in the case of neural networks. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10917v2 |
PDF | https://arxiv.org/pdf/1905.10917v2.pdf |
PWC | https://paperswithcode.com/paper/temporal-difference-learning-for-nonlinear |
Repo | |
Framework | |
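A toy numerical illustration of the lazy regime (not the paper's analysis): with the value function scaled as alpha * f(theta, s) and the step size scaled as 1/alpha^2, parameter movement during TD(0) shrinks as alpha grows. The chain, features, and step sizes are arbitrary choices:

```python
import numpy as np

def td_lazy(alpha, steps=5000, eta=0.05, gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    P = np.array([[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.9, 0.0, 0.1]])
    r = np.array([1.0, 0.0, -1.0])
    theta = np.zeros((3, 4))        # value(s) = alpha * sum(tanh(theta[s]))
    s = 0
    for _ in range(steps):
        s2 = rng.choice(3, p=P[s])
        v = alpha * np.tanh(theta[s]).sum()
        v2 = alpha * np.tanh(theta[s2]).sum()
        delta = r[s] + gamma * v2 - v                 # TD error
        grad = alpha * (1 - np.tanh(theta[s]) ** 2)   # d value / d theta[s]
        theta[s] += (eta / alpha ** 2) * delta * grad  # lazy-regime step size
        s = s2
    return np.abs(theta).max()      # movement from the zero initialization

for a in (1.0, 10.0, 100.0):
    print(f"alpha={a:>5}: max parameter movement = {td_lazy(a):.4f}")
```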
Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting
Title | Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting |
Authors | Maria De-Arteaga, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Adam Tauman Kalai |
Abstract | We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes for people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators, such as first names and pronouns, in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are “scrubbed,” and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances. |
Tasks | |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09451v1 |
PDF | http://arxiv.org/pdf/1901.09451v1.pdf |
PWC | https://paperswithcode.com/paper/bias-in-bios-a-case-study-of-semantic |
Repo | |
Framework | |
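A sketch of the scrubbing experiment on invented stand-in bios: strip explicit gender indicators, train an occupation classifier, and compare per-gender true positive rates (here computed, toy-style, on the training bios):

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

PRONOUNS = re.compile(r"\b(he|she|his|her|him|hers|mr|ms|mrs)\b", re.I)

bios = ["She is a nurse who loves her patients",
        "He is a surgeon and he operates daily",
        "She is a surgeon with her own practice",
        "He is a nurse and he works nights"]
occupation = ["nurse", "surgeon", "surgeon", "nurse"]
gender = ["F", "M", "F", "M"]

scrubbed = [PRONOUNS.sub("", b) for b in bios]   # remove gender indicators
X = TfidfVectorizer().fit_transform(scrubbed)
preds = LogisticRegression().fit(X, occupation).predict(X)

for g in ("F", "M"):  # per-gender true positive rate for "surgeon"
    idx = [i for i in range(4) if gender[i] == g and occupation[i] == "surgeon"]
    print(g, "surgeon TPR:", sum(preds[i] == "surgeon" for i in idx) / len(idx))
```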
Merging versus Ensembling in Multi-Study Machine Learning: Theoretical Insight from Random Effects
Title | Merging versus Ensembling in Multi-Study Machine Learning: Theoretical Insight from Random Effects |
Authors | Zoe Guan, Giovanni Parmigiani, Prasad Patil |
Abstract | A critical decision point when training predictors using multiple studies is whether these studies should be combined or treated separately. We compare two multi-study learning approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets. We consider 1) merging all of the datasets and training a single learner, and 2) cross-study learning, which involves training a separate learner on each dataset and combining the resulting predictions. In a linear regression setting, we show analytically and confirm via simulation that merging yields lower prediction error than cross-study learning when the predictor-outcome relationships are relatively homogeneous across studies. However, as heterogeneity increases, there exists a transition point beyond which cross-study learning outperforms merging. We provide analytic expressions for the transition point in various scenarios and study asymptotic properties. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07382v1 |
PDF | https://arxiv.org/pdf/1905.07382v1.pdf |
PWC | https://paperswithcode.com/paper/merging-versus-ensembling-in-multi-study |
Repo | |
Framework | |
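A small simulation in the spirit of the comparison, under stated simplifications: studies share a mean coefficient plus a random effect of scale tau, and "treating studies separately" here means applying each study's own fit to that study's fresh data, rather than the paper's weighted cross-study ensemble. Merging wins for small tau; the separate fits win once heterogeneity is large:

```python
import numpy as np

rng = np.random.default_rng(1)
K, n, D, sigma = 5, 15, 10, 1.0
beta_mean = rng.standard_normal(D)

def compare(tau, reps=10):
    merged_err, separate_err = [], []
    for _ in range(reps):
        Xs, ys, betas = [], [], []
        for _ in range(K):
            b = beta_mean + tau * rng.standard_normal(D)  # random effect
            X = rng.standard_normal((n, D))
            Xs.append(X); ys.append(X @ b + sigma * rng.standard_normal(n))
            betas.append(b)
        merged = np.linalg.lstsq(np.vstack(Xs), np.hstack(ys), rcond=None)[0]
        for X, y, b in zip(Xs, ys, betas):
            sep = np.linalg.lstsq(X, y, rcond=None)[0]
            Xt = rng.standard_normal((400, D))        # fresh data, same study
            yt = Xt @ b + sigma * rng.standard_normal(400)
            merged_err.append(np.mean((Xt @ merged - yt) ** 2))
            separate_err.append(np.mean((Xt @ sep - yt) ** 2))
    return np.mean(merged_err), np.mean(separate_err)

for tau in (0.0, 0.5, 1.5):
    m, s = compare(tau)
    print(f"tau={tau}: merged MSE={m:.2f}  separate MSE={s:.2f}")
```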
Neural Bug Finding: A Study of Opportunities and Challenges
Title | Neural Bug Finding: A Study of Opportunities and Challenges |
Authors | Andrew Habib, Michael Pradel |
Abstract | Static analysis is one of the most widely adopted techniques to find software bugs before code is put in production. Designing and implementing effective and efficient static analyses is difficult and requires high expertise, with the result that only a few experts are able to write such analyses. This paper explores the opportunities and challenges of an alternative way of creating static bug detectors: neural bug finding. The basic idea is to formulate bug detection as a classification problem, and to address this problem with neural networks trained on examples of buggy and non-buggy code. We systematically study the effectiveness of this approach based on code examples labeled by a state-of-the-art static bug detector. Our results show that neural bug finding is surprisingly effective for some bug patterns, sometimes reaching a precision and recall of over 80%, but also that it struggles to understand some program properties that are obvious to a traditional analysis. A qualitative analysis of the results provides insights into why neural bug finders sometimes work and sometimes do not. We also identify pitfalls in selecting the code examples used to train and validate neural bug finders, and propose an algorithm for selecting effective training data. |
Tasks | |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00307v1 |
PDF | https://arxiv.org/pdf/1906.00307v1.pdf |
PWC | https://paperswithcode.com/paper/190600307 |
Repo | |
Framework | |
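A stand-in sketch of "bug detection as classification" with invented snippets and labels: a small neural model over character n-grams, where the real setup trains on code labeled by a static bug detector:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

snippets = [
    "if (x = 5) { run(); }",                # assignment in condition: buggy
    "if (x == 5) { run(); }",
    "for (i = 0; i <= n; i++) a[i] = 0;",   # off-by-one: buggy
    "for (i = 0; i < n; i++) a[i] = 0;",
]
labels = [1, 0, 1, 0]  # as a static analyzer might flag them

vec = CountVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X = vec.fit_transform(snippets)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, labels)
print(clf.predict(vec.transform(["while (y = next()) use(y);"])))
```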