January 27, 2020


Paper Group ANR 1259


Method and System for Image Analysis to Detect Cancer


Title	Method and System for Image Analysis to Detect Cancer
Authors	Waleed A. Yousef, Ahmed A. Abouelkahire, Deyaaeldeen Almahallawi, Omar S. Marzouk, Sameh K. Mohamed, Waleed A. Mustafa, Omar M. Osama, Ali A. Saleh, Naglaa M. Abdelrazek
Abstract	Breast cancer is the most common cancer and is the leading cause of cancer death among women worldwide. Detection of breast cancer, while it is still small and confined to the breast, provides the best chance of effective treatment. Computer Aided Detection (CAD) systems that detect cancer from mammograms will help in reducing the human errors that lead to missing breast carcinoma. The literature is rich in scientific papers on methods of CAD design, yet with no complete system architecture to deploy those methods. On the other hand, commercial CADs are developed and deployed only to vendors’ mammography machines, with no public access. This paper presents a complete CAD; it is complete since it combines, on the one hand, the rigor of algorithm design and assessment (method), and, on the other hand, the implementation and deployment of a system architecture for public accessibility (system). (1) We develop a novel algorithm for image enhancement so that mammograms acquired from any digital mammography machine appear qualitatively of the same clarity under radiologists’ inspection and are quantitatively standardized for the detection algorithms. (2) We develop novel algorithms for masses and microcalcifications detection with accuracy superior to both literature results and the majority of approved commercial systems. (3) We design, implement, and deploy a system architecture that is computationally effective to allow for deploying these algorithms to the cloud for public access.
Tasks	Image Enhancement
Published	2019-08-26
URL	https://arxiv.org/abs/1908.10661v1
PDF	https://arxiv.org/pdf/1908.10661v1.pdf
PWC	https://paperswithcode.com/paper/method-and-system-for-image-analysis-to
Repo
Framework

Three Branches: Detecting Actions With Richer Features


Title	Three Branches: Detecting Actions With Richer Features
Authors	Jin Xia, Jiajun Tang, Cewu Lu
Abstract	We present our three-branch solution for the International Challenge on Activity Recognition at CVPR 2019. This model seeks to fuse richer information from the global video clip, short human attention and long-term human activity into a unified model. We have participated in two tasks: Task A, the Kinetics challenge, and Task B, the spatio-temporal action localization challenge. For Kinetics, we achieve a 21.59% error rate. For the AVA challenge, our final model obtains 32.49% mAP on the test sets, which outperforms all submissions to the AVA challenge at CVPR 2018 by more than 10% mAP. As future work, we will introduce human activity knowledge, which is a new dataset including key information of human activity.
Tasks	Action Localization, Activity Recognition, Spatio-Temporal Action Localization, Temporal Action Localization
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04519v1
PDF	https://arxiv.org/pdf/1908.04519v1.pdf
PWC	https://paperswithcode.com/paper/three-branches-detecting-actions-with-richer
Repo
Framework

Submission to ActivityNet Challenge 2019: Task B Spatio-temporal Action Localization


Title	Submission to ActivityNet Challenge 2019: Task B Spatio-temporal Action Localization
Authors	Chunfei Ma, Joonhyang Choi, Byeongwon Lee, Seungji Yang
Abstract	This technical report presents an overview of our system proposed for the spatio-temporal action localization (SAL) task in ActivityNet Challenge 2019. Unlike previous two-stream-based works, we focus on exploring an end-to-end trainable architecture using only RGB sequential images. To this end, we employ a previously proposed simple yet effective two-branch network called SlowFast Networks, which is capable of capturing both short- and long-term spatiotemporal features. Moreover, to handle the severe class imbalance and overfitting problems, we propose a correlation-preserving data augmentation method and a random label subsampling method, which have been proven to reduce overfitting and improve performance.
Tasks	Action Localization, Data Augmentation, Spatio-Temporal Action Localization, Temporal Action Localization
Published	2019-07-25
URL	https://arxiv.org/abs/1907.10837v1
PDF	https://arxiv.org/pdf/1907.10837v1.pdf
PWC	https://paperswithcode.com/paper/submission-to-activitynet-challenge-2019-task
Repo
Framework

Fixing Bias in Reconstruction-based Anomaly Detection with Lipschitz Discriminators


Title	Fixing Bias in Reconstruction-based Anomaly Detection with Lipschitz Discriminators
Authors	Alexander Tong, Roozbah Yousefzadeh, Guy Wolf, Smita Krishnaswamy
Abstract	Anomaly detection is a problem of great interest in medicine, finance, and other fields where error and fraud need to be detected and corrected. Existing deep learning methods for anomaly detection almost universally rely on autoencoder reconstruction error. Here we show that this approach exhibits intrinsic biases, which can lead to undesirable results. We show that reconstruction-based methods are sensitive to outliers and are biased towards reconstructing any outliers in the training data, and outliers within the convex hull of the data. For these reasons, we introduce a new discriminator-based unsupervised Lipschitz anomaly discriminator (LAD), which does not suffer as much from these biases. We train a Wasserstein discriminator, similar to the ones used in GANs, to detect the difference between the training data and corruptions of the training data. We show that this procedure successfully detects unseen anomalies with guarantees on those that have a certain Wasserstein distance from the data or corrupted training set. These corrections allow us to surpass state-of-the-art anomaly detection methods on MNIST and perform comparably on CIFAR10.
Tasks	Anomaly Detection
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10710v2
PDF	https://arxiv.org/pdf/1905.10710v2.pdf
PWC	https://paperswithcode.com/paper/a-lipschitz-constrained-anomaly-discriminator
Repo
Framework
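
The discriminator idea in the abstract above can be illustrated with a deliberately small sketch. It is not the authors' LAD model: it assumes a linear critic f(x) = w·x with a unit-norm weight vector (which makes f 1-Lipschitz in the Euclidean norm), and a fixed, hypothetical corruption that shifts every training point by (3, 3). Under those assumptions, the critic that best separates the data mean from the corrupted mean is simply the normalized difference of the two means. The function names and the toy data are illustrative only.

```python
import math


def mean(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def fit_linear_critic(data, corrupted):
    """Unit-norm w maximizing mean(w.x over data) - mean(w.x over corrupted).

    A unit-norm linear function is 1-Lipschitz, so this is the best critic
    within the (very restricted) linear family assumed in this sketch.
    """
    diff = [a - b for a, b in zip(mean(data), mean(corrupted))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("means coincide; a linear critic cannot separate them")
    return [d / norm for d in diff]


def anomaly_score(w, x):
    """Higher score = the critic considers x less like the training data."""
    return -sum(wi * xi for wi, xi in zip(w, x))


# Toy training data and a hypothetical corruption (shift by (3, 3)).
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
corrupted = [(x + 3.0, y + 3.0) for x, y in data]

w = fit_linear_critic(data, corrupted)
typical_score = anomaly_score(w, (0.5, 0.5))
unusual_score = anomaly_score(w, (4.0, 4.0))
```

A linear critic only detects shifts of the mean, so zero-mean corruptions would defeat it; a neural critic with an enforced Lipschitz constraint, as the abstract describes, is not limited in this way.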

Using machine learning to construct velocity fields from OH-PLIF images


Title	Using machine learning to construct velocity fields from OH-PLIF images
Authors	Shivam Barwey, Malik Hassanaly, Venkat Raman, Adam Steinberg
Abstract	This work utilizes data-driven methods to morph a series of time-resolved experimental OH-PLIF images into corresponding three-component planar PIV fields in the closed domain of a premixed swirl combustor. The task is carried out with a fully convolutional network, which is a type of convolutional neural network (CNN) used in many applications in machine learning, alongside an existing experimental dataset which consists of simultaneous OH-PLIF and PIV measurements in both attached and detached flame regimes. Two types of models are compared: 1) a global CNN which is trained using images from the entire domain, and 2) a set of local CNNs, which are trained only on individual sections of the domain. The locally trained models show improvement in creating mappings in the detached regime over the global models. A comparison between model performance in attached and detached regimes shows that the CNNs are much more accurate across the board in creating velocity fields for attached flames. Inclusion of time history in the PLIF input resulted in a small but noticeable improvement on average, which could imply a greater physical role of instantaneous spatial correlations in the decoding process over temporal dependencies from the perspective of the CNN. Additionally, the performance of local models trained to produce mappings in one section of the domain is tested on other, unexplored sections of the domain. Interestingly, local CNN performance on unseen domain regions revealed the models’ ability to utilize symmetry and antisymmetry in the velocity field. Ultimately, this work shows the powerful ability of the CNN to decode the three-dimensional PIV fields from input OH-PLIF images, providing a potential groundwork for a very useful tool for experimental configurations in which access to simultaneous measurements is limited.
Tasks
Published	2019-09-22
URL	https://arxiv.org/abs/1909.13669v1
PDF	https://arxiv.org/pdf/1909.13669v1.pdf
PWC	https://paperswithcode.com/paper/using-machine-learning-to-construct-velocity
Repo
Framework

SECNLP: A Survey of Embeddings in Clinical Natural Language Processing


Title	SECNLP: A Survey of Embeddings in Clinical Natural Language Processing
Authors	Kalyan KS, S Sangeetha
Abstract	Traditional representations like bag of words are high dimensional, sparse and ignore the order as well as syntactic and semantic information. Distributed vector representations or embeddings map variable length text to dense fixed length vectors as well as capture the prior knowledge which can be transferred to downstream tasks. Even though embeddings have become the de facto standard for representations in deep learning based NLP tasks in both general and clinical domains, there is no survey paper which presents a detailed review of embeddings in Clinical Natural Language Processing. In this survey paper, we discuss various medical corpora and their characteristics, medical codes and present a brief overview as well as comparison of popular embedding models. We classify clinical embeddings into nine types and discuss each embedding type in detail. We discuss various evaluation methods followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some of the future directions which will advance the research in clinical embeddings.
Tasks
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01039v3
PDF	http://arxiv.org/pdf/1903.01039v3.pdf
PWC	https://paperswithcode.com/paper/secnlp-a-survey-of-embeddings-in-clinical
Repo
Framework
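
The contrast the abstract above draws between bag-of-words vectors and dense embeddings can be made concrete with a small sketch. Everything here is illustrative: the four-word vocabulary, the two-dimensional embedding table and its values are made up, and mean pooling is just one simple way (assumed here, not taken from the survey) to turn a variable-length text into a fixed-length vector.

```python
from collections import Counter


def bag_of_words(text, vocab):
    """One count per vocabulary word; the vector length equals the vocabulary size."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def mean_embedding(text, table, dim):
    """Average the embeddings of known words; the vector length is always `dim`."""
    vectors = [table[word] for word in text.lower().split() if word in table]
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Hypothetical vocabulary and embedding table, for illustration only.
vocab = ["patient", "denies", "pain", "fever"]
table = {
    "patient": [0.2, 0.1],
    "denies": [-0.5, 0.3],
    "pain": [0.4, -0.2],
    "fever": [0.3, -0.1],
}

bow_a = bag_of_words("patient denies pain", vocab)
bow_b = bag_of_words("pain denies patient", vocab)
short_vec = mean_embedding("pain", table, dim=2)
long_vec = mean_embedding("patient denies pain fever", table, dim=2)
```

The two bag-of-words vectors are identical even though the word order differs, and their length grows with the vocabulary, while both embedding vectors have the same fixed length regardless of text length.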

Automated Monitoring Cropland Using Remote Sensing Data: Challenges and Opportunities for Machine Learning


Title	Automated Monitoring Cropland Using Remote Sensing Data: Challenges and Opportunities for Machine Learning
Authors	Xiaowei Jia, Ankush Khandelwal, Vipin Kumar
Abstract	This paper provides an overview of how recent advances in machine learning and the availability of data from earth observing satellites can dramatically improve our ability to automatically map croplands over long periods and over large regions. It discusses three applications in the domain of crop monitoring where ML approaches are beginning to show great promise. For each application, it highlights machine learning challenges, proposed approaches, and recent results. The paper concludes with a discussion of major challenges that need to be addressed before ML approaches will reach their full potential for this problem of great societal relevance.
Tasks
Published	2019-04-08
URL	http://arxiv.org/abs/1904.04329v1
PDF	http://arxiv.org/pdf/1904.04329v1.pdf
PWC	https://paperswithcode.com/paper/automated-monitoring-cropland-using-remote
Repo
Framework

A Graph-structured Dataset for Wikipedia Research


Title	A Graph-structured Dataset for Wikipedia Research
Authors	Nicolas Aspert, Volodymyr Miz, Benjamin Ricaud, Pierre Vandergheynst
Abstract	Wikipedia is a rich and invaluable source of information. Its central place on the Web makes it a particularly interesting object of study for scientists. Researchers from different domains have used various complex datasets related to Wikipedia to study language, social behavior, knowledge organization, and network theory. While being a scientific treasure, the large size of the dataset hinders pre-processing and may be a challenging obstacle for potential new studies. This issue is particularly acute in scientific domains where researchers may not be savvy in technology and data processing. On one hand, the size of Wikipedia dumps is large, which makes the parsing and extraction of relevant information cumbersome. On the other hand, the API is straightforward to use but restricted to a relatively small number of requests. The middle ground is at the mesoscopic scale, when researchers need a subset of Wikipedia ranging from thousands to hundreds of thousands of pages, but there exists no efficient solution at this scale. In this work, we propose an efficient data structure to make requests and access subnetworks of Wikipedia pages and categories. We provide convenient tools for accessing and filtering viewership statistics or “pagecounts” of Wikipedia web pages. The dataset organization leverages principles of graph databases that allow rapid and intuitive access to subgraphs of Wikipedia articles and categories. The dataset and deployment guidelines are available on the LTS2 website https://lts2.epfl.ch/Datasets/Wikipedia/.
Tasks
Published	2019-03-20
URL	http://arxiv.org/abs/1903.08597v1
PDF	http://arxiv.org/pdf/1903.08597v1.pdf
PWC	https://paperswithcode.com/paper/a-graph-structured-dataset-for-wikipedia
Repo
Framework

OD-GCN: Object Detection Boosted by Knowledge GCN


Title	OD-GCN: Object Detection Boosted by Knowledge GCN
Authors	Zheng Liu, Zidong Jiang, Wei Feng, Hui Feng
Abstract	Classical CNN based object detection methods only extract the objects’ image features, but do not consider the high-level relationship among objects in context. In this article, a graph convolutional network (GCN) is integrated into the object detection framework to exploit the benefit of category relationship among objects, which is able to provide extra confidence for any pre-trained object detection model in our framework. In experiments, we test several popular base detection models on the COCO dataset. The results show a promising improvement in mAP of 1-5 pp. In addition, visualized analysis reveals that the benchmark improvement is quite reasonable from a human perspective.
Tasks	Object Detection
Published	2019-08-06
URL	https://arxiv.org/abs/1908.04385v3
PDF	https://arxiv.org/pdf/1908.04385v3.pdf
PWC	https://paperswithcode.com/paper/od-gcn-object-detection-by-knowledge-graph
Repo
Framework

Experience-Embedded Visual Foresight


Title	Experience-Embedded Visual Foresight
Authors	Lin Yen-Chen, Maria Bauza, Phillip Isola
Abstract	Visual foresight gives an agent a window into the future, which it can use to anticipate events before they happen and plan strategic behavior. Although impressive results have been achieved on video prediction in constrained settings, these models fail to generalize when confronted with unfamiliar real-world objects. In this paper, we tackle the generalization problem via fast adaptation, where we train a prediction model to quickly adapt to the observed visual dynamics of a novel object. Our method, Experience-embedded Visual Foresight (EVF), jointly learns a fast adaptation module, which encodes observed trajectories of the new object into a vector embedding, and a visual prediction model, which conditions on this embedding to generate physically plausible predictions. For evaluation, we compare our method against baselines on video prediction and benchmark its utility on two real-world control tasks. We show that our method is able to quickly adapt to new visual dynamics and achieves lower error than the baselines when manipulating novel objects.
Tasks	Video Prediction
Published	2019-11-12
URL	https://arxiv.org/abs/1911.05071v2
PDF	https://arxiv.org/pdf/1911.05071v2.pdf
PWC	https://paperswithcode.com/paper/experience-embedded-visual-foresight
Repo
Framework

Entity Abstraction in Visual Model-Based Reinforcement Learning


Title	Entity Abstraction in Visual Model-Based Reinforcement Learning
Authors	Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine
Abstract	This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before. We present object-centric perception, prediction, and planning (OP3), which to the best of our knowledge is the first entity-centric dynamic latent variable framework for model-based reinforcement learning that acquires entity representations from raw visual observations without supervision and uses them to predict and plan. OP3 enforces entity-abstraction – symmetric processing of each entity representation with the same locally-scoped function – which enables it to scale to model different numbers and configurations of objects from those in training. Our approach to solving the key technical challenge of grounding these entity representations to actual objects in the environment is to frame this variable binding problem as an inference problem, and we develop an interactive inference algorithm that uses temporal continuity and interactive feedback to bind information about object properties to the entity variables. On block-stacking tasks, OP3 generalizes to novel block configurations and more objects than observed during training, outperforming an oracle model that assumes access to object supervision and achieving two to three times better accuracy than a state-of-the-art video prediction model.
Tasks	Video Prediction
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12827v3
PDF	https://arxiv.org/pdf/1910.12827v3.pdf
PWC	https://paperswithcode.com/paper/entity-abstraction-in-visual-model-based
Repo
Framework


Semi-Supervised Tensor Factorization for Node Classification in Complex Social Networks


Title	Semi-Supervised Tensor Factorization for Node Classification in Complex Social Networks
Authors	Georgios Katsimpras, Georgios Paliouras
Abstract	This paper proposes a method to guide tensor factorization, using class labels. Furthermore, it shows the advantages of using the proposed method in identifying nodes that play a special role in multi-relational networks, e.g. spammers. Most complex systems involve multiple types of relationships and interactions among entities. Combining information from different relationships may be crucial for various prediction tasks. Instead of creating distinct prediction models for each type of relationship, in this paper we present a tensor factorization approach based on RESCAL, which collectively exploits all existing relations. We extend RESCAL to produce a semi-supervised factorization method that combines a classification error term with the standard factor optimization process. The coupled optimization approach models the tensorial data, assimilating observed information from all the relations while also taking into account classification performance. Our evaluation on real-world social network data shows that incorporating supervision, when available, leads to models that are more accurate.
Tasks	Node Classification
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10416v1
PDF	https://arxiv.org/pdf/1907.10416v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-tensor-factorization-for-node
Repo
Framework
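
To make the coupled objective described in the abstract above more tangible, here is a minimal sketch. The reconstruction term follows the standard RESCAL model, in which each relation slice X_k is approximated by A R_k A^T with a shared node-factor matrix A. The classification term is an assumption made for illustration: a squared error of a hypothetical linear classifier w applied to the rows of A, weighted by a hypothetical trade-off parameter gamma. The abstract does not specify the actual form the authors use.

```python
def matmul(P, Q):
    """Plain-Python matrix product of two nested-list matrices."""
    return [
        [sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]


def transpose(P):
    return [list(row) for row in zip(*P)]


def rescal_reconstruction_loss(X_slices, A, R_slices):
    """Sum over relations k of the squared Frobenius norm of X_k - A R_k A^T."""
    A_t = transpose(A)
    loss = 0.0
    for X, R in zip(X_slices, R_slices):
        X_hat = matmul(matmul(A, R), A_t)
        loss += sum(
            (X[i][j] - X_hat[i][j]) ** 2
            for i in range(len(X))
            for j in range(len(X[0]))
        )
    return loss


def coupled_loss(X_slices, A, R_slices, labels, w, gamma=1.0):
    """Reconstruction loss plus gamma times a squared classification error.

    labels maps a node index to +1 or -1 (only labeled nodes appear);
    w is a hypothetical linear classifier over the latent dimensions.
    """
    clf_error = 0.0
    for node, y in labels.items():
        score = sum(a * b for a, b in zip(A[node], w))
        clf_error += (y - score) ** 2
    return rescal_reconstruction_loss(X_slices, A, R_slices) + gamma * clf_error


# Two nodes, one relation, latent dimension 2 (toy values).
A = [[1.0, 0.0], [0.0, 1.0]]
R_slices = [[[0.0, 1.0], [0.0, 0.0]]]
X_slices = [[[0.0, 1.0], [0.0, 0.0]]]
labels = {0: 1, 1: -1}

perfect = coupled_loss(X_slices, A, R_slices, labels, w=[1.0, -1.0])
untrained = coupled_loss(X_slices, A, R_slices, labels, w=[0.0, 0.0], gamma=0.5)
```

Because both terms share A, minimizing the sum pulls the node factors toward values that both reconstruct the relations and separate the labeled classes, which is the coupling the abstract describes.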

Solving large-scale L1-regularized SVMs and cousins: the surprising effectiveness of column and constraint generation


Title	Solving large-scale L1-regularized SVMs and cousins: the surprising effectiveness of column and constraint generation
Authors	Antoine Dedieu, Rahul Mazumder
Abstract	The linear Support Vector Machine (SVM) is one of the most popular binary classification techniques in machine learning. Motivated by applications in modern high dimensional statistics, we consider penalized SVM problems involving the minimization of a hinge-loss function with a convex sparsity-inducing regularizer such as: the L1-norm on the coefficients, its grouped generalization and the sorted L1-penalty (aka Slope). Each problem can be expressed as a Linear Program (LP) and is computationally challenging when the number of features and/or samples is large; the current state of algorithms for these problems is rather nascent when compared to the usual L2-regularized linear SVM. To this end, we propose new computational algorithms for these LPs by bringing together techniques from (a) classical column (and constraint) generation methods and (b) first order methods for non-smooth convex optimization, techniques that are rarely used together for solving large scale LPs. These components have their respective strengths; and while they are found to be useful as separate entities, they have not been used together in the context of solving large scale LPs such as the ones studied herein. Our approach complements the strengths of (a) and (b), leading to a scheme that seems to outperform commercial solvers as well as specialized implementations for these problems by orders of magnitude. We present numerical results on a series of real and synthetic datasets demonstrating the surprising effectiveness of classic column/constraint generation methods in the context of challenging LP-based machine learning tasks.
Tasks
Published	2019-01-06
URL	http://arxiv.org/abs/1901.01585v1
PDF	http://arxiv.org/pdf/1901.01585v1.pdf
PWC	https://paperswithcode.com/paper/solving-large-scale-l1-regularized-svms-and
Repo
Framework
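
The statement in the abstract above that the L1-regularized hinge-loss problem can be expressed as a Linear Program can be written out with the standard textbook reformulation. The notation is an assumption of this sketch, not taken from the paper: n samples (x_i, y_i) with y_i in {-1, +1}, p features, regularization weight lambda, slack variables xi_i for the hinge loss, and the coefficient vector split into nonnegative parts beta = beta^+ - beta^-.

```latex
\begin{aligned}
\min_{\beta^{+},\,\beta^{-},\,\beta_0,\,\xi}\quad
  & \sum_{i=1}^{n} \xi_i \;+\; \lambda \sum_{j=1}^{p} \left(\beta_j^{+} + \beta_j^{-}\right) \\
\text{s.t.}\quad
  & \xi_i \;\ge\; 1 - y_i\left(x_i^{\top}\left(\beta^{+} - \beta^{-}\right) + \beta_0\right),
    \quad i = 1,\dots,n, \\
  & \xi_i \ge 0, \quad \beta_j^{+} \ge 0, \quad \beta_j^{-} \ge 0 .
\end{aligned}
```

At an optimum each xi_i equals the hinge loss max(0, 1 - y_i(x_i^T beta + beta_0)) and beta_j^+ + beta_j^- equals |beta_j|, so the LP matches the original problem. The program has one constraint per sample and two columns per feature, which suggests why generating columns and constraints on demand is attractive when n or p is large.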

Multitask and Transfer Learning for Autotuning Exascale Applications


Title	Multitask and Transfer Learning for Autotuning Exascale Applications
Authors	Wissam M. Sid-Lakhdar, Mohsen Mahmoudi Aznaveh, Xiaoye S. Li, James W. Demmel
Abstract	Multitask learning and transfer learning have proven to be useful in the field of machine learning when additional knowledge is available to help a prediction task. We aim at deriving methods following these paradigms for use in autotuning, where the goal is to find the optimal performance parameters of an application treated as a black-box function. We show comparative results with state-of-the-art autotuning techniques. For instance, we observe an average 1.5x improvement of the application runtime compared to the OpenTuner and HpBandSter autotuners. We explain how our approaches can be more suitable than some state-of-the-art autotuners for the tuning of any application in general and of expensive exascale applications in particular.
Tasks	Transfer Learning
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05792v1
PDF	https://arxiv.org/pdf/1908.05792v1.pdf
PWC	https://paperswithcode.com/paper/multitask-and-transfer-learning-for
Repo
Framework

Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks


Title	Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Authors	Trapit Bansal, Rishikesh Jha, Andrew McCallum
Abstract	Self-supervised pre-training of transformer models has shown enormous success in improving performance on a number of downstream tasks. However, fine-tuning on a new task still requires large amounts of task-specific labelled data to achieve good performance. We consider this problem of learning to generalize to new tasks with few examples as a meta-learning problem. While meta-learning has shown tremendous progress in recent years, its application is still limited to simulated problems or problems with limited diversity across tasks. We develop a novel method, LEOPARD, which enables optimization-based meta-learning across tasks with different numbers of classes, and evaluate different methods on generalization to diverse NLP classification tasks. LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label. Across 17 NLP tasks, including diverse domains of entity typing, natural language inference, sentiment analysis, and several other text classification tasks, we show that LEOPARD learns better initial parameters for few-shot learning than self-supervised pre-training or multi-task training, outperforming many strong baselines, for example, yielding a 14.5% average relative gain in accuracy on unseen tasks with only 4 examples per label.
Tasks	Entity Typing, Few-Shot Learning, Meta-Learning, Natural Language Inference, Relation Extraction, Sentiment Analysis, Text Categorization, Text Classification
Published	2019-11-10
URL	https://arxiv.org/abs/1911.03863v2
PDF	https://arxiv.org/pdf/1911.03863v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-few-shot-learn-across-diverse
Repo
Framework