Paper Group ANR 390
SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection
Title | SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection |
Authors | Lewei Yao, Hang Xu, Wei Zhang, Xiaodan Liang, Zhenguo Li |
Abstract | The state-of-the-art object detection method is complicated with various modules such as backbone, feature fusion neck, RPN and RCNN head, where each module may have different designs and structures. How to leverage the computational cost and accuracy trade-off for the structural combination as well as the modular selection of multiple modules? Neural architecture search (NAS) has shown great potential in finding an optimal solution. Existing NAS works for object detection only focus on searching better design of a single module such as backbone or feature fusion neck, while neglecting the balance of the whole system. In this paper, we present a two-stage coarse-to-fine searching strategy named Structural-to-Modular NAS (SM-NAS) for searching a GPU-friendly design of both an efficient combination of modules and better modular-level architecture for object detection. Specifically, Structural-level searching stage first aims to find an efficient combination of different modules; Modular-level searching stage then evolves each specific module and pushes the Pareto front forward to a faster task-specific network. We consider a multi-objective search where the search space covers many popular designs of detection methods. We directly search a detection backbone without pre-trained models or any proxy task by exploring a fast training from scratch strategy. The resulting architectures dominate state-of-the-art object detection systems in both inference time and accuracy and demonstrate the effectiveness on multiple detection datasets, e.g. halving the inference time with additional 1% mAP improvement compared to FPN and reaching 46% mAP with the similar inference time of MaskRCNN. |
Tasks | Neural Architecture Search, Object Detection |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09929v2 |
https://arxiv.org/pdf/1911.09929v2.pdf | |
PWC | https://paperswithcode.com/paper/sm-nas-structural-to-modular-neural |
Repo | |
Framework | |
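The SM-NAS abstract above describes a multi-objective search that pushes a Pareto front over inference time and accuracy. As a minimal sketch of what "Pareto front" means here (not the authors' search procedure), the snippet below filters a list of candidates down to the non-dominated ones. The candidate names and the (time, mAP) numbers are hypothetical and are not results from the paper.

```python
from typing import List, Tuple

# (name, inference_time_ms, mAP) -- hypothetical values for illustration only.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both objectives.

    A candidate is dominated if another one is at least as fast AND at least
    as accurate, and strictly better on one of the two.
    """
    front = []
    for name, t, acc in candidates:
        dominated = any(
            (t2 <= t and acc2 >= acc) and (t2 < t or acc2 > acc)
            for _, t2, acc2 in candidates
        )
        if not dominated:
            front.append((name, t, acc))
    # Sort by inference time so the front reads from fastest to slowest.
    return sorted(front, key=lambda c: c[1])


candidates = [
    ("A", 40.0, 38.0),
    ("B", 60.0, 42.0),
    ("C", 65.0, 41.0),  # slower and less accurate than B
    ("D", 90.0, 45.0),
    ("E", 45.0, 37.5),  # slower and less accurate than A
]
front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
print(front_names)
```

In this toy setting, "pushing the front forward" would mean finding new candidates that dominate some of the points currently on the front.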
Reinforcement Learning with Non-uniform State Representations for Adaptive Search
Title | Reinforcement Learning with Non-uniform State Representations for Adaptive Search |
Authors | Sandeep Manjanna, Herke van Hoof, Gregory Dudek |
Abstract | Efficient spatial exploration is a key aspect of search and rescue. In this paper, we present a search algorithm that generates efficient trajectories that optimize the rate at which probability mass is covered by a searcher. This should allow an autonomous vehicle to find one or more lost targets as rapidly as possible. We do this by performing non-uniform sampling of the search region. The path generated minimizes the expected time to locate the missing target by visiting high-probability regions using non-myopic path generation based on reinforcement learning. We model the target probability distribution using a classic mixture-of-Gaussians model with means and mixture coefficients tuned according to the location and time of sightings of the lost target. Key features of our search algorithm are the ability to employ a very general non-deterministic action model and the ability to generate action plans for any new probability distribution using the parameters learned on other similar-looking distributions. One of the key contributions of this paper is the use of non-uniform state aggregation for policy search in the context of robotics. |
Tasks | |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06588v1 |
https://arxiv.org/pdf/1906.06588v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-with-non-uniform-state |
Repo | |
Framework | |
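The abstract above models the target's location with a mixture of Gaussians over the search region. The sketch below shows one way such a probability map could be discretized onto a grid. It is an illustration only: the isotropic components, the grid size, and the component parameters are assumptions, and it does not cover the paper's non-uniform state aggregation or the reinforcement-learning planner.

```python
import math
from typing import List, Tuple

# (weight, mean_x, mean_y, sigma) for an isotropic 2D Gaussian component.
# All values are hypothetical; the paper tunes means and weights from sightings.
Component = Tuple[float, float, float, float]


def mixture_density(x: float, y: float, components: List[Component]) -> float:
    """Evaluate the mixture density at point (x, y)."""
    total = 0.0
    for weight, mean_x, mean_y, sigma in components:
        dist_sq = (x - mean_x) ** 2 + (y - mean_y) ** 2
        norm = 2.0 * math.pi * sigma * sigma
        total += weight * math.exp(-dist_sq / (2.0 * sigma * sigma)) / norm
    return total


def probability_grid(components: List[Component], size: int) -> List[List[float]]:
    """Evaluate the density at each cell centre and normalize to sum to 1."""
    grid = [
        [mixture_density(col + 0.5, row + 0.5, components) for col in range(size)]
        for row in range(size)
    ]
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]


components = [(0.7, 2.5, 2.5, 1.0), (0.3, 7.5, 6.5, 1.5)]
grid = probability_grid(components, 10)

# The cell holding the most probability mass is a natural first place to look.
best_cell = max(
    ((row, col) for row in range(10) for col in range(10)),
    key=lambda rc: grid[rc[0]][rc[1]],
)
print(best_cell)
```

A searcher that greedily visits the highest-mass cell would be myopic; the abstract's point is that a learned policy can plan beyond that single step.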
Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space
Title | Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space |
Authors | Keizo Kato, Jing Zhou, Tomotake Sasaki, Akira Nakagawa |
Abstract | To analyze high-dimensional and complex data in the real world, the generative-model approach to machine learning aims to reduce the dimension and acquire a probabilistic model of the data. For this purpose, deep-autoencoder-based generative models such as the variational autoencoder (VAE) have been proposed. However, in previous works, the scale of metrics between the real space and the reduced-dimensional space (latent space) is not well controlled. Therefore, the quantitative impact of the latent variable on real data is unclear. As a result, the probability distribution function (PDF) in the real space cannot be estimated accurately from that of the latent space. To overcome this problem, we propose a Rate-Distortion Optimization guided autoencoder. We show theoretically and experimentally that our method has the following properties: (i) the columns of the Jacobian matrix between the two spaces form a constantly scaled orthonormal system, and data can be embedded in a Euclidean space isometrically; (ii) the PDF of the latent space is proportional to that of the real space. Furthermore, to verify its usefulness in practical applications, we evaluate its performance in unsupervised anomaly detection, where it outperforms current state-of-the-art methods. |
Tasks | Anomaly Detection, Unsupervised Anomaly Detection |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04329v2 |
https://arxiv.org/pdf/1910.04329v2.pdf | |
PWC | https://paperswithcode.com/paper/rate-distortion-optimization-guided-1 |
Repo | |
Framework | |
Learning to Learn Semantic Parsers from Natural Language Supervision
Title | Learning to Learn Semantic Parsers from Natural Language Supervision |
Authors | Igor Labutov, Bishan Yang, Tom Mitchell |
Abstract | As humans, we often rely on language to learn language. For example, when corrected in a conversation, we may learn from that correction, over time improving our language fluency. Inspired by this observation, we propose a learning algorithm for training semantic parsers from supervision (feedback) expressed in natural language. Our algorithm learns a semantic parser from users’ corrections such as “no, what I really meant was before his job, not after”, by also simultaneously learning to parse this natural language feedback in order to leverage it as a form of supervision. Unlike supervision with gold-standard logical forms, our method does not require the user to be familiar with the underlying logical formalism, and unlike supervision from denotation, it does not require the user to know the correct answer to their query. This makes our learning algorithm naturally scalable in settings where existing conversational logs are available and can be leveraged as training data. We construct a novel dataset of natural language feedback in a conversational setting, and show that our method is effective at learning a semantic parser from such natural language supervision. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08373v1 |
http://arxiv.org/pdf/1902.08373v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-learn-semantic-parsers-from |
Repo | |
Framework | |
Counterfactual diagnosis
Title | Counterfactual diagnosis |
Authors | Jonathan G. Richens, Ciaran M. Lee, Saurabh Johri |
Abstract | Machine learning promises to revolutionize clinical decision making and diagnosis. In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them. However, existing diagnostic algorithms are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms and medical history. We show that this inability to disentangle correlation from causation can result in sub-optimal or dangerous diagnoses. To overcome this, we reformulate diagnosis as a counterfactual inference task and derive new counterfactual diagnostic algorithms. We show that this approach is closer to the diagnostic reasoning of clinicians and significantly improves the accuracy and safety of the resulting diagnoses. We compare our counterfactual algorithm to the standard Bayesian diagnostic algorithm and a cohort of 44 doctors using a test set of clinical vignettes. While the Bayesian algorithm achieves an accuracy comparable to the average doctor, placing in the top 48% of doctors in our cohort, our counterfactual algorithm places in the top 25% of doctors, achieving expert clinical accuracy. This improvement is achieved simply by changing how we query our model, without requiring any additional model improvements. Our results show that counterfactual reasoning is a vital missing ingredient for applying machine learning to medical diagnosis. |
Tasks | Counterfactual Inference, Decision Making, Medical Diagnosis |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06772v3 |
https://arxiv.org/pdf/1910.06772v3.pdf | |
PWC | https://paperswithcode.com/paper/counterfactual-diagnosis |
Repo | |
Framework | |
Enhancing Prediction Models for One-Year Mortality in Patients with Acute Myocardial Infarction and Post Myocardial Infarction Syndrome
Title | Enhancing Prediction Models for One-Year Mortality in Patients with Acute Myocardial Infarction and Post Myocardial Infarction Syndrome |
Authors | Seyedeh Neelufar Payrovnaziri, Laura A. Barrett, Daniel Bis, Jiang Bian, Zhe He |
Abstract | Predicting the risk of mortality for patients with acute myocardial infarction (AMI) using electronic health records (EHRs) data can help identify risky patients who might need more tailored care. In our previous work, we built computational models to predict one-year mortality of patients admitted to an intensive care unit (ICU) with AMI or post myocardial infarction syndrome. Our prior work only used the structured clinical data from MIMIC-III, a publicly available ICU clinical database. In this study, we enhanced our work by adding the word embedding features from free-text discharge summaries. Using a richer set of features resulted in significant improvement in the performance of our deep learning models. The average accuracy of our deep learning models was 92.89% and the average F-measure was 0.928. We further reported the impact of different combinations of features extracted from structured and/or unstructured data on the performance of the deep learning models. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12383v1 |
http://arxiv.org/pdf/1904.12383v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-prediction-models-for-one-year |
Repo | |
Framework | |
Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks
Title | Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks |
Authors | Alvin Chan, Yew-Soon Ong |
Abstract | Deep learning models have recently been shown to be vulnerable to backdoor poisoning, an insidious attack where the victim model predicts clean images correctly but classifies the same images as the target class when a trigger poison pattern is added. This poison pattern can be embedded in the training dataset by the adversary. Existing defenses are effective under certain conditions, such as a small poison pattern, knowledge of the ratio of poisoned training samples, or the availability of a validated clean dataset. Since a defender may not have such prior knowledge or resources, we propose a defense against backdoor poisoning that is effective even when those prerequisites are not met. It is made up of several parts that extract a backdoor poison signal, detect poison target and base classes, and filter out poisoned from clean samples with proven guarantees. The final part of our defense involves retraining the poisoned model on a dataset augmented with the extracted poison signal and corrective relabeling of poisoned samples to neutralize the backdoor. Our approach has been shown to be effective in defending against backdoor attacks that use both small and large-sized poison patterns on nine different target-base class pairs from the CIFAR10 dataset. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08040v1 |
https://arxiv.org/pdf/1911.08040v1.pdf | |
PWC | https://paperswithcode.com/paper/poison-as-a-cure-detecting-neutralizing |
Repo | |
Framework | |
Local Trend Inconsistency: A Prediction-driven Approach to Unsupervised Anomaly Detection in Multi-seasonal Time Series
Title | Local Trend Inconsistency: A Prediction-driven Approach to Unsupervised Anomaly Detection in Multi-seasonal Time Series |
Authors | Wentai Wu, Ligang He, Weiwei Lin |
Abstract | On-line detection of anomalies in time series is a key technique in various event-sensitive scenarios such as robotic system monitoring, smart sensor networks and data center security. However, the increasing diversity of data sources and demands is making this task more challenging than ever. First, the rapid increase of unlabeled data makes supervised learning unsuitable in many cases. Second, a large portion of time series have complex seasonality features. Third, on-line anomaly detection needs to be fast and reliable. In view of this, we adopt an unsupervised prediction-driven approach on the basis of a backbone model combining a series decomposition part and an inference part. We then propose a novel metric, Local Trend Inconsistency (LTI), along with a detection algorithm that efficiently computes LTI chronologically along the series and marks each data point with a score indicating its probability of being anomalous. We experimentally evaluated our algorithm on datasets from the UCI public repository and a production environment. The results show that our scheme outperforms several representative anomaly detection algorithms in the Area Under Curve (AUC) metric with decent time efficiency. |
Tasks | Anomaly Detection, Time Series, Unsupervised Anomaly Detection |
Published | 2019-08-03 |
URL | https://arxiv.org/abs/1908.01146v1 |
https://arxiv.org/pdf/1908.01146v1.pdf | |
PWC | https://paperswithcode.com/paper/local-trend-inconsistency-a-prediction-driven |
Repo | |
Framework | |
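The abstract above evaluates per-point anomaly scores with the Area Under Curve (AUC) metric. As a minimal sketch of how AUC can be computed from scores and ground-truth labels, the snippet below uses the pairwise-ranking definition: the fraction of (anomalous, normal) pairs in which the anomalous point receives the higher score, with ties counted as half. The scores and labels are made up for illustration and have nothing to do with the paper's LTI values.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for anomalous points, 0 for normal points.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical anomaly scores and ground-truth labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
result = auc(scores, labels)
print(result)
```

The pairwise form is quadratic in the number of points; it is used here because it makes the definition explicit, not because it is efficient.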
Low-Latency Speaker-Independent Continuous Speech Separation
Title | Low-Latency Speaker-Independent Continuous Speech Separation |
Authors | Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis |
Abstract | Speaker-independent continuous speech separation (SI-CSS) is the task of converting a continuous audio stream, which may contain overlapping voices of unknown speakers, into a fixed number of continuous signals, each of which contains no overlapping speech segment. A separated, or cleaned, version of each utterance is generated from one of SI-CSS’s output channels nondeterministically without being split up and distributed to multiple channels. A typical application scenario is transcribing multi-party conversations, such as meetings, recorded with microphone arrays. The output signals can be simply sent to a speech recognition engine because they do not include speech overlaps. The previous SI-CSS method uses a neural network trained with permutation invariant training and a data-driven beamformer and thus incurs considerable processing latency. This paper proposes a low-latency SI-CSS method whose performance is comparable to that of the previous method in a microphone array-based meeting transcription task. This is achieved (1) by using a new speech separation network architecture combined with a double buffering scheme and (2) by performing enhancement with a set of fixed beamformers followed by a neural post-filter. |
Tasks | Speech Recognition, Speech Separation |
Published | 2019-04-13 |
URL | http://arxiv.org/abs/1904.06478v1 |
http://arxiv.org/pdf/1904.06478v1.pdf | |
PWC | https://paperswithcode.com/paper/low-latency-speaker-independent-continuous |
Repo | |
Framework | |
Bandwidth Slicing to Boost Federated Learning in Edge Computing
Title | Bandwidth Slicing to Boost Federated Learning in Edge Computing |
Authors | Jun Li, Xiaoman Shen, Lei Chen, Jiajia Chen |
Abstract | Bandwidth slicing is introduced to support federated learning in edge computing to assure low communication delay for training traffic. Results reveal that bandwidth slicing significantly improves training efficiency while achieving good learning accuracy. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1911.07615v1 |
https://arxiv.org/pdf/1911.07615v1.pdf | |
PWC | https://paperswithcode.com/paper/bandwidth-slicing-to-boost-federated-learning |
Repo | |
Framework | |
OCKELM+: Kernel Extreme Learning Machine based One-class Classification using Privileged Information (or KOC+: Kernel Ridge Regression or Least Square SVM with zero bias based One-class Classification using Privileged Information)
Title | OCKELM+: Kernel Extreme Learning Machine based One-class Classification using Privileged Information (or KOC+: Kernel Ridge Regression or Least Square SVM with zero bias based One-class Classification using Privileged Information) |
Authors | Chandan Gautam, Aruna Tiwari, M. Tanveer |
Abstract | Kernel method-based one-class classifiers are mainly used for outlier or novelty detection. In this letter, the kernel ridge regression (KRR) based one-class classifier (KOC) is extended for learning using privileged information (LUPI). The LUPI-based KOC method is referred to as KOC+. This privileged information is available as a feature with the dataset but only for training (not for testing). KOC+ utilizes the privileged information differently from normal feature information by using a so-called correction function. Privileged information helps KOC+ achieve better generalization performance, which is shown in this letter by testing the classifiers with and without privileged information. Existing and proposed classifiers are evaluated on datasets from the UCI machine learning repository and on the MNIST dataset. Moreover, experimental results demonstrate the advantage of KOC+ over KOC and support vector machine (SVM) based one-class classifiers. |
Tasks | One-class classifier |
Published | 2019-04-13 |
URL | http://arxiv.org/abs/1904.08338v1 |
http://arxiv.org/pdf/1904.08338v1.pdf | |
PWC | https://paperswithcode.com/paper/190408338 |
Repo | |
Framework | |
Ordinal Distribution Regression for Gait-based Age Estimation
Title | Ordinal Distribution Regression for Gait-based Age Estimation |
Authors | Haiping Zhu, Yuheng Zhang, Guohao Li, Junping Zhang, Hongming Shan |
Abstract | Computer vision researchers prefer to estimate age from face images because facial features provide useful information. However, estimating age from face images becomes challenging when people are distant from the camera or occluded. A person’s gait is a unique biometric feature that can be perceived efficiently even at a distance. Thus, gait can be used to predict age when face images are not available. However, existing gait-based classification or regression methods ignore the ordinal relationship of different ages, which is an important clue for age estimation. This paper proposes an ordinal distribution regression with a global and local convolutional neural network for gait-based age estimation. Specifically, we decompose gait-based age regression into a series of binary classifications to incorporate the ordinal age information. Then, an ordinal distribution loss is proposed to consider the inner relationships among these classifications by penalizing the distribution discrepancy between the estimated value and the ground truth. In addition, our neural network comprises a global and three local sub-networks, and thus, is capable of learning the global structure and local details from the head, body, and feet. Experimental results indicate that the proposed approach outperforms state-of-the-art gait-based age estimation methods on the OULP-Age dataset. |
Tasks | Age Estimation |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11005v4 |
https://arxiv.org/pdf/1905.11005v4.pdf | |
PWC | https://paperswithcode.com/paper/ordinal-distribution-regression-for-gait |
Repo | |
Framework | |
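The abstract above decomposes age regression into a series of binary classifications so that the ordering of ages is preserved. The sketch below illustrates that decomposition in its common generic form: the k-th binary target answers "is the age greater than k?", and a prediction is decoded by counting how many of those classifiers fire. The function names, the age range, and the 0.5 threshold are assumptions for illustration; the paper's ordinal distribution loss and its global/local network are not reproduced here.

```python
from typing import List


def encode_ordinal(age: int, max_age: int) -> List[int]:
    """Turn an integer age into binary targets, one per threshold k.

    Entry k is 1 if age > k, else 0, for k = 0 .. max_age - 1.
    """
    return [1 if age > k else 0 for k in range(max_age)]


def decode_ordinal(probs: List[float], threshold: float = 0.5) -> int:
    """Recover an age estimate by counting classifiers that predict 'age > k'."""
    return sum(1 for p in probs if p > threshold)


# Age 3 with thresholds 0..5 gives three ones followed by three zeros.
targets = encode_ordinal(3, 6)
print(targets)

# Hypothetical classifier outputs for the six thresholds.
decoded = decode_ordinal([0.9, 0.8, 0.7, 0.4, 0.2, 0.1])
print(decoded)
```

Because neighbouring ages share most of their binary targets, an error of one year changes a single target, whereas a plain multi-class label would treat every wrong age as equally wrong.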
Abnormal Chest X-ray Identification With Generative Adversarial One-Class Classifier
Title | Abnormal Chest X-ray Identification With Generative Adversarial One-Class Classifier |
Authors | Yuxing Tang, Youbao Tang, Mei Han, Jing Xiao, Ronald M. Summers |
Abstract | Being one of the most common diagnostic imaging tests, chest radiography requires timely reporting of potential findings in the images. In this paper, we propose an end-to-end architecture for abnormal chest X-ray identification using generative adversarial one-class learning. Unlike previous approaches, our method takes only normal chest X-ray images as input. The architecture is composed of three deep neural networks, each of which learns by competing while collaborating with the others to model the underlying content structure of normal chest X-rays. Given a chest X-ray image in the testing phase, if it is normal, the learned architecture can model and reconstruct the content well; if it is abnormal, since the content is unseen in the training phase, the model performs poorly in its reconstruction. This makes it possible to distinguish abnormal chest X-rays from normal ones. Quantitative and qualitative experiments demonstrate the effectiveness and efficiency of our approach, where an AUC of 0.841 is achieved on the challenging NIH Chest X-ray dataset in a one-class learning setting, with the potential to reduce the workload for radiologists. |
Tasks | One-class classifier |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.02040v1 |
http://arxiv.org/pdf/1903.02040v1.pdf | |
PWC | https://paperswithcode.com/paper/abnormal-chest-x-ray-identification-with |
Repo | |
Framework | |
Representing ill-known parts of a numerical model using a machine learning approach
Title | Representing ill-known parts of a numerical model using a machine learning approach |
Authors | Julien Brajard, Anastase Charantonis, Jérôme Sirven |
Abstract | In numerical modeling of the Earth System, many processes remain unknown or poorly represented (for example, sub-grid processes, the dependence on unknown latent variables, or the non-inclusion of complex dynamics in numerical models) but can sometimes be observed. This paper proposes a methodology to produce a hybrid model combining a physics-based model (forecasting the well-known processes) with a neural-net model trained from observations (forecasting the remaining processes). The approach is applied to a shallow-water model in which the forcing, dissipative and diffusive terms are assumed to be unknown. We show that the hybrid model is able to reproduce the unknown terms with great accuracy (correlation close to 1). For long-term simulations it reproduces with no significant difference the mean state, the kinetic energy, the potential energy and the potential vorticity of the system. Lastly, it is able to function with new forcings that were not encountered during the training phase of the neural network. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07358v1 |
http://arxiv.org/pdf/1903.07358v1.pdf | |
PWC | https://paperswithcode.com/paper/representing-ill-known-parts-of-a-numerical |
Repo | |
Framework | |
Movie Plot Analysis via Turning Point Identification
Title | Movie Plot Analysis via Turning Point Identification |
Authors | Pinelopi Papalampidi, Frank Keller, Mirella Lapata |
Abstract | According to screenwriting theory, turning points (e.g., change of plans, major setback, climax) are crucial narrative moments within a screenplay: they define the plot structure, determine its progression and segment the screenplay into thematic units (e.g., setup, complications, aftermath). We propose the task of turning point identification in movies as a means of analyzing their narrative structure. We argue that turning points and the segmentation they provide can facilitate processing long, complex narratives, such as screenplays, for summarization and question answering. We introduce a dataset consisting of screenplays and plot synopses annotated with turning points and present an end-to-end neural network model that identifies turning points in plot synopses and projects them onto scenes in screenplays. Our model outperforms strong baselines based on state-of-the-art sentence representations and the expected position of turning points. |
Tasks | Question Answering |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10328v2 |
https://arxiv.org/pdf/1908.10328v2.pdf | |
PWC | https://paperswithcode.com/paper/movie-plot-analysis-via-turning-point |
Repo | |
Framework | |