October 18, 2019

Paper Group ANR 672

Automating Motion Correction in Multishot MRI Using Generative Adversarial Networks. CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments. Can Deep Learning Outperform Modern Commercial CT Image Reconstruction Methods?. A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation. Are You Sure You Want To …

Automating Motion Correction in Multishot MRI Using Generative Adversarial Networks

Title Automating Motion Correction in Multishot MRI Using Generative Adversarial Networks
Authors Siddique Latif, Muhammad Asim, Muhammad Usman, Junaid Qadir, Rajib Rana
Abstract Multishot Magnetic Resonance Imaging (MRI) has recently gained popularity as it accelerates the MRI data acquisition process without compromising the quality of the final MR image. However, it suffers from motion artifacts caused by patient movements, which may lead to misdiagnosis. Modern state-of-the-art motion correction techniques are able to counter small-degree motion; however, their adoption is hindered by their time complexity. This paper proposes a Generative Adversarial Network (GAN) for reconstructing motion-free high-fidelity images while reducing the image reconstruction time by an impressive two orders of magnitude.
Tasks Image Reconstruction, Motion Correction In Multishot MRI
Published 2018-11-24
URL http://arxiv.org/abs/1811.09750v1
PDF http://arxiv.org/pdf/1811.09750v1.pdf
PWC https://paperswithcode.com/paper/automating-motion-correction-in-multishot-mri
Repo
Framework

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments

Title CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments
Authors Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, Tetsuya Ogata
Abstract Casual conversations involving multiple speakers and noises from surrounding devices are common in everyday environments, which degrades the performance of automatic speech recognition systems. These challenging characteristics of environments are the target of the CHiME-5 challenge. By employing a convolutional neural network (CNN)-based multichannel end-to-end speech recognition system, this study attempts to overcome the difficulties present in everyday environments. The system comprises an attention-based encoder-decoder neural network that directly generates text as an output from a sound input. The multichannel CNN encoder, which uses residual connections and batch renormalization, is trained with augmented data, including white noise injection. The experimental results show that the word error rate is reduced by 8.5% and 0.6% absolute relative to a single-channel end-to-end system and the best baseline (LF-MMI TDNN) on the CHiME-5 corpus, respectively.
Tasks End-To-End Speech Recognition, Speech Recognition
Published 2018-11-07
URL https://arxiv.org/abs/1811.02735v3
PDF https://arxiv.org/pdf/1811.02735v3.pdf
PWC https://paperswithcode.com/paper/cnn-based-multichannel-end-to-end-speech
Repo
Framework
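
The abstract above mentions training with white noise injection as data augmentation. As a rough illustration only, the sketch below adds Gaussian white noise to a waveform at a chosen signal-to-noise ratio. The function name `add_white_noise`, the `snr_db` parameter, and the choice of Gaussian noise scaled to a target SNR are assumptions made for this example; the paper's actual augmentation settings are not given in the abstract.

```python
import math
import random


def add_white_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with Gaussian white noise at `snr_db` dB SNR.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not signal:
        return []
    if rng is None:
        rng = random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    # Zero-mean Gaussian noise with variance equal to the target noise power.
    return [x + rng.gauss(0.0, sigma) for x in signal]


if __name__ == "__main__":
    # A 5 Hz sine sampled at 1 kHz stands in for a speech waveform.
    clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(2000)]
    noisy = add_white_noise(clean, snr_db=10.0, rng=random.Random(0))
    print(len(noisy))
```

In a training pipeline one would typically draw `snr_db` at random per utterance so the model sees a range of noise levels; that policy is likewise an assumption here.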

Can Deep Learning Outperform Modern Commercial CT Image Reconstruction Methods?

Title Can Deep Learning Outperform Modern Commercial CT Image Reconstruction Methods?
Authors Hongming Shan, Atul Padole, Fatemeh Homayounieh, Uwe Kruger, Ruhani Doda Khera, Chayanin Nitiwarangkul, Mannudeep K. Kalra, Ge Wang
Abstract Commercial iterative reconstruction techniques on modern CT scanners target radiation dose reduction but there are lingering concerns over their impact on image appearance and low contrast detectability. Recently, machine learning, especially deep learning, has been actively investigated for CT. Here we design a novel neural network architecture for low-dose CT (LDCT) and compare it with commercial iterative reconstruction methods used for standard of care CT. While popular neural networks are trained for end-to-end mapping, driven by big data, our novel neural network is intended for end-to-process mapping so that intermediate image targets are obtained with the associated search gradients along which the final image targets are gradually reached. This learned dynamic process allows radiologists to be included in the training loop to optimize the LDCT denoising workflow in a task-specific fashion with the denoising depth as a key parameter. Our progressive denoising network was trained with the Mayo LDCT Challenge Dataset, and tested on images of the chest and abdominal regions scanned on the CT scanners made by three leading CT vendors. The best deep learning based reconstructions are systematically compared to the best iterative reconstructions in a double-blinded reader study. It is found that our deep learning approach performs either comparably or favorably in terms of noise suppression and structural fidelity, and runs orders of magnitude faster than the commercial iterative CT reconstruction algorithms.
Tasks Denoising, Image Reconstruction
Published 2018-11-08
URL http://arxiv.org/abs/1811.03691v1
PDF http://arxiv.org/pdf/1811.03691v1.pdf
PWC https://paperswithcode.com/paper/can-deep-learning-outperform-modern
Repo
Framework

A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation

Title A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation
Authors Leonardo Gabrielli, Stefano Tomassetti, Stefano Squartini, Carlo Zinato, Stefano Guaiana
Abstract One of the challenges in computational acoustics is the identification of models that can simulate and predict the physical behavior of a system generating an acoustic signal. Whenever such models are used for commercial applications, an additional constraint is the time-to-market, making automation of the sound design process desirable. In previous works, a computational sound design approach has been proposed for the parameter estimation problem involving timbre matching by deep learning, which was applied to the synthesis of pipe organ tones. In this work we refine previous results by embedding the former approach in a multi-stage algorithm that also adds heuristics and a stochastic optimization method operating on objective cost functions based on psychoacoustics. The optimization method is shown to refine the first estimate given by the deep learning approach and substantially improve the objective metrics, with the additional benefit of reducing the sound design process time. Subjective listening tests are also conducted to gather additional insights on the results.
Tasks Stochastic Optimization
Published 2018-09-14
URL http://arxiv.org/abs/1809.05483v2
PDF http://arxiv.org/pdf/1809.05483v2.pdf
PWC https://paperswithcode.com/paper/a-multi-stage-algorithm-for-acoustic-physical
Repo
Framework

Are You Sure You Want To Do That? Classification with Verification

Title Are You Sure You Want To Do That? Classification with Verification
Authors Harris Chan, Atef Chaudhury, Kevin Shen
Abstract Classification systems typically act in isolation, meaning they are required to implicitly memorize the characteristics of all candidate classes in order to classify. The cost of this is increased memory usage and poor sample efficiency. We propose a model which instead verifies using reference images during the classification process, reducing the burden of memorization. The model uses iterative nondifferentiable queries in order to classify an image. We demonstrate that such a model is feasible to train and can match baseline accuracy while being more parameter efficient. However, we show that finding the correct balance between image recognition and verification is essential to pushing the model towards desired behavior, suggesting that a pipeline of recognition followed by verification is a more promising approach.
Tasks
Published 2018-09-07
URL http://arxiv.org/abs/1809.02652v2
PDF http://arxiv.org/pdf/1809.02652v2.pdf
PWC https://paperswithcode.com/paper/are-you-sure-you-want-to-do-that
Repo
Framework

A Generation Method of Immunological Memory in Clonal Selection Algorithm by using Restricted Boltzmann Machines

Title A Generation Method of Immunological Memory in Clonal Selection Algorithm by using Restricted Boltzmann Machines
Authors Shin Kamada, Takumi Ichimura
Abstract Recently, advanced image processing techniques have been required to extract image features in real time. In our research, the tourist subject data are collected from the Mobile Phone based Participatory Sensing (MPPS) system. Each record consists of image files with GPS, geographic location name, user’s numerical evaluation, and comments written in natural language at sightseeing spots that a user actually visits. In our previous research, the famous landmarks in sightseeing spots could be detected by Clonal Selection Algorithm with Immunological Memory Cell (CSAIM). However, some landmarks were not detected correctly by the previous method because they didn’t have enough information for the feature extraction. In order to address this weakness, we propose a generation method of immunological memory by Restricted Boltzmann Machines. To verify the effectiveness of the method, some experiments for classification of the subjective data are executed by using machine learning tools for Deep Learning.
Tasks
Published 2018-04-09
URL http://arxiv.org/abs/1804.02816v1
PDF http://arxiv.org/pdf/1804.02816v1.pdf
PWC https://paperswithcode.com/paper/a-generation-method-of-immunological-memory
Repo
Framework

LoAdaBoost: Loss-Based AdaBoost Federated Machine Learning on medical Data

Title LoAdaBoost: Loss-Based AdaBoost Federated Machine Learning on medical Data
Authors Li Huang, Yifeng Yin, Zeng Fu, Shifa Zhang, Hao Deng, Dianbo Liu
Abstract Medical data are valuable for improvement of health care, policy making and many other purposes. Vast amounts of medical data are stored in different locations, on many different devices and in different data silos. Sharing medical data among different sources is a big challenge due to regulatory, operational and security reasons. One potential solution is federated machine learning, which is a method that sends machine learning algorithms simultaneously to all data sources, trains models in each source and aggregates the learned models. This strategy allows utilization of valuable data without moving them. One challenge in applying federated machine learning is the heterogeneity of data from different sources. To tackle this problem, we proposed an adaptive boosting method that increases the efficiency of federated machine learning. Using intensive care unit data from hospital, we showed that LoAdaBoost federated learning outperformed the baseline method and increased communication efficiency at negligible additional cost.
Tasks
Published 2018-11-30
URL https://arxiv.org/abs/1811.12629v3
PDF https://arxiv.org/pdf/1811.12629v3.pdf
PWC https://paperswithcode.com/paper/loadaboostloss-based-adaboost-federated
Repo
Framework
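
The abstract above describes federated learning as training a model at each data source and then aggregating the learned models. The sketch below illustrates only that aggregation step, using a plain sample-size-weighted average of parameter vectors in the style of federated averaging. It is not the loss-based LoAdaBoost procedure, whose details are not given in the abstract; the function name `federated_average` and the flat-list representation of model weights are assumptions for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample counts.

    Generic FedAvg-style aggregation, shown as a baseline illustration.
    `client_weights` is a list of equal-length lists of floats and
    `client_sizes` holds the number of training samples at each client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one non-empty weight vector per client size")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical hospitals: one with 1 record, one with 3 records.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only model parameters cross the network in this scheme, which is what lets the raw medical records stay at their source.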

Two-layer Lossless HDR Coding considering Histogram Sparseness with Backward Compatibility to JPEG

Title Two-layer Lossless HDR Coding considering Histogram Sparseness with Backward Compatibility to JPEG
Authors Osamu Watanabe, Hiroyuki Kobayashi, Hitoshi Kiya
Abstract An efficient two-layer coding method using the histogram packing technique with backward compatibility to legacy JPEG is proposed in this paper. JPEG XT, which is the international standard to compress HDR images, adopts a two-layer coding scheme for backward compatibility to legacy JPEG. However, this two-layer coding structure does not give better lossless performance than the other existing single-layer coding methods for HDR images. Moreover, JPEG XT has problems in determining the lossless coding parameters; finding an appropriate combination of the parameter values is necessary to achieve good lossless performance. The histogram sparseness of HDR images is discussed, and it is pointed out that the histogram packing technique considering the sparseness is able to improve the performance of lossless compression for HDR images. A novel two-layer coding with the histogram packing technique is then proposed. The experimental results demonstrate that the proposed method not only has better lossless compression performance than JPEG XT, but also requires no image-dependent parameter values for good compression performance, while retaining backward compatibility to the well-known legacy JPEG standard.
Tasks
Published 2018-06-28
URL http://arxiv.org/abs/1806.10746v1
PDF http://arxiv.org/pdf/1806.10746v1.pdf
PWC https://paperswithcode.com/paper/two-layer-lossless-hdr-coding-considering
Repo
Framework
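
The abstract above relies on histogram packing for images whose histograms are sparse. The sketch below shows the general idea under simple assumptions: the pixel values that actually occur are mapped to consecutive indices, and the sorted table of occurring values is kept so the mapping can be inverted exactly. The names `pack_histogram` and `unpack_histogram` are hypothetical, and the way the paper combines packing with its two-layer codec is not reproduced here.

```python
def pack_histogram(pixels):
    """Map the pixel values that occur to dense indices 0..k-1.

    Returns the packed pixel list and the sorted table of original values,
    which serves as the side information needed for lossless unpacking.
    """
    levels = sorted(set(pixels))
    index_of = {value: i for i, value in enumerate(levels)}
    return [index_of[value] for value in pixels], levels


def unpack_histogram(packed, levels):
    """Invert pack_histogram by looking each index up in the value table."""
    return [levels[i] for i in packed]


if __name__ == "__main__":
    # A toy 16-bit image row that uses only four distinct values.
    row = [0, 1024, 4096, 1024, 65535]
    packed, levels = pack_histogram(row)
    print(packed, levels)
    print(unpack_histogram(packed, levels) == row)
```

After packing, the values span a much narrower range, which is the property a subsequent lossless coder can exploit.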

Unsupervised learning for cross-domain medical image synthesis using deformation invariant cycle consistency networks

Title Unsupervised learning for cross-domain medical image synthesis using deformation invariant cycle consistency networks
Authors Chengjia Wang, Gillian Macnaught, Giorgos Papanastasiou, Tom MacGillivray, David Newby
Abstract Recently, cycle-consistent generative adversarial networks (CycleGAN) have been widely used for synthesis of multi-domain medical images. The domain-specific nonlinear deformations captured by CycleGAN make the synthesized images difficult to use for some applications, for example, generating pseudo-CT for PET-MR attenuation correction. This paper presents a deformation-invariant CycleGAN (DicycleGAN) method using deformable convolutional layers and new cycle-consistency losses. Its robustness in dealing with data that suffer from domain-specific nonlinear deformations has been evaluated through comparison experiments performed on a multi-sequence brain MR dataset and a multi-modality abdominal dataset. Our method has displayed its ability to generate synthesized data that is aligned with the source while maintaining a proper quality of signal compared to CycleGAN-generated data. The proposed model also obtained comparable performance with CycleGAN when data from the source and target domains are alignable through simple affine transformations.
Tasks Image Generation
Published 2018-08-12
URL http://arxiv.org/abs/1808.03944v1
PDF http://arxiv.org/pdf/1808.03944v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-for-cross-domain
Repo
Framework

Many-Goals Reinforcement Learning

Title Many-Goals Reinforcement Learning
Authors Vivek Veeriah, Junhyuk Oh, Satinder Singh
Abstract All-goals updating exploits the off-policy nature of Q-learning to update all possible goals an agent could have from each transition in the world, and was introduced into Reinforcement Learning (RL) by Kaelbling (1993). In prior work this was mostly explored in small-state RL problems that allowed tabular representations and where all possible goals could be explicitly enumerated and learned separately. In this paper we empirically explore 3 different extensions of the idea of updating many (instead of all) goals in the context of RL with deep neural networks (or DeepRL for short). First, in a direct adaptation of Kaelbling’s approach we explore if many-goals updating can be used to achieve mastery in non-tabular visual-observation domains. Second, we explore whether many-goals updating can be used to pre-train a network to subsequently learn faster and better on a single main task of interest. Third, we explore whether many-goals updating can be used to provide auxiliary task updates in training a network to learn faster and better on a single main task of interest. We provide comparisons to baselines for each of the 3 extensions.
Tasks Q-Learning
Published 2018-06-22
URL http://arxiv.org/abs/1806.09605v1
PDF http://arxiv.org/pdf/1806.09605v1.pdf
PWC https://paperswithcode.com/paper/many-goals-reinforcement-learning
Repo
Framework
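
The abstract above describes all-goals updating: because Q-learning is off-policy, a single observed transition can be used to update the value estimates for every goal. The tabular sketch below illustrates that idea only. The reward convention (1 when the next state equals the goal, 0 otherwise, with the goal treated as terminal), the dictionary-based Q-table keyed by `(goal, state, action)`, and the name `all_goals_update` are assumptions for this example; the paper itself studies deep-network variants that update many rather than all goals.

```python
from collections import defaultdict


def all_goals_update(q, state, action, next_state, goals, actions,
                     alpha=0.1, gamma=0.9):
    """Apply one Q-learning update per goal from a single transition.

    `q` maps (goal, state, action) to a value estimate.
    """
    for goal in goals:
        reached = next_state == goal
        reward = 1.0 if reached else 0.0
        # Reaching the goal ends the episode for that goal, so no bootstrap.
        bootstrap = 0.0 if reached else max(
            q[(goal, next_state, a)] for a in actions
        )
        target = reward + gamma * bootstrap
        key = (goal, state, action)
        q[key] += alpha * (target - q[key])


if __name__ == "__main__":
    q = defaultdict(float)
    # One step from state 0 to state 1 updates the tables for goals 1 and 2.
    all_goals_update(q, 0, "right", 1, goals=[1, 2],
                     actions=["left", "right"], alpha=0.5)
    print(q[(1, 0, "right")], q[(2, 0, "right")])
```

The behaviour policy that produced the transition never appears in the update, which is why one stream of experience can serve every goal at once.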

Linear Transformations for Cross-lingual Semantic Textual Similarity

Title Linear Transformations for Cross-lingual Semantic Textual Similarity
Authors Tomáš Brychcín
Abstract Cross-lingual semantic textual similarity systems estimate the degree of meaning similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ machine translation and combine a vast number of features, making the approach strongly supervised, resource rich, and difficult to use for poorly-resourced languages. In this paper, we study linear transformations, which project monolingual semantic spaces into a shared space using bilingual dictionaries. We propose a novel transformation, which builds on the best ideas from prior works. We experiment with unsupervised techniques for sentence similarity based only on semantic spaces and we show they can be significantly improved by word weighting. Our transformation outperforms other methods and together with word weighting leads to very promising results on several datasets in different languages.
Tasks Cross-Lingual Semantic Textual Similarity, Machine Translation, Semantic Textual Similarity
Published 2018-07-11
URL http://arxiv.org/abs/1807.04172v1
PDF http://arxiv.org/pdf/1807.04172v1.pdf
PWC https://paperswithcode.com/paper/linear-transformations-for-cross-lingual
Repo
Framework
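
The abstract above combines two ingredients: a linear transformation that projects one language's semantic space into a shared space, and word weighting when comparing sentences. The sketch below illustrates how such pieces could fit together on toy two-dimensional vectors. The transformation matrix is given rather than learned, and the embeddings, the weights, and the helper names `transform`, `sentence_vector`, and `cosine` are all assumptions for illustration; the paper's own transformation and weighting scheme are not specified in the abstract.

```python
import math


def transform(vec, matrix):
    """Apply a linear map (list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def sentence_vector(words, embeddings, weights):
    """Weighted sum of the word vectors found in `embeddings`."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in words:
        if word in embeddings:
            w = weights.get(word, 1.0)  # unseen weights default to 1.0
            for i, x in enumerate(embeddings[word]):
                total[i] += w * x
    return total


def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Toy spaces: the Spanish space is the English one rotated by 90 degrees.
    en = {"dog": [1.0, 0.0], "runs": [0.0, 1.0]}
    es = {"perro": [0.0, 1.0], "corre": [-1.0, 0.0]}
    es_to_en = [[0.0, 1.0], [-1.0, 0.0]]  # assumed, not learned here
    weights = {"dog": 2.0, "perro": 2.0}
    u = sentence_vector(["dog", "runs"], en, weights)
    v = transform(sentence_vector(["perro", "corre"], es, weights), es_to_en)
    print(cosine(u, v))
```

In practice the matrix would be estimated from bilingual dictionary pairs, for example by least squares, and the weights would come from corpus statistics; both steps are outside this sketch.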

Low-Resolution Face Recognition

Title Low-Resolution Face Recognition
Authors Zhiyi Cheng, Xiatian Zhu, Shaogang Gong
Abstract Whilst recent face-recognition (FR) techniques have made significant progress on recognising constrained high-resolution web images, the same cannot be said of natively unconstrained low-resolution images at large scales. In this work, we systematically examine this under-studied FR problem, and introduce a novel Complement Super-Resolution and Identity (CSRI) joint deep learning method with a unified end-to-end network architecture. We further construct a new large-scale dataset TinyFace of native unconstrained low-resolution face images from selected public datasets, because no benchmark of this nature exists in the literature. With extensive experiments we show there is a significant gap between the reported FR performances on popular benchmarks and the results on TinyFace, and the advantages of the proposed CSRI over a variety of state-of-the-art FR and super-resolution deep models on solving this largely ignored FR scenario. The TinyFace dataset is released publicly at: https://qmul-tinyface.github.io/.
Tasks Face Recognition, Super-Resolution
Published 2018-11-21
URL http://arxiv.org/abs/1811.08965v2
PDF http://arxiv.org/pdf/1811.08965v2.pdf
PWC https://paperswithcode.com/paper/low-resolution-face-recognition
Repo
Framework

MRI Cross-Modality NeuroImage-to-NeuroImage Translation

Title MRI Cross-Modality NeuroImage-to-NeuroImage Translation
Authors Qianye Yang, Nannan Li, Zixu Zhao, Xingyu Fan, Eric I-Chao Chang, Yan Xu
Abstract We present a cross-modality generation framework that learns to generate translated modalities from given modalities in MR images without real acquisition. Our proposed method performs NeuroImage-to-NeuroImage translation (abbreviated as N2N) by means of a deep learning model that leverages conditional generative adversarial networks (cGANs). Our framework jointly exploits the low-level features (pixel-wise information) and high-level representations (e.g. brain tumors, brain structure like gray matter, etc.) between cross modalities which are important for resolving the challenging complexity in brain structures. Our framework can serve as an auxiliary method in clinical diagnosis and has great application potential. Based on our proposed framework, we first propose a method for cross-modality registration by fusing the deformation fields to adopt the cross-modality information from translated modalities. Second, we propose an approach for MRI segmentation, translated multichannel segmentation (TMS), where given modalities, along with translated modalities, are segmented by fully convolutional networks (FCN) in a multichannel manner. Both of these two methods successfully adopt the cross-modality information to improve the performance without adding any extra data. Experiments demonstrate that our proposed framework advances the state-of-the-art on five brain MRI datasets. We also observe encouraging results in cross-modality registration and segmentation on some widely adopted brain datasets. Overall, our work can serve as an auxiliary method in clinical diagnosis and be applied to various tasks in medical fields. Keywords: image-to-image, cross-modality, registration, segmentation, brain MRI
Tasks
Published 2018-01-22
URL http://arxiv.org/abs/1801.06940v2
PDF http://arxiv.org/pdf/1801.06940v2.pdf
PWC https://paperswithcode.com/paper/mri-cross-modality-neuroimage-to-neuroimage
Repo
Framework

Testing Untestable Neural Machine Translation: An Industrial Case

Title Testing Untestable Neural Machine Translation: An Industrial Case
Authors Wujie Zheng, Wenyu Wang, Dian Liu, Changrong Zhang, Qinsong Zeng, Yuetang Deng, Wei Yang, Pinjia He, Tao Xie
Abstract Neural Machine Translation (NMT) has been widely adopted recently due to its advantages compared with the traditional Statistical Machine Translation (SMT). However, an NMT system still often produces translation failures due to the complexity of natural language and sophistication in designing neural networks. While in-house black-box system testing based on reference translations (i.e., examples of valid translations) has been a common practice for NMT quality assurance, an increasingly critical industrial practice, named in-vivo testing, exposes unseen types or instances of translation failures when real users are using a deployed industrial NMT system. To fill the gap of lacking test oracle for in-vivo testing of an NMT system, in this paper, we propose a new approach for automatically identifying translation failures, without requiring reference translations for a translation task; our approach can directly serve as a test oracle for in-vivo testing. Our approach focuses on properties of natural language translation that can be checked systematically and uses information from both the test inputs (i.e., the texts to be translated) and the test outputs (i.e., the translations under inspection) of the NMT system. Our evaluation conducted on real-world datasets shows that our approach can effectively detect targeted property violations as translation failures. Our experiences on deploying our approach in both production and development environments of WeChat (a messenger app with over one billion monthly active users) demonstrate high effectiveness of our approach along with high industry impact.
Tasks Machine Translation
Published 2018-07-06
URL http://arxiv.org/abs/1807.02340v2
PDF http://arxiv.org/pdf/1807.02340v2.pdf
PWC https://paperswithcode.com/paper/testing-untestable-neural-machine-translation
Repo
Framework

A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis

Title A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis
Authors Salman Razzaki, Adam Baker, Yura Perov, Katherine Middleton, Janie Baxter, Daniel Mullarkey, Davinder Sangar, Michael Taliercio, Mobasher Butt, Azeem Majeed, Arnold DoRosario, Megan Mahoney, Saurabh Johri
Abstract Online symptom checkers have significant potential to improve patient care, however their reliability and accuracy remain variable. We hypothesised that an artificial intelligence (AI) powered triage and diagnostic system would compare favourably with human doctors with respect to triage and diagnostic accuracy. We performed a prospective validation study of the accuracy and safety of an AI powered triage and diagnostic system. Identical cases were evaluated by both an AI system and human doctors. Differential diagnoses and triage outcomes were evaluated by an independent judge, who was blinded from knowing the source (AI system or human doctor) of the outcomes. Independently of these cases, vignettes from publicly available resources were also assessed to provide a benchmark to previous studies and the diagnostic component of the MRCGP exam. Overall we found that the Babylon AI powered Triage and Diagnostic System was able to identify the condition modelled by a clinical vignette with accuracy comparable to human doctors (in terms of precision and recall). In addition, we found that the triage advice recommended by the AI System was, on average, safer than that of human doctors, when compared to the ranges of acceptable triage provided by independent expert judges, with only a minimal reduction in appropriateness.
Tasks
Published 2018-06-27
URL http://arxiv.org/abs/1806.10698v1
PDF http://arxiv.org/pdf/1806.10698v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-of-artificial
Repo
Framework