January 29, 2020

Paper Group ANR 569

A Predictive Model for Steady-State Multiphase Pipe Flow: Machine Learning on Lab Data. Learning Target-oriented Dual Attention for Robust RGB-T Tracking. Image-based reconstruction for the impact problems by using DPNNs. Problems with automating translation of movie/TV show subtitles. Falls Prediction Based on Body Keypoints and Seq2Seq Architectu …

A Predictive Model for Steady-State Multiphase Pipe Flow: Machine Learning on Lab Data

Title A Predictive Model for Steady-State Multiphase Pipe Flow: Machine Learning on Lab Data
Authors Evgenii Kanin, Andrei Osiptsov, Albert Vainshtein, Evgeny Burnaev
Abstract Engineering simulators used for steady-state multiphase pipe flows are commonly utilized to predict pressure drop. Such simulators are typically based on either empirical correlations or first-principles mechanistic models. The simulators allow evaluating the pressure drop in multiphase pipe flow with acceptable accuracy. However, the only shortcoming of these correlations and mechanistic models is their limited applicability. In order to extend the applicability and the accuracy of the existing accessible methods, a method of pressure drop calculation in the pipeline is proposed. The method is based on well segmentation and calculation of the pressure gradient in each segment using three surrogate models based on Machine Learning algorithms trained on a representative lab data set from the open literature. The first model predicts the value of the liquid holdup in the segment, the second one determines the flow pattern, and the third one is used to estimate the pressure gradient. To build these models, several ML algorithms are trained, such as Random Forest, Gradient Boosting Decision Trees, Support Vector Machine, and Artificial Neural Network, and their predictive abilities are cross-compared. The proposed method for pressure gradient calculation yields $R^2 = 0.95$ using the Gradient Boosting algorithm, compared with $R^2 = 0.92$ for the Mukherjee and Brill correlation and $R^2 = 0.91$ when a combination of the Ansari and Xiao mechanistic models is utilized. The method for pressure drop prediction is also validated on three real field cases. Validation indicates that the proposed model yields the following coefficients of determination: $R^2 = 0.806$, $0.815$, and $0.99$, compared with the highest values obtained by commonly used techniques: $R^2 = 0.82$ (Beggs and Brill correlation), $R^2 = 0.823$ (Mukherjee and Brill correlation), and $R^2 = 0.98$ (Beggs and Brill correlation).
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09746v1
PDF https://arxiv.org/pdf/1905.09746v1.pdf
PWC https://paperswithcode.com/paper/a-predictive-model-for-steady-state
Repo
Framework
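
The segment-wise calculation described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `pressure_drop`, the way the three models are chained (holdup feeds the flow-pattern model, and both feed the gradient model), and the choice of pressure as the only input feature are all hypothetical, not taken from the paper. The three callables stand in for trained surrogate models.

```python
from typing import Callable, Sequence


def pressure_drop(
    inlet_pressure: float,
    segment_lengths: Sequence[float],
    holdup_model: Callable[[float], float],
    pattern_model: Callable[[float, float], int],
    gradient_model: Callable[[float, float, int], float],
) -> float:
    """March along the pipe and accumulate the pressure drop segment by segment.

    All three models are placeholders (assumptions) for trained surrogates.
    """
    pressure = inlet_pressure
    for length in segment_lengths:
        holdup = holdup_model(pressure)                        # model 1: liquid holdup
        pattern = pattern_model(pressure, holdup)              # model 2: flow pattern label
        gradient = gradient_model(pressure, holdup, pattern)   # model 3: pressure gradient
        pressure -= gradient * length                          # update pressure at segment outlet
    return inlet_pressure - pressure


# Stand-in models with constant outputs, purely for demonstration.
example_drop = pressure_drop(
    inlet_pressure=1.0e6,
    segment_lengths=[10.0, 20.0, 30.0],
    holdup_model=lambda p: 0.3,
    pattern_model=lambda p, h: 0,
    gradient_model=lambda p, h, k: 100.0,
)
```

With a constant gradient the result is simply the gradient times the total length; the marching loop only matters once the models depend on the local pressure.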

Learning Target-oriented Dual Attention for Robust RGB-T Tracking

Title Learning Target-oriented Dual Attention for Robust RGB-T Tracking
Authors Rui Yang, Yabin Zhu, Xiao Wang, Chenglong Li, Jin Tang
Abstract RGB-Thermal object tracking attempts to locate a target object using complementary visual and thermal infrared data. Existing RGB-T trackers fuse different modalities by robust feature representation learning or adaptive modal weighting. However, how to integrate a dual attention mechanism into visual tracking has not yet been studied. In this paper, we propose two visual attention mechanisms for robust RGB-T object tracking. Specifically, the local attention is implemented by exploiting the common visual attention of RGB and thermal data to train deep classifiers. We also introduce the global attention, which is a multi-modal target-driven attention estimation network. It can provide global proposals for the classifier together with local proposals extracted from the previous tracking result. Extensive experiments on two RGB-T benchmark datasets validate the effectiveness of our proposed algorithm.
Tasks Object Tracking, Representation Learning, Rgb-T Tracking, Visual Tracking
Published 2019-08-12
URL https://arxiv.org/abs/1908.04441v1
PDF https://arxiv.org/pdf/1908.04441v1.pdf
PWC https://paperswithcode.com/paper/learning-target-oriented-dual-attention-for
Repo
Framework

Image-based reconstruction for the impact problems by using DPNNs

Title Image-based reconstruction for the impact problems by using DPNNs
Authors Yu Li, Hu Wang, Wenquan Shuai, Honghao Zhang, Yong Peng
Abstract With the improvement of the pattern recognition and feature extraction of Deep Neural Networks (DPNNs), image-based design and optimization have been widely used in multidisciplinary research. Recently, a Reconstructive Neural Network (ReConNN) has been proposed to obtain an image-based model from an analysis-based model [1, 2], and a steady-state heat transfer of a heat sink has been successfully reconstructed. Commonly, this method is suitable for handling steady-state problems. However, it has difficulties handling nonlinear transient impact problems, due to the bottlenecks of the Deep Neural Network (DPNN). For example, nonlinear transient problems make it difficult for the Generative Adversarial Network (GAN) to generate various reasonable images. Therefore, in this study, an improved ReConNN method is proposed to address the mentioned weaknesses. Time-dependent ordered images can be generated. Furthermore, the improved method is successfully applied in an impact simulation case and an engineering experiment. Through the experiments, comparisons, and analyses, the improved method is demonstrated to outperform the former one in terms of accuracy, efficiency, and cost.
Tasks
Published 2019-04-08
URL https://arxiv.org/abs/1905.03229v3
PDF https://arxiv.org/pdf/1905.03229v3.pdf
PWC https://paperswithcode.com/paper/190503229
Repo
Framework

Problems with automating translation of movie/TV show subtitles

Title Problems with automating translation of movie/TV show subtitles
Authors Prabhakar Gupta, Mayank Sharma, Kartik Pitale, Keshav Kumar
Abstract We present 27 problems encountered in automating the translation of movie/TV show subtitles. We categorize each problem into one of three categories, viz. problems directly related to textual translation, problems related to subtitle creation guidelines, and problems due to adaptability of machine translation (MT) engines. We also present the findings of a translation quality evaluation experiment where we share the frequency of 16 key problems. We show that the systems working at the frontiers of Natural Language Processing do not perform well for subtitles and require some post-processing solutions for redressal of these problems.
Tasks Machine Translation
Published 2019-09-04
URL https://arxiv.org/abs/1909.05362v1
PDF https://arxiv.org/pdf/1909.05362v1.pdf
PWC https://paperswithcode.com/paper/problems-with-automating-translation-of
Repo
Framework

Falls Prediction Based on Body Keypoints and Seq2Seq Architecture

Title Falls Prediction Based on Body Keypoints and Seq2Seq Architecture
Authors Minjie Hua, Yibing Nan, Shiguo Lian
Abstract This paper presents a novel approach for predicting the falls of people in advance from monocular video. First, all persons in the observed frames are detected and tracked, and the coordinates of their body keypoints are extracted at the same time. A keypoints vectorization method is exploited to eliminate irrelevant information in the initial coordinate representation. Then, the observed keypoint sequence of each person is input to the pose prediction module, adapted from the sequence-to-sequence (seq2seq) architecture, to predict the future keypoint sequence. Finally, the predicted pose is analyzed by the falls classifier to judge whether the person will fall down in the future. The pose prediction module and falls classifier are trained separately and tuned jointly using the Le2i dataset, which contains 191 videos of various normal daily activities as well as falls performed by several actors. The contrast experiments with mainstream raw RGB-based models show the accuracy improvement of utilizing body keypoints in falls classification. Moreover, the precognition of falls is shown to be effective by comparing models with and without the pose prediction module.
Tasks Pose Prediction
Published 2019-08-01
URL https://arxiv.org/abs/1908.00275v2
PDF https://arxiv.org/pdf/1908.00275v2.pdf
PWC https://paperswithcode.com/paper/falls-prediction-based-on-body-keypoints-and
Repo
Framework

Style transfer-based image synthesis as an efficient regularization technique in deep learning

Title Style transfer-based image synthesis as an efficient regularization technique in deep learning
Authors Agnieszka Mikołajczyk, Michał Grochowski
Abstract These days deep learning is the fastest-growing area in the field of Machine Learning. Convolutional Neural Networks are currently the main tool used for image analysis and classification purposes. Despite great achievements and prospects, deep neural networks and accompanying learning algorithms have some relevant challenges to tackle. In this paper, we have focused on the most frequently mentioned problem in the field of machine learning, that is, relatively poor generalization abilities. Partial remedies for this are regularization techniques, e.g. dropout, batch normalization, weight decay, transfer learning, early stopping, and data augmentation. In this paper, we have focused on data augmentation. We propose to use a method based on neural style transfer, which allows generating new unlabeled images of high perceptual quality that combine the content of a base image with the appearance of another one. In the proposed approach, the newly created images are described with pseudo-labels and then used as a training dataset. Real, labeled images are divided into the validation and test sets. We validated the proposed method on a challenging skin lesion classification case study. Four representative neural architectures are examined. The obtained results show the strong potential of the proposed approach.
Tasks Data Augmentation, Image Generation, Skin Lesion Classification, Style Transfer, Transfer Learning
Published 2019-05-27
URL https://arxiv.org/abs/1905.10974v1
PDF https://arxiv.org/pdf/1905.10974v1.pdf
PWC https://paperswithcode.com/paper/style-transfer-based-image-synthesis-as-an
Repo
Framework

RTOP: A Conceptual and Computational Framework for General Intelligence

Title RTOP: A Conceptual and Computational Framework for General Intelligence
Authors Shilpesh Garg
Abstract A novel general intelligence model is proposed with three types of learning. A unified sequence of the foreground percept trace and the command trace translates into direct and time-hop observation paths to form the basis of Raw learning. Raw learning includes the formation of image-image associations, which lead to the perception of temporal and spatial relationships among objects and object parts; and the formation of image-audio associations, which serve as the building blocks of language. Offline identification of similar segments in the observation paths and their subsequent reduction into a common segment through merging of memory nodes leads to Generalized learning. Generalization includes the formation of interpolated sensory nodes for robust and generic matching, the formation of sensory properties nodes for specific matching and superimposition, and the formation of group nodes for simpler logic pathways. Online superimposition of memory nodes across multiple predictions, primarily the superimposition of images on the internal projection canvas, gives rise to Innovative learning and thought. The learning of actions happens the same way as raw learning while the action determination happens through the utility model built into the raw learnings, the utility function being the pleasure and pain of the physical senses.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10393v2
PDF https://arxiv.org/pdf/1910.10393v2.pdf
PWC https://paperswithcode.com/paper/rtop-a-conceptual-and-computational-framework
Repo
Framework

Unimodal-uniform Constrained Wasserstein Training for Medical Diagnosis

Title Unimodal-uniform Constrained Wasserstein Training for Medical Diagnosis
Authors Xiaofeng Liu, Xu Han, Yukai Qiao, Yi Ge, Lu Jun
Abstract The labels in medical diagnosis tasks are usually discrete and successively distributed. For example, the Diabetic Retinopathy Diagnosis (DR) involves five health risk levels: no DR (0), mild DR (1), moderate DR (2), severe DR (3) and proliferative DR (4). This labeling system is common for medical disease. Previous methods usually construct a multi-binary-classification task or propose some re-parameter schemes in the output unit. In this paper, we approach this task from the perspective of the loss function. More specifically, the Wasserstein distance is utilized as an alternative, explicitly incorporating the inter-class correlations by pre-defining its ground metric. Then, the ground metric, which serves as a linear, convex, or concave increasing function w.r.t. the Euclidean distance in a line, is explored from an optimization perspective. Meanwhile, this paper also proposes constructing smoothed target labels that model the inlier and outlier noises by using a unimodal-uniform mixture distribution. Different from the one-hot setting, the smoothed label endows the computation of the Wasserstein distance with more challenging features. With either one-hot or smoothed target label, this paper systematically derives a practical closed-form solution. We evaluate our method on several medical diagnosis tasks (e.g., Diabetic Retinopathy and Ultrasound Breast dataset) and achieve state-of-the-art performance.
Tasks Medical Diagnosis
Published 2019-11-03
URL https://arxiv.org/abs/1911.02475v1
PDF https://arxiv.org/pdf/1911.02475v1.pdf
PWC https://paperswithcode.com/paper/unimodal-uniform-constrained-wasserstein
Repo
Framework
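
To make the loss concrete, here is a minimal sketch of the Wasserstein distance on an ordered class line. With a one-hot target, all predicted mass must be transported to the target class, so the cost reduces to a weighted sum of ground distances. For the linear ground metric with unit class spacing, the distance between two arbitrary distributions equals the summed absolute difference of their cumulative distributions. The function names and the `power` parameter (1 for linear, above 1 for convex, below 1 for concave) are illustrative assumptions; the paper's own closed forms for smoothed labels are not reproduced here.

```python
import numpy as np


def wasserstein_one_hot(probs, target, power=1.0):
    """Cost of moving all predicted mass to class `target`.

    Ground metric (assumed form): |i - target| ** power.
    """
    probs = np.asarray(probs, dtype=float)
    classes = np.arange(probs.size)
    return float(np.sum(probs * np.abs(classes - target) ** power))


def wasserstein_1d(probs, target_probs):
    """W1 between two distributions on classes 0..N-1 with unit spacing,
    computed from the difference of the cumulative distributions."""
    p_cdf = np.cumsum(np.asarray(probs, dtype=float))
    q_cdf = np.cumsum(np.asarray(target_probs, dtype=float))
    return float(np.sum(np.abs(p_cdf - q_cdf)))


# Five DR levels; the prediction is centred on level 2.
prediction = [0.1, 0.2, 0.4, 0.2, 0.1]
loss_one_hot = wasserstein_one_hot(prediction, target=2)
loss_cdf = wasserstein_1d(prediction, [0.0, 0.0, 1.0, 0.0, 0.0])
```

For a one-hot target and the linear metric the two functions agree, which is a quick sanity check on either implementation. Unlike cross-entropy, the loss grows with how far the misplaced probability mass lies from the true level.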

Inference of visual field test performance from OCT volumes using deep learning

Title Inference of visual field test performance from OCT volumes using deep learning
Authors Stefan Maetschke, Bhavna Antony, Hiroshi Ishikawa, Gadi Wollstein, Joel Schuman, Rahil Garnavi
Abstract Visual field tests (VFT) are pivotal for glaucoma diagnosis and conducted regularly to monitor disease progression. Here we address the question to what degree aggregate VFT measurements such as Visual Field Index (VFI) and Mean Deviation (MD) can be inferred from Optical Coherence Tomography (OCT) scans of the Optic Nerve Head (ONH) or the macula. Accurate inference of VFT measurements from OCT could reduce examination time and cost. We propose a novel 3D Convolutional Neural Network (CNN) for this task and compare its accuracy with classical machine learning (ML) algorithms trained on common, segmentation-based OCT features employed for glaucoma diagnostics. Peak accuracies were achieved on ONH scans when inferring VFI with a Pearson Correlation (PC) of 0.88$\pm$0.035 for the CNN and a significantly lower (p $<$ 0.01) PC of 0.74$\pm$0.090 for the best-performing classical ML algorithm, a Random Forest regressor. Estimation of MD was equally accurate with a PC of 0.88$\pm$0.023 on ONH scans for the CNN.
Tasks
Published 2019-08-05
URL https://arxiv.org/abs/1908.01428v3
PDF https://arxiv.org/pdf/1908.01428v3.pdf
PWC https://paperswithcode.com/paper/inference-of-visual-field-test-performance
Repo
Framework

A New Compensatory Genetic Algorithm-Based Method for Effective Compressed Multi-function Convolutional Neural Network Model Selection with Multi-Objective Optimization

Title A New Compensatory Genetic Algorithm-Based Method for Effective Compressed Multi-function Convolutional Neural Network Model Selection with Multi-Objective Optimization
Authors Luna M. Zhang
Abstract In recent years, there have been many popular Convolutional Neural Networks (CNNs), such as Google’s Inception-V4, that have performed very well for various image classification problems. These commonly used CNN models usually use the same activation function, such as ReLU, for all neurons in the convolutional layers; they are “Single-function CNNs” (SCNNs). However, SCNNs may not always be optimal. Thus, a “Multi-function CNN” (MCNN), which uses different activation functions for different neurons, has been shown to outperform an SCNN. Also, CNNs typically have very large architectures that use a lot of memory and need a lot of data in order to be trained well. As a result, they tend to have very high training and prediction times too. An important research problem is how to automatically and efficiently find the best CNN with both high classification performance and compact architecture with high training and prediction speeds, small power usage, and small memory size for any image classification problem. It is very useful to intelligently find an effective, fast, energy-efficient, and memory-efficient “Compressed Multi-function CNN” (CMCNN) from a large number of candidate MCNNs. A new compensatory algorithm using a new genetic algorithm (GA) is created to find the best CMCNN with an ideal compensation between performance and architecture size. The optimal CMCNN has the best performance and the smallest architecture size. Simulations using the CIFAR10 dataset showed that the new compensatory algorithm could find CMCNNs that could outperform non-compressed MCNNs in terms of classification performance (F1-score), speed, power usage, and memory usage. Other effective, fast, power-efficient, and memory-efficient CMCNNs based on popular CNN architectures will be developed for image classification problems in important real-world applications, such as brain informatics and biomedical imaging.
Tasks Image Classification, Model Selection
Published 2019-06-08
URL https://arxiv.org/abs/1906.11912v1
PDF https://arxiv.org/pdf/1906.11912v1.pdf
PWC https://paperswithcode.com/paper/a-new-compensatory-genetic-algorithm-based
Repo
Framework

Sinusoidal wave generating network based on adversarial learning and its application: synthesizing frog sounds for data augmentation

Title Sinusoidal wave generating network based on adversarial learning and its application: synthesizing frog sounds for data augmentation
Authors Sangwook Park, David K. Han, Hanseok Ko
Abstract Simulators that generate observations based on theoretical models can be important tools for development, prediction, and assessment of signal processing algorithms. In order to design these simulators, painstaking effort is required to construct mathematical models according to their application. Complex models are sometimes necessary to represent a variety of real phenomena. In contrast, obtaining synthetic observations from generative models developed from real observations often requires much less effort. This paper proposes a generative model based on adversarial learning. Given that observations are typically signals composed of a linear combination of sinusoidal waves and random noise, sinusoidal wave generating networks are first designed based on an adversarial network. Audio waveform generation can then be performed using the proposed network. Several approaches to designing the objective function of the proposed network using adversarial learning are investigated experimentally. In addition, amphibian sound classification is performed using a convolutional neural network trained with real and synthetic sounds. Both qualitative and quantitative results show that the proposed generative model makes realistic signals and is very helpful for data augmentation and data analysis.
Tasks Data Augmentation
Published 2019-01-07
URL http://arxiv.org/abs/1901.02050v1
PDF http://arxiv.org/pdf/1901.02050v1.pdf
PWC https://paperswithcode.com/paper/sinusoidal-wave-generating-network-based-on
Repo
Framework
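
The signal model the abstract starts from, a linear combination of sinusoids plus random noise, can be written down directly. The sketch below only illustrates that model; it is not the adversarial network from the paper. The function name `synthesize`, its parameters, and the choice of Gaussian noise are all assumptions made for illustration.

```python
import numpy as np


def synthesize(amplitudes, frequencies_hz, phases, duration_s, sample_rate_hz,
               noise_std=0.0, seed=0):
    """Return (t, x) where x is a sum of sinusoids plus Gaussian noise.

    The Gaussian noise model is an assumption; the abstract only says "random noise".
    """
    rng = np.random.default_rng(seed)
    n_samples = int(round(duration_s * sample_rate_hz))
    t = np.arange(n_samples) / sample_rate_hz
    signal = np.zeros(n_samples)
    for amplitude, frequency, phase in zip(amplitudes, frequencies_hz, phases):
        signal += amplitude * np.sin(2.0 * np.pi * frequency * t + phase)
    noise = rng.normal(0.0, noise_std, size=n_samples)
    return t, signal + noise


# One noiseless 5 Hz tone sampled at 100 Hz for one second.
t, x = synthesize([1.0], [5.0], [0.0], duration_s=1.0, sample_rate_hz=100.0)
```

In this framing, a generative network would have to learn the amplitudes, frequencies, and phases that a hand-built simulator takes as explicit inputs.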

Towards a framework for the evolution of artificial general intelligence

Title Towards a framework for the evolution of artificial general intelligence
Authors Sidney Pontes-Filho, Stefano Nichele
Abstract In this work, a novel framework for the emergence of general intelligence is proposed, where agents evolve through environmental rewards and learn throughout their lifetime without supervision, i.e., self-supervised learning through embodiment. The chosen control mechanism for agents is a biologically plausible neuron model based on spiking neural networks. Network topologies become more complex through evolution, i.e., the topology is not fixed, while the synaptic weights of the networks cannot be inherited, i.e., newborn brains are not trained and have no innate knowledge of the environment. What is subject to the evolutionary process is the network topology, the type of neurons, and the type of learning. This process ensures that controllers that are passed through the generations have the intrinsic ability to learn and adapt during their lifetime in mutable environments. We envision that the described approach may lead to the emergence of the simplest form of artificial general intelligence.
Tasks
Published 2019-03-25
URL http://arxiv.org/abs/1903.10410v3
PDF http://arxiv.org/pdf/1903.10410v3.pdf
PWC https://paperswithcode.com/paper/towards-a-framework-for-the-evolution-of
Repo
Framework

Opinion shaping in social networks using reinforcement learning

Title Opinion shaping in social networks using reinforcement learning
Authors Vivek Borkar, Alexandre Reiffers-Masson
Abstract In this paper, we study how to shape opinions in social networks when the matrix of interactions is unknown. We consider classical opinion dynamics with some stubborn agents and the possibility of continuously influencing the opinions of a few selected agents, albeit under resource constraints. We map the opinion dynamics to a value iteration scheme for policy evaluation for a specific stochastic shortest path problem. This leads to a representation of the opinion vector as an approximate value function for a stochastic shortest path problem with some non-classical constraints. We suggest two possible ways of influencing agents. One leads to a convex optimization problem and the other to a non-convex one. Firstly, for both problems, we propose two different online two-time scale reinforcement learning schemes that converge to the optimal solution of each problem. Secondly, we suggest stochastic gradient descent schemes and compare these classes of algorithms with the two-time scale reinforcement learning schemes. Thirdly, we also derive another algorithm designed to tackle the curse of dimensionality one faces when all agents are observed. Numerical studies are provided to illustrate the convergence and efficiency of our algorithms.
Tasks
Published 2019-10-19
URL https://arxiv.org/abs/1910.08802v1
PDF https://arxiv.org/pdf/1910.08802v1.pdf
PWC https://paperswithcode.com/paper/opinion-shaping-in-social-networks-using
Repo
Framework
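
A minimal sketch of opinion dynamics with stubborn agents, assuming a DeGroot-style averaging model in which each non-stubborn agent replaces its opinion with a weighted average of its neighbours' opinions while stubborn agents never change. The paper's exact model may differ, and in the paper the interaction matrix is unknown and has to be learned; here it is given, and the name `iterate_opinions` is hypothetical.

```python
import numpy as np


def iterate_opinions(weights, opinions, stubborn, steps=200):
    """Repeatedly average opinions while holding stubborn agents fixed.

    weights  : row-stochastic interaction matrix (assumed known here)
    opinions : initial opinion of every agent
    stubborn : boolean mask, True for agents that never update
    """
    w = np.asarray(weights, dtype=float)
    x = np.asarray(opinions, dtype=float).copy()
    fixed = np.asarray(stubborn, dtype=bool)
    anchor = x[fixed].copy()
    for _ in range(steps):
        x = w @ x          # every agent averages over its neighbours
        x[fixed] = anchor  # stubborn agents are reset to their own opinion
    return x


# Agents 0 and 2 are stubborn at 1.0 and 0.0; agent 1 listens to both and to itself.
weights = [[1.0, 0.0, 0.0],
           [0.25, 0.5, 0.25],
           [0.0, 0.0, 1.0]]
final = iterate_opinions(weights, [1.0, 0.0, 0.0], [True, False, True])
```

The update for the free agents has the form of a fixed-point iteration `x = A x + b`, the same shape as policy evaluation by value iteration, which is the correspondence the abstract refers to.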

Making the Cut: A Bandit-based Approach to Tiered Interviewing

Title Making the Cut: A Bandit-based Approach to Tiered Interviewing
Authors Candice Schumann, Zhi Lang, Jeffrey S. Foster, John P. Dickerson
Abstract Given a huge set of applicants, how should a firm allocate sequential resume screenings, phone interviews, and in-person site visits? In a tiered interview process, later stages (e.g., in-person visits) are more informative, but also more expensive than earlier stages (e.g., resume screenings). Using accepted hiring models and the concept of structured interviews, a best practice in human resources, we cast tiered hiring as a combinatorial pure exploration (CPE) problem in the stochastic multi-armed bandit setting. The goal is to select a subset of arms (in our case, applicants) with some combinatorial structure. We present new algorithms in both the probably approximately correct (PAC) and fixed-budget settings that select a near-optimal cohort with provable guarantees. We show via simulations on real data from one of the largest US-based computer science graduate programs that our algorithms make better hiring decisions or use less budget than the status quo.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09621v2
PDF https://arxiv.org/pdf/1906.09621v2.pdf
PWC https://paperswithcode.com/paper/making-the-cut-a-bandit-based-approach-to
Repo
Framework

Deep Contextualized Biomedical Abbreviation Expansion

Title Deep Contextualized Biomedical Abbreviation Expansion
Authors Qiao Jin, Jinling Liu, Xinghua Lu
Abstract Automatic identification and expansion of ambiguous abbreviations are essential for biomedical natural language processing applications, such as information retrieval and question answering systems. In this paper, we present the DEep Contextualized Biomedical Abbreviation Expansion (DECBAE) model. DECBAE automatically collects substantial and relatively clean annotated contexts for 950 ambiguous abbreviations from PubMed abstracts using a simple heuristic. Then it utilizes BioELMo to extract the contextualized features of words, and feeds those features to abbreviation-specific bidirectional LSTMs, where the hidden states of the ambiguous abbreviations are used to assign the exact definitions. Our DECBAE model outperforms other baselines by large margins, achieving average accuracy of 0.961 and macro-F1 of 0.917 on the dataset. It also surpasses human performance for expanding a sample abbreviation, and remains robust in imbalanced, low-resource, and clinical settings.
Tasks Information Retrieval, Question Answering
Published 2019-06-08
URL https://arxiv.org/abs/1906.03360v1
PDF https://arxiv.org/pdf/1906.03360v1.pdf
PWC https://paperswithcode.com/paper/deep-contextualized-biomedical-abbreviation
Repo
Framework
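
The abstract reports both accuracy and macro-F1, which can diverge when the candidate expansions of an abbreviation are imbalanced. The sketch below computes both from scratch for a single abbreviation; the labels are made-up placeholders, not data from the paper, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical expansions of one ambiguous abbreviation.
reference = ["sense_a", "sense_a", "sense_b", "sense_b"]
predicted = ["sense_a", "sense_b", "sense_b", "sense_b"]
acc = accuracy(reference, predicted)
f1 = macro_f1(reference, predicted)
```

Because macro-F1 weights every expansion equally regardless of how often it occurs, errors on rare senses pull it down more than they pull down accuracy.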