Paper Group ANR 515
A Transfer Learning Method for Goal Recognition Exploiting Cross-Domain Spatial Features. Cross-Subject Transfer Learning in Human Activity Recognition Systems using Generative Adversarial Networks. Interactive Trajectory Adaptation through Force-guided Bayesian Optimization. Neural Spectrum Alignment. Fluid segmentation in Neutrosophic domain. Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction. Enhanced Center Coding for Cell Detection with Convolutional Neural Networks. Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition. Idealize - A Notion of Idea Strength. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Segmentation of Levator Hiatus Using Multi-Scale Local Region Active contours and Boundary Shape Similarity Constraint. New Item Consumption Prediction Using Deep Learning. Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics. Through-Wall Pose Imaging in Real-Time with a Many-to-Many Encoder/Decoder Paradigm.
A Transfer Learning Method for Goal Recognition Exploiting Cross-Domain Spatial Features
Title | A Transfer Learning Method for Goal Recognition Exploiting Cross-Domain Spatial Features |
Authors | Thibault Duhamel, Mariane Maynard, Froduald Kabanza |
Abstract | The ability to infer the intentions of others, predict their goals, and deduce their plans is a critical feature for intelligent agents. For a long time, several approaches investigated the use of symbolic representations and inferences with limited success, principally because it is difficult to capture the cognitive knowledge behind human decisions explicitly. The trend, nowadays, is increasingly focusing on learning to infer intentions directly from data, using deep learning in particular. We are now observing interesting applications of intent classification in natural language processing, visual activity recognition, and emerging approaches in other domains. This paper discusses a novel approach combining few-shot and transfer learning with cross-domain features, to learn to infer the intent of an agent navigating in physical environments, executing arbitrarily long sequences of actions to achieve its goals. Experiments in synthetic environments demonstrate improved performance in terms of learning from few samples and generalizing to unseen configurations, compared to a deep-learning baseline approach. |
Tasks | Activity Recognition, Intent Classification, Transfer Learning |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.10134v1 |
https://arxiv.org/pdf/1911.10134v1.pdf | |
PWC | https://paperswithcode.com/paper/a-transfer-learning-method-for-goal |
Repo | |
Framework | |
Cross-Subject Transfer Learning in Human Activity Recognition Systems using Generative Adversarial Networks
Title | Cross-Subject Transfer Learning in Human Activity Recognition Systems using Generative Adversarial Networks |
Authors | Elnaz Soleimani, Ehsan Nazerfard |
Abstract | Application of intelligent systems, especially in smart homes and health-related topics, has been drawing more attention in the last decades. Training Human Activity Recognition (HAR) models – as a major module – requires a fair amount of labeled data. Despite training with large datasets, most of the existing models will face a dramatic performance drop when they are tested against unseen data from new users. Moreover, recording enough data for each new user is unviable due to the limitations and challenges of working with human users. Transfer learning techniques aim to transfer the knowledge which has been learned from the source domain (subject) to the target domain in order to decrease the models’ performance loss in the target domain. This paper presents a novel method of adversarial knowledge transfer named SA-GAN, short for Subject Adaptor GAN, which utilizes the Generative Adversarial Network framework to perform cross-subject transfer learning in the domain of wearable sensor-based Human Activity Recognition. SA-GAN outperformed other state-of-the-art methods in more than 66% of experiments and showed the second-best performance in the remaining 25% of experiments. In some cases, it reached up to 90% of the accuracy that can be obtained by supervised training over the same domain data. |
Tasks | Activity Recognition, Human Activity Recognition, Transfer Learning |
Published | 2019-03-29 |
URL | http://arxiv.org/abs/1903.12489v1 |
http://arxiv.org/pdf/1903.12489v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-subject-transfer-learning-in-human |
Repo | |
Framework | |
Interactive Trajectory Adaptation through Force-guided Bayesian Optimization
Title | Interactive Trajectory Adaptation through Force-guided Bayesian Optimization |
Authors | Leonel Rozo |
Abstract | Flexible manufacturing processes demand robots to easily adapt to changes in the environment and interact with humans. In such dynamic scenarios, robotic tasks may be programmed through learning-from-demonstration (LfD) approaches, where a nominal plan of the task is learned by the robot. However, the learned plan may need to be adapted in order to fulfill additional requirements or overcome unexpected environment changes. When the required adaptation occurs at the end-effector trajectory level, a human operator may want to intuitively show the robot the desired changes by physically interacting with it. In this scenario, the robot needs to understand the human intended changes from noisy haptic data, quickly adapt accordingly and execute the nominal task plan when no further adaptation is needed. This paper addresses the aforementioned challenges by leveraging LfD and Bayesian optimization to endow the robot with data-efficient adaptation capabilities. Our approach exploits the sensed interaction forces to guide the robot adaptation, and speeds up the optimization process by defining local search spaces extracted from the learned task model. We show how our framework quickly adapts the learned spatial-temporal patterns of the task, leading to deformed trajectory distributions that are consistent with the nominal plan and the changes introduced by the human. |
Tasks | |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07263v1 |
https://arxiv.org/pdf/1908.07263v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-trajectory-adaptation-through |
Repo | |
Framework | |
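The abstract above combines an LfD task model with Bayesian optimization over local search spaces. As a generic illustration only (not the paper's implementation), the sketch below runs Gaussian-process Bayesian optimization with a lower-confidence-bound acquisition over a hypothetical 1-D local window, minimizing an invented deviation cost between an adapted via-point and the human-intended one. The cost function, window, kernel length-scale, and all other parameters are assumptions for this toy example, and NumPy is assumed to be available.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between two 1-D point sets
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    # Standard GP regression posterior mean and variance
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    Kss = rbf(x_query, x_query)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y_train
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.maximum(var, 0.0)

def bayes_opt(cost, window, n_iter=15, beta=2.0, seed=0):
    # Minimize cost over a local search window with a GP-LCB acquisition
    rng = np.random.default_rng(seed)
    lo, hi = window
    xs = list(rng.uniform(lo, hi, size=3))
    ys = [cost(x) for x in xs]
    grid = np.linspace(lo, hi, 200)
    for _ in range(n_iter):
        mu, var = gp_posterior(np.array(xs), np.array(ys), grid)
        lcb = mu - beta * np.sqrt(var)   # low mean or high uncertainty is attractive
        x_next = grid[np.argmin(lcb)]
        xs.append(x_next)
        ys.append(cost(x_next))
    best = int(np.argmin(ys))
    return xs[best], ys[best]

# Toy "deviation cost": squared distance between the adapted via-point and a
# hypothetical human-intended one at 0.37 (unknown to the optimizer)
target = 0.37
x_best, y_best = bayes_opt(lambda x: (x - target) ** 2, window=(0.0, 1.0))
print(round(float(x_best), 2))
```

The local window plays the role of the paper's learned-model-derived search space: restricting the GP to a small region is what keeps the adaptation data-efficient.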
Neural Spectrum Alignment
Title | Neural Spectrum Alignment |
Authors | Dmitry Kopitkov, Vadim Indelman |
Abstract | Expressiveness of deep models was recently addressed via the connection between neural networks (NNs) and kernel learning, where the first-order dynamics of an NN during gradient-descent (GD) optimization were related to a gradient similarity kernel, also known as the Neural Tangent Kernel (NTK). In the majority of works this kernel is considered to be time-invariant, with its properties defined entirely by the NN architecture and independent of the learning task at hand. In contrast, in this paper we empirically explore these properties along the optimization process and show that in practical applications the NN kernel changes in a very dramatic and meaningful way, with its top eigenfunctions aligning toward the target function learned by the NN. Moreover, these top eigenfunctions serve as a sort of basis for the NN output: the function represented by the NN is spanned almost completely by them for the entire optimization process. Further, since learning along the top eigenfunctions is typically fast, their alignment with the target function improves the overall optimization performance. In addition, we study how the neural spectrum is affected by learning rate decay, as typically done by practitioners, showing various trends in the kernel behavior. We argue that the presented phenomena may lead to a more complete theoretical understanding of NN learning. |
Tasks | |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08720v2 |
https://arxiv.org/pdf/1910.08720v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-spectrum-alignment |
Repo | |
Framework | |
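The gradient similarity kernel the abstract refers to can be probed empirically: for a network f with parameters θ, the kernel entry is K(x, x') = ∇θ f(x) · ∇θ f(x'). The sketch below is my own toy setup (not the authors' code, and NumPy is an assumed dependency): it builds the Gram matrix for a tiny one-hidden-layer tanh network at initialization and measures how much of a target function lies along the top kernel eigenvector, the alignment quantity the paper tracks during training.

```python
import numpy as np

rng = np.random.default_rng(0)
n, h = 8, 16                       # number of samples, hidden width
X = np.linspace(-1, 1, n)
W1 = rng.normal(size=h)
b1 = rng.normal(size=h)
w2 = rng.normal(size=h) / np.sqrt(h)

def grads(x):
    """Per-sample gradient of f(x) = w2 . tanh(W1*x + b1) w.r.t. all parameters."""
    a = np.tanh(W1 * x + b1)
    da = 1.0 - a ** 2
    gW1 = w2 * da * x              # d f / d W1
    gb1 = w2 * da                  # d f / d b1
    gw2 = a                        # d f / d w2
    return np.concatenate([gW1, gb1, gw2])

J = np.stack([grads(x) for x in X])   # (n, n_params) Jacobian
K = J @ J.T                           # empirical NTK Gram matrix on the sample set
evals, evecs = np.linalg.eigh(K)      # eigenvalues in ascending order

# Fraction of a target function captured by the top kernel eigenvector
y = np.sin(np.pi * X)
top = evecs[:, -1]
alignment = abs(top @ y) / np.linalg.norm(y)
print(K.shape, round(float(alignment), 3))
```

Recomputing K and this alignment at several points along a training run is, in spirit, how the kernel's task-dependent evolution described in the abstract can be observed.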
Fluid segmentation in Neutrosophic domain
Title | Fluid segmentation in Neutrosophic domain |
Authors | Elyas Rashno, Abdolreza Rashno, Sadegh Fadaei |
Abstract | Optical coherence tomography (OCT), as a retinal imaging technology, is currently used by ophthalmologists as a non-invasive and non-contact method for the diagnosis of age-related macular degeneration (AMD) and diabetic macular edema (DME). Fluid regions in OCT images reveal the main signs of AMD and DME. In this paper, an efficient and fast clustering method in the neutrosophic (NS) domain, referred to as neutrosophic C-means (NCM), is adapted for fluid segmentation. For this task, an NCM cost function in the NS domain is adapted for fluid segmentation and then optimized by gradient descent methods, which leads to a binary segmentation of OCT B-scans into fluid and tissue regions. The proposed method is evaluated on OCT datasets of subjects with DME abnormalities. Results showed that the proposed method outperforms existing fluid segmentation methods by 6% in the dice coefficient and sensitivity criteria. |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.11540v1 |
https://arxiv.org/pdf/1912.11540v1.pdf | |
PWC | https://paperswithcode.com/paper/fluid-segmentation-in-neutrosophic-domain |
Repo | |
Framework | |
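Neutrosophic C-means extends classical fuzzy C-means with an additional indeterminacy term; the paper's NS-domain cost function is not reproduced here. As a hedged baseline sketch, the following shows plain fuzzy C-means (the starting point NCM builds on) separating invented dark "fluid" from bright "tissue" intensities; the toy data and all parameters are assumptions, and NumPy is assumed available.

```python
import numpy as np

def fuzzy_c_means(x, c=2, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy C-means on 1-D intensities; NCM adds an indeterminacy channel
    on top of this membership/center alternation."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), c))
    u /= u.sum(1, keepdims=True)              # random initial memberships
    for _ in range(n_iter):
        um = u ** m
        centers = (um * x[:, None]).sum(0) / um.sum(0)
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        u = 1.0 / (d ** (2.0 / (m - 1.0)))    # standard FCM membership update
        u /= u.sum(1, keepdims=True)
    return centers, u

# Toy "B-scan" intensities: dark fluid (~0.1) versus bright tissue (~0.8)
pix = np.concatenate([np.full(50, 0.1), np.full(50, 0.8)])
pix = pix + np.random.default_rng(1).normal(0.0, 0.02, size=100)
centers, u = fuzzy_c_means(pix)
labels = u.argmax(1)
print(np.sort(np.round(centers, 1)))
```

The gradient-based optimization of a cost of this shape, but with an extra term for indeterminate (boundary) pixels, is what the abstract describes for the NS domain.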
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features
Title | Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features |
Authors | Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen |
Abstract | Deep clustering (DC) and utterance-level permutation invariant training (uPIT) have been demonstrated to be promising for speaker-independent speech separation. DC is usually formulated as a two-step process: embedding learning and embedding clustering, which results in complex separation pipelines and a huge obstacle to directly optimizing the actual separation objectives. As for uPIT, it only minimizes the chosen permutation with the lowest mean square error and does not discriminate it from the other permutations. In this paper, we propose a discriminative learning method for speaker-independent speech separation using deep embedding features. Firstly, a DC network is trained to extract deep embedding features, which contain each source’s information and have an advantage in discriminating each target speaker. Then these features are used as the input for uPIT to directly separate the different sources. Finally, uPIT and DC are jointly trained, which directly optimizes the actual separation objectives. Moreover, in order to maximize the distance between permutations, discriminative learning is applied to fine-tune the whole model. Our experiments are conducted on the WSJ0-2mix dataset. Experimental results show that the proposed models achieve better performance than DC and uPIT for speaker-independent speech separation. |
Tasks | Speech Separation |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09884v1 |
https://arxiv.org/pdf/1907.09884v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-learning-for-monaural-speech |
Repo | |
Framework | |
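The uPIT objective the abstract builds on can be stated compactly: compute the MSE under every speaker permutation and keep the minimum. A minimal sketch with invented toy arrays (not the authors' code; the full method adds embedding extraction and a discriminative margin term on top of this):

```python
import itertools
import numpy as np

def upit_loss(est, ref):
    """Utterance-level PIT: MSE under the best assignment of estimated
    sources to reference sources."""
    S = est.shape[0]
    losses = []
    for perm in itertools.permutations(range(S)):
        mse = float(np.mean((est[list(perm)] - ref) ** 2))
        losses.append((mse, perm))
    best_mse, best_perm = min(losses)   # the "chosen" permutation uPIT minimizes
    return best_mse, best_perm

# Two estimated sources that come out in swapped order relative to the references
ref = np.array([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])
est = np.array([[0.1, 0.0, 0.0], [0.9, 1.0, 1.0]])
loss, perm = upit_loss(est, ref)
print(perm, round(loss, 3))
```

The abstract's criticism is visible here: only `best_mse` enters the loss, so nothing pushes the non-chosen permutations' errors further away, which is what the proposed discriminative fine-tuning addresses.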
Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction
Title | Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction |
Authors | Deli Chen, Shuming Ma, Keiko Harimoto, Ruihan Bao, Qi Su, Xu Sun |
Abstract | Incorporating related text information has proven successful in stock market prediction. However, it is a huge challenge to utilize texts in the enormous forex (foreign currency exchange) market because the associated texts are too redundant. In this work, we propose a BERT-based Hierarchical Aggregation Model to summarize a large amount of finance news to predict forex movement. We first group news from different aspects: time, topic, and category. Then we extract the most crucial news in each group using a state-of-the-art extractive summarization method. Finally, we model the interaction between the news and the trade data with attention to predict the forex movement. The experimental results show that the category-based method performs best among the three grouping methods and outperforms all the baselines. Besides, we study the influence of essential news attributes (category and region) by statistical analysis and summarize the influence patterns for different currency pairs. |
Tasks | Stock Market Prediction |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05032v1 |
https://arxiv.org/pdf/1910.05032v1.pdf | |
PWC | https://paperswithcode.com/paper/group-extract-and-aggregate-summarizing-a |
Repo | |
Framework | |
Enhanced Center Coding for Cell Detection with Convolutional Neural Networks
Title | Enhanced Center Coding for Cell Detection with Convolutional Neural Networks |
Authors | Haoyi Liang, Aijaz Naik, Cedric L. Williams, Jaideep Kapur, Daniel S. Weller |
Abstract | Cell imaging and analysis are fundamental to biomedical research because cells are the basic functional units of life. Among different cell-related analyses, cell counting and detection are widely used. In this paper, we focus on one common step of learning-based cell counting approaches: coding the raw dot labels into more suitable maps for learning. Two criteria for coding raw dot labels are discussed, and a new coding scheme is proposed in this paper. The two criteria measure how easy it is to train the model with a coding scheme, and how robust the recovered raw dot labels are when predicting. The most compelling advantage of the proposed coding scheme is the ability to distinguish neighboring cells in crowded regions. Cell counting and detection experiments are conducted for five coding schemes on four types of cells and two network architectures. The proposed coding scheme improves counting accuracy by up to 12% over the widely used Gaussian and rectangle kernels, and also improves detection accuracy by up to 14% over the common proximity coding. |
Tasks | |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08864v1 |
http://arxiv.org/pdf/1904.08864v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-center-coding-for-cell-detection |
Repo | |
Framework | |
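The "coding" step the abstract discusses turns sparse dot labels into a dense training map. Below is a sketch of the common Gaussian baseline scheme (explicitly not the paper's proposed enhanced center coding; the grid size, dot positions, and sigmas are invented, and NumPy is assumed). It also demonstrates the crowding failure the paper targets: with a wider kernel, two nearby cell centers merge into a single blob.

```python
import numpy as np

def gaussian_coding(shape, dots, sigma):
    """Code raw dot labels as a sum of Gaussian kernels (a standard baseline)."""
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W]
    out = np.zeros(shape)
    for (r, c) in dots:
        out += np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2.0 * sigma ** 2))
    return out

dots = [(8, 8), (8, 12)]           # two nearby cell centers, 4 pixels apart
for sigma in (1.0, 2.0):
    m = gaussian_coding((16, 16), dots, sigma)
    # Are the two centers still distinct peaks, i.e. higher than the midpoint?
    print(sigma, bool(m[8, 8] > m[8, 10]))
```

With sigma 1.0 the centers stay separable; with sigma 2.0 the midpoint value exceeds the center values, so the recovered dot labels would fuse, which is the neighboring-cell problem the proposed scheme is designed to avoid.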
Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition
Title | Acoustic Model Optimization Based On Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition |
Authors | Xiaodong Cui, Michael Picheny |
Abstract | Evolutionary stochastic gradient descent (ESGD) was proposed as a population-based approach that combines the merits of gradient-aware and gradient-free optimization algorithms for superior overall optimization performance. In this paper we investigate a variant of ESGD for the optimization of acoustic models for automatic speech recognition (ASR). In this variant, we assume the existence of a well-trained acoustic model and use it as an anchor in the parent population whose good “gene” will propagate in the evolution to the offspring. We propose an ESGD algorithm leveraging the anchor models such that it guarantees that the best fitness of the population will never degrade below that of the anchor model. Experiments on 50-hour Broadcast News (BN50) and 300-hour Switchboard (SWB300) show that ESGD with anchors can further improve the loss and ASR performance over existing well-trained acoustic models. |
Tasks | Speech Recognition |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04882v1 |
https://arxiv.org/pdf/1907.04882v1.pdf | |
PWC | https://paperswithcode.com/paper/acoustic-model-optimization-based-on |
Repo | |
Framework | |
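The anchor mechanism the abstract describes amounts to elitism: the well-trained model is always re-inserted into the selection pool, so the population's best fitness can never fall below the anchor's. A toy sketch under invented assumptions (a quadratic stand-in for the acoustic-model loss, random per-individual learning rates, NumPy assumed; none of this is the authors' actual training setup):

```python
import numpy as np

def loss(w):
    # Toy quadratic stand-in for an acoustic-model loss, minimized at w = 3
    return float(np.sum((w - 3.0) ** 2))

def grad(w):
    return 2.0 * (w - 3.0)

rng = np.random.default_rng(0)
anchor = np.full(4, 2.9)            # a "well-trained" model near the optimum
population = [anchor] + [rng.normal(size=4) for _ in range(5)]

for gen in range(10):
    # SGD phase: each individual takes a gradient step with its own random
    # learning rate (the gradient-aware half of ESGD)
    stepped = [w - rng.uniform(0.01, 0.2) * grad(w) for w in population]
    # Evolution phase with anchoring: the anchor always re-enters the pool,
    # and the fittest survivors are kept (elitist selection)
    candidates = stepped + [anchor]
    candidates.sort(key=loss)
    population = candidates[:6]

print(loss(population[0]) <= loss(anchor))   # prints True by construction
```

Because the unmodified anchor is a candidate in every generation and selection is by ascending loss, the guarantee stated in the abstract holds trivially in this sketch.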
Idealize - A Notion of Idea Strength
Title | Idealize - A Notion of Idea Strength |
Authors | Rui Portocarrero Sarmento |
Abstract | Business entrepreneurs frequently thrive on finding ways to test business ideas without giving away too much information. Recent techniques in startup development promote the use of surveys to measure potential clients’ interest. In this preliminary report, we describe the concept behind Idealize, a Shiny R application that measures the local trend strength of a potential idea. Additionally, the system can provide the relative distance to the capital city of the country. The tests were made for the United States of America, i.e., the application is available in English. This report shows some of the test results obtained with this system. |
Tasks | |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03401v1 |
http://arxiv.org/pdf/1904.03401v1.pdf | |
PWC | https://paperswithcode.com/paper/idealize-a-notion-of-idea-strength |
Repo | |
Framework | |
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
Title | Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI |
Authors | Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, Francisco Herrera |
Abstract | In recent years, Artificial Intelligence (AI) has achieved notable momentum that may deliver the best of expectations over many application sectors across the field. For this to occur, the entire community stands in front of the barrier of explainability, an inherent problem of AI techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that was not present in the last hype of AI. Paradigms underlying this problem fall within the so-called eXplainable AI (XAI) field, which is acknowledged as a crucial feature for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, including a prospect toward what is yet to be reached. We summarize previous efforts to define explainability in Machine Learning, establishing a novel definition that covers prior conceptual propositions with a major focus on the audience for which explainability is sought. We then propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods, for which a second taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to XAI with reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10045v2 |
https://arxiv.org/pdf/1910.10045v2.pdf | |
PWC | https://paperswithcode.com/paper/explainable-artificial-intelligence-xai |
Repo | |
Framework | |
Segmentation of Levator Hiatus Using Multi-Scale Local Region Active contours and Boundary Shape Similarity Constraint
Title | Segmentation of Levator Hiatus Using Multi-Scale Local Region Active contours and Boundary Shape Similarity Constraint |
Authors | Xinling Zhang, Xu Li, Ying Chen, Yixin Gan, Dexing Kong, Rongqin Zheng |
Abstract | In this paper, a multi-scale framework with a local region based active contour and a boundary shape similarity constraint is proposed for the segmentation of the levator hiatus in ultrasound images. In order to get more precise initializations and reduce the computational cost, a Gaussian pyramid method is used to decompose the image into coarse-to-fine scales. A localized region active contour model is first performed on the coarsest-scale image to get a rough contour of the levator hiatus; then the segmentation result on the coarse scale is interpolated into the finer-scale image as the initialization. The boundary shape similarity between different scales is incorporated into the local region based active contour model so that the result from the coarse scale can guide the contour evolution at the finer scale. By incorporating the multi-scale strategy and the boundary shape similarity, the proposed method can precisely locate the levator hiatus boundaries despite various ultrasound image artifacts. With a data set of 90 levator hiatus ultrasound images, the efficiency and accuracy of the proposed method are validated by quantitative and qualitative evaluations (TP, FP, Js) and comparison with two other state-of-the-art active contour segmentation methods (the C-V model and the DRLSE model). |
Tasks | |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03472v1 |
http://arxiv.org/pdf/1901.03472v1.pdf | |
PWC | https://paperswithcode.com/paper/segmentation-of-levator-hiatus-using-multi |
Repo | |
Framework | |
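The coarse-to-fine decomposition the abstract relies on is a standard Gaussian pyramid: blur, then subsample by two at each level, so the contour found at a coarse level can seed the next finer one. A minimal sketch using the classic 5-tap binomial kernel (the image is random toy data; the paper's actual pyramid parameters are not specified here, and NumPy is assumed):

```python
import numpy as np

def gaussian_pyramid(img, levels=3):
    """Coarse-to-fine pyramid: separable blur with a 5-tap binomial kernel,
    then subsample each axis by 2."""
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    blur_1d = lambda v: np.convolve(np.pad(v, 2, mode="reflect"), kernel, "valid")
    pyr = [img]
    for _ in range(levels - 1):
        x = pyr[-1]
        x = np.apply_along_axis(blur_1d, 1, x)   # blur rows
        x = np.apply_along_axis(blur_1d, 0, x)   # blur columns
        pyr.append(x[::2, ::2])                  # subsample by 2
    return pyr

img = np.random.default_rng(0).random((64, 64))
for level in gaussian_pyramid(img):
    print(level.shape)
```

Running the active contour first on the 16x16 level and interpolating its result up mirrors the initialization strategy described in the abstract.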
New Item Consumption Prediction Using Deep Learning
Title | New Item Consumption Prediction Using Deep Learning |
Authors | Michael Shekasta, Gilad Katz, Asnat Greenstein-Messica, Lior Rokach, Bracha Shapira |
Abstract | Recommendation systems have become ubiquitous in today’s online world and are an integral part of practically every e-commerce platform. While traditional recommender systems use customer history, this approach is not feasible in ‘cold start’ scenarios. Such scenarios include the need to produce recommendations for new or unregistered users and the introduction of new items. In this study, we present the Purchase Intent Session-bAsed (PISA) algorithm, a content-based algorithm for predicting the purchase intent for cold start session-based scenarios. Our approach employs deep learning techniques both for modeling the content and purchase intent prediction. Our experiments show that PISA outperforms a well-known deep learning baseline when new items are introduced. In addition, while content-based approaches often fail to perform well in highly imbalanced datasets, our approach successfully handles such cases. Finally, our experiments show that combining PISA with the baseline in non-cold start scenarios further improves performance. |
Tasks | Recommendation Systems |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01686v2 |
https://arxiv.org/pdf/1905.01686v2.pdf | |
PWC | https://paperswithcode.com/paper/new-item-consumption-prediction-using-deep |
Repo | |
Framework | |
Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics
Title | Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics |
Authors | David Schlangen |
Abstract | Propelling, and propelled by, the “deep learning revolution”, recent years have seen the introduction of ever larger corpora of images annotated with natural language expressions. We survey some of these corpora, taking a perspective that reverses the usual directionality, as it were, by viewing the images as semantic annotation of the natural language expressions. We discuss datasets that can be derived from the corpora, and tasks of potential interest for computational semanticists that can be defined on those. In this, we make use of relations provided by the corpora (namely, the link between expression and image, and that between two expressions linked to the same image) and relations that we can add (similarity relations between expressions, or between images). Specifically, we show that in this way we can create data that can be used to learn and evaluate lexical and compositional grounded semantics, and we show that the “linked to same image” relation tracks a semantic implication relation that is recognisable to annotators even in the absence of the linking image as evidence. Finally, as an example of possible benefits of this approach, we show that an exemplar-model-based approach to implication beats a (simple) distributional space-based one on some derived datasets, while lending itself to explainability. |
Tasks | |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07318v1 |
http://arxiv.org/pdf/1904.07318v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-language-semantics-with-pictures-some |
Repo | |
Framework | |
Through-Wall Pose Imaging in Real-Time with a Many-to-Many Encoder/Decoder Paradigm
Title | Through-Wall Pose Imaging in Real-Time with a Many-to-Many Encoder/Decoder Paradigm |
Authors | Kevin Meng, Yu Meng |
Abstract | Overcoming the visual barrier and developing “see-through vision” has been one of mankind’s long-standing dreams. Unlike visible light, Radio Frequency (RF) signals penetrate opaque obstructions and reflect highly off humans. This paper establishes a deep-learning model that can be trained to reconstruct continuous video of a 15-point human skeleton even through visual occlusion. The training process adopts a student/teacher learning procedure inspired by the Feynman learning technique, in which video frames and RF data are first collected simultaneously using a co-located setup containing an optical camera and an RF antenna array transceiver. Next, the video frames are processed with a computer-vision-based gait analysis “teacher” module to generate ground-truth human skeletons for each frame. Then, the same type of skeleton is predicted from corresponding RF data using a “student” deep-learning model consisting of a Residual Convolutional Neural Network (CNN), Region Proposal Network (RPN), and Recurrent Neural Network with Long-Short Term Memory (LSTM) that 1) extracts spatial features from RF images, 2) detects all people present in a scene, and 3) aggregates information over many time-steps, respectively. The model is shown to both accurately and completely predict the pose of humans behind visual obstruction solely using RF signals. Primary academic contributions include the novel many-to-many imaging methodology, unique integration of RPN and LSTM networks, and original training pipeline. |
Tasks | |
Published | 2019-03-15 |
URL | https://arxiv.org/abs/1904.00739v2 |
https://arxiv.org/pdf/1904.00739v2.pdf | |
PWC | https://paperswithcode.com/paper/the-sixth-sense-with-artificial-intelligence |
Repo | |
Framework | |