October 17, 2019

3353 words 16 mins read

Paper Group ANR 805

2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning

Title 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning
Authors Diogo C. Luvizon, David Picard, Hedi Tabia
Abstract Action recognition and human pose estimation are closely related, but the two problems are generally handled as distinct tasks in the literature. In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences. We show that a single architecture can be used to solve the two problems efficiently and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning. The proposed architecture can be trained seamlessly with data from different categories simultaneously. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.
Tasks 3D Pose Estimation, Action Recognition In Videos, Pose Estimation, Temporal Action Localization
Published 2018-02-26
URL http://arxiv.org/abs/1802.09232v2
PDF http://arxiv.org/pdf/1802.09232v2.pdf
PWC https://paperswithcode.com/paper/2d3d-pose-estimation-and-action-recognition
Repo
Framework
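
For a sense of the multitask setup described above, here is a minimal sketch of a shared backbone feeding separate pose and action heads, trained end to end. The module sizes, joint and action counts, and the temporal pooling scheme are illustrative placeholders, not the architecture from Luvizon et al.

```python
# Minimal sketch of the multitask idea: one shared per-frame encoder, a pose
# head applied per frame and an action head applied to the pooled clip.
# All sizes below are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class PoseActionNet(nn.Module):
    def __init__(self, num_joints=16, num_actions=60, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(            # shared per-frame encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.pose_head = nn.Linear(feat_dim, num_joints * 3)   # 3D joint coordinates
        self.action_head = nn.Linear(feat_dim, num_actions)    # clip-level label

    def forward(self, clip):                      # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.backbone(clip.flatten(0, 1)) # (B*T, feat_dim)
        poses = self.pose_head(feats).view(b, t, -1, 3)
        actions = self.action_head(feats.view(b, t, -1).mean(dim=1))
        return poses, actions

model = PoseActionNet()
poses, actions = model(torch.randn(2, 8, 3, 64, 64))
print(poses.shape, actions.shape)  # torch.Size([2, 8, 16, 3]) torch.Size([2, 60])
```

Because both heads share the backbone, a single optimizer step updates the pose and action losses jointly, which is the end-to-end training the abstract contrasts with separate learning.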

Multi-optional Many-sorted Past Present Future structures and its description

Title Multi-optional Many-sorted Past Present Future structures and its description
Authors Sergio Miguel Tomé
Abstract The cognitive theory of true conditions (CTTC) is a proposal to describe the model-theoretic semantics of symbolic cognitive architectures and to design the implementation of cognitive abilities. The CTTC is formulated mathematically using the multi-optional many-sorted past present future (MMPPF) structures. This article mathematically defines the MMPPF structures and the formal languages that the CTTC proposes to describe them.
Tasks
Published 2018-01-24
URL http://arxiv.org/abs/1801.08212v1
PDF http://arxiv.org/pdf/1801.08212v1.pdf
PWC https://paperswithcode.com/paper/multi-optional-many-sorted-past-present
Repo
Framework

A Method for Analysis of Patient Speech in Dialogue for Dementia Detection

Title A Method for Analysis of Patient Speech in Dialogue for Dementia Detection
Authors Saturnino Luz, Sofia de la Fuente, Pierre Albert
Abstract We present an approach to automatic detection of Alzheimer’s type dementia based on characteristics of spontaneous spoken language dialogue consisting of interviews recorded in natural settings. The proposed method employs additive logistic regression (a machine learning boosting method) on content-free features extracted from dialogical interaction to build a predictive model. The model training data consisted of 21 dialogues between patients with Alzheimer’s and interviewers, and 17 dialogues between patients with other health conditions and interviewers. Features analysed included speech rate, turn-taking patterns and other speech parameters. Despite relying solely on content-free features, our method obtains an overall accuracy of 86.5%, a result comparable to those of state-of-the-art methods that employ more complex lexical, syntactic and semantic features. While further investigation is needed, the fact that we were able to obtain promising results using only features that can be easily extracted from spontaneous dialogues suggests the possibility of designing non-invasive and low-cost mental health monitoring tools for use at scale.
Tasks
Published 2018-11-25
URL http://arxiv.org/abs/1811.09919v1
PDF http://arxiv.org/pdf/1811.09919v1.pdf
PWC https://paperswithcode.com/paper/a-method-for-analysis-of-patient-speech-in
Repo
Framework
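
As a rough illustration of the boosting setup above, the sketch below fits an additive logistic model on content-free dialogue features, with scikit-learn's GradientBoostingClassifier (logistic loss) standing in for LogitBoost. The feature columns and the toy data are invented for illustration; only the 21/17 group sizes come from the abstract.

```python
# Hedged sketch: boosted additive logistic model on content-free dialogue
# features. GradientBoostingClassifier stands in for additive logistic
# regression; the feature names and toy data are illustrative, not the study's.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Columns: speech rate, mean turn length, pause ratio, interviewer turn share.
X = rng.normal(size=(38, 4))          # 38 dialogues, matching the study's sample size
y = np.array([1] * 21 + [0] * 17)     # 1 = Alzheimer's group, 0 = comparison group

clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=2)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"cross-validated accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```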

Automatic Documentation of ICD Codes with Far-Field Speech Recognition

Title Automatic Documentation of ICD Codes with Far-Field Speech Recognition
Authors Albert Haque, Corinna Fukushima
Abstract Documentation errors increase healthcare costs and cause unnecessary patient deaths. As the standard language for diagnoses and billing, ICD codes serve as the foundation for medical documentation worldwide. Despite the prevalence of electronic medical records, hospitals still witness high levels of ICD miscoding. In this paper, we propose to automatically document ICD codes with far-field speech recognition. Far-field speech occurs when the microphone is located several meters from the source, as is common with smart homes and security systems. Our method combines acoustic signal processing with recurrent neural networks to recognize and document ICD codes in real time. To evaluate our model, we collected a far-field speech dataset of ICD-10 codes and found our model to achieve 87% accuracy with a BLEU score of 85%. By sampling from an unsupervised medical language model, our method is able to outperform existing methods. Overall, this work shows the potential of automatic speech recognition to provide efficient, accurate, and cost-effective healthcare documentation.
Tasks Language Modelling, Speech Recognition
Published 2018-04-30
URL http://arxiv.org/abs/1804.11046v4
PDF http://arxiv.org/pdf/1804.11046v4.pdf
PWC https://paperswithcode.com/paper/automatic-documentation-of-icd-codes-with-far
Repo
Framework
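
A minimal sketch of the recurrent recognition idea described above: acoustic feature frames are fed to a GRU whose final state is classified into an ICD code. The feature dimension, layer sizes and code vocabulary are placeholders, not the authors' model.

```python
# Minimal sketch of mapping acoustic feature frames to an ICD code with a
# recurrent network; feature dimension, code set and layer sizes are placeholders.
import torch
import torch.nn as nn

class ICDRecognizer(nn.Module):
    def __init__(self, n_mels=40, hidden=128, n_codes=100):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, n_codes)

    def forward(self, frames):            # frames: (batch, time, n_mels)
        _, h = self.rnn(frames)           # h: (num_layers, batch, hidden)
        return self.out(h[-1])            # logits over ICD codes

model = ICDRecognizer()
logits = model(torch.randn(4, 200, 40))  # 4 utterances, 200 feature frames each
print(logits.shape)                      # torch.Size([4, 100])
```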

Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing

Title Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing
Authors V. V. Vesselinov, M. K. Mudunuru, S. Karra, D. O’Malley, B. S. Alexandrov
Abstract Analysis of reactive-diffusion simulations requires a large number of independent model runs. For each high-fidelity simulation, inputs are varied and the predicted mixing behavior is represented by changes in species concentration. It is then necessary to discern how the model inputs impact the mixing process. This task is challenging and typically involves interpretation of large model outputs. However, it can be automated and substantially simplified by applying Machine Learning (ML) methods. In this paper, we present an application of an unsupervised ML method (called NTFk) that uses Non-negative Tensor Factorization (NTF) coupled with a custom clustering procedure based on k-means to reveal hidden features in product concentration. An attractive aspect of the proposed ML method is that it ensures the extracted features are non-negative, which is important for obtaining a meaningful deconstruction of the mixing processes. The ML method is applied to a large set of high-resolution FEM simulations representing reaction-diffusion processes in perturbed vortex-based velocity fields. The applied FEM ensures that the species concentrations are always non-negative. The simulated reaction is a fast irreversible bimolecular reaction. The reactive-diffusion model input parameters that control mixing include properties of the velocity field, anisotropic dispersion, and molecular diffusion. We demonstrate the applicability of the ML method to produce a meaningful deconstruction of model outputs that discriminates between different physical processes impacting the reactants, their mixing, and the spatial distribution of the product. The presented ML analysis allowed us to identify additive features that characterize mixing behavior.
Tasks
Published 2018-05-16
URL http://arxiv.org/abs/1805.06454v2
PDF http://arxiv.org/pdf/1805.06454v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-machine-learning-based-on-non
Repo
Framework
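
The two-stage NTF-plus-clustering idea can be sketched with off-the-shelf tools: TensorLy's non-negative PARAFAC followed by k-means over the simulation-mode factors. The tensor shape and rank below are illustrative, and this is not the authors' NTFk implementation.

```python
# Two-stage sketch: non-negative PARAFAC of a (simulation x space x time)
# concentration tensor, then k-means over the simulation-mode factors.
# Uses TensorLy and scikit-learn; shapes and rank are illustrative.
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_parafac
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
data = np.abs(rng.normal(size=(20, 50, 30)))   # 20 runs, 50 grid cells, 30 time steps

weights, factors = non_negative_parafac(tl.tensor(data), rank=3, n_iter_max=200)
run_factors, space_factors, time_factors = factors   # all entries >= 0 by construction

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(run_factors)
print(labels)   # groups of simulations sharing similar hidden mixing features
```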

An Integration of Bottom-up and Top-Down Salient Cues on RGB-D Data: Saliency from Objectness vs. Non-Objectness

Title An Integration of Bottom-up and Top-Down Salient Cues on RGB-D Data: Saliency from Objectness vs. Non-Objectness
Authors Nevrez Imamoglu, Wataru Shimoda, Chi Zhang, Yuming Fang, Asako Kanezaki, Keiji Yanai, Yoshifumi Nishida
Abstract Bottom-up and top-down visual cues are two types of information that help visual saliency models. These salient cues can come from spatial distributions of features (space-based saliency) or from contextual / task-dependent features (object-based saliency). Saliency models generally incorporate salient cues either in a bottom-up or a top-down manner, but not both. In this work, we combine bottom-up and top-down cues from both space-based and object-based salient features on RGB-D data. In addition, we investigate the ability of various pre-trained convolutional neural networks to extract top-down saliency on color images based on object-dependent feature activations. We demonstrate that combining salient features from color and depth through bottom-up and top-down methods gives a significant improvement in salient object detection with space-based and object-based salient cues. The RGB-D saliency integration framework yields promising results compared with several state-of-the-art models.
Tasks Object Detection, Salient Object Detection
Published 2018-07-04
URL http://arxiv.org/abs/1807.01532v1
PDF http://arxiv.org/pdf/1807.01532v1.pdf
PWC https://paperswithcode.com/paper/an-integration-of-bottom-up-and-top-down
Repo
Framework
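
A toy sketch of the integration idea: bottom-up color and depth contrast maps are combined and gated by a top-down objectness map. The maps here are random stand-ins and the normalize-and-combine rule is an assumption, not the paper's fusion scheme.

```python
# Toy sketch of integrating bottom-up and top-down saliency maps on RGB-D input.
# The individual maps are random stand-ins; the simple normalize-and-combine
# fusion rule is an assumption, not the paper's integration framework.
import numpy as np

def normalize(saliency):
    s = saliency - saliency.min()
    return s / (s.max() + 1e-8)

rng = np.random.default_rng(0)
h, w = 120, 160
bottom_up_color = rng.random((h, w))   # e.g. color contrast map
bottom_up_depth = rng.random((h, w))   # e.g. depth contrast map
top_down_object = rng.random((h, w))   # e.g. CNN objectness activation map

fused = normalize(
    normalize(bottom_up_color) + normalize(bottom_up_depth)
) * normalize(top_down_object)          # top-down map gates the bottom-up cues
print(fused.shape, fused.min(), fused.max())
```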

AIQ: Measuring Intelligence of Business AI Software

Title AIQ: Measuring Intelligence of Business AI Software
Authors Moshe BenBassat
Abstract Focusing on Business AI, this article introduces the AIQ quadrant, which enables us to measure AI for business applications in a relative, comparative manner, i.e. to judge that software A has more or less intelligence than software B. Recognizing that the goal of business software is to maximize value in terms of business results, the dimensions of the quadrant are the key factors that determine the business value of AI software: Level of Output Quality (Smartness) and Level of Automation. The use of the quadrant is illustrated with several software solutions that support the real-life business challenge of field service scheduling. The role of machine learning and conversational digital assistants in increasing business value is also discussed and illustrated with a recent integration of existing intelligent digital assistants for factory-floor decision making with the new version of Google Glass. Such hands-free AI solutions elevate the AIQ level to its ultimate position.
Tasks Decision Making
Published 2018-08-10
URL http://arxiv.org/abs/1808.03454v1
PDF http://arxiv.org/pdf/1808.03454v1.pdf
PWC https://paperswithcode.com/paper/aiq-measuring-intelligence-of-business-ai
Repo
Framework

A Novel Integrated Framework for Learning both Text Detection and Recognition

Title A Novel Integrated Framework for Learning both Text Detection and Recognition
Authors Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu
Abstract In this paper, we propose a novel integrated framework for learning both text detection and recognition. In most existing methods, detection and recognition are treated as two isolated tasks and trained separately, since the parameters of the detection and recognition models differ and each model optimizes its own loss function during its individual training process. In contrast, by sharing model parameters, we merge the detection model and the recognition model into a single end-to-end trainable model and train the joint model on the two tasks simultaneously. The shared parameters not only help effectively reduce the computational load in the inference process, but also improve end-to-end text detection-recognition accuracy. In addition, we design a simpler and faster sequence learning method for the recognition network based on a succession of stacked convolutional layers without any recurrent structure; this proves feasible and dramatically improves inference speed. Extensive experiments on different datasets demonstrate that the proposed method achieves very promising results.
Tasks
Published 2018-11-21
URL http://arxiv.org/abs/1811.08611v1
PDF http://arxiv.org/pdf/1811.08611v1.pdf
PWC https://paperswithcode.com/paper/a-novel-integrated-framework-for-learning
Repo
Framework
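
The recurrence-free recognition branch mentioned above can be sketched as stacked convolutions over the width of a cropped text-line feature map, trained with CTC. The channel sizes, alphabet length and CTC readout are illustrative, not the paper's configuration.

```python
# Sketch of a recurrence-free recognition branch: stacked 1-D convolutions over
# the width of a text-line feature map, read out as a character sequence with CTC.
import torch
import torch.nn as nn

class ConvRecognizer(nn.Module):
    def __init__(self, in_ch=64, n_classes=37):     # 36 characters + CTC blank
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv1d(in_ch, 128, 3, padding=1), nn.ReLU(),
            nn.Conv1d(128, 128, 3, padding=1), nn.ReLU(),
            nn.Conv1d(128, n_classes, 1))

    def forward(self, feat):          # feat: (batch, in_ch, width)
        logits = self.convs(feat)     # (batch, n_classes, width)
        return logits.permute(2, 0, 1).log_softmax(-1)   # (width, batch, classes) for CTC

rec = ConvRecognizer()
feat = torch.randn(2, 64, 32)                 # pooled text-line features from a detector
log_probs = rec(feat)
targets = torch.randint(1, 37, (2, 8))        # two 8-character labels (blank excluded)
loss = nn.CTCLoss(blank=0)(
    log_probs, targets,
    input_lengths=torch.full((2,), 32, dtype=torch.long),
    target_lengths=torch.full((2,), 8, dtype=torch.long))
print(loss.item())
```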

Memcomputing: Leveraging memory and physics to compute efficiently

Title Memcomputing: Leveraging memory and physics to compute efficiently
Authors Massimiliano Di Ventra, Fabio L. Traversa
Abstract It is well known that physical phenomena may be of great help in computing some difficult problems efficiently. A typical example is prime factorization that may be solved in polynomial time by exploiting quantum entanglement on a quantum computer. There are, however, other types of (non-quantum) physical properties that one may leverage to compute efficiently a wide range of hard problems. In this perspective we discuss how to employ one such property, memory (time non-locality), in a novel physics-based approach to computation: Memcomputing. In particular, we focus on digital memcomputing machines (DMMs) that are scalable. DMMs can be realized with non-linear dynamical systems with memory. The latter property allows the realization of a new type of Boolean logic, one that is self-organizing. Self-organizing logic gates are “terminal-agnostic”, namely they do not distinguish between input and output terminals. When appropriately assembled to represent a given combinatorial/optimization problem, the corresponding self-organizing circuit converges to the equilibrium points that express the solutions of the problem at hand. In doing so, DMMs take advantage of the long-range order that develops during the transient dynamics. This collective dynamical behavior, reminiscent of a phase transition, or even the “edge of chaos”, is mediated by families of classical trajectories (instantons) that connect critical points of increasing stability in the system’s phase space. The topological character of the solution search renders DMMs robust against noise and structural disorder. Since DMMs are non-quantum systems described by ordinary differential equations, not only can they be built in hardware with available technology, they can also be simulated efficiently on modern classical computers. As an example, we will show the polynomial-time solution of the subset-sum problem for the worst…
Tasks Combinatorial Optimization
Published 2018-02-20
URL http://arxiv.org/abs/1802.06928v2
PDF http://arxiv.org/pdf/1802.06928v2.pdf
PWC https://paperswithcode.com/paper/memcomputing-leveraging-memory-and-physics-to
Repo
Framework

Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks

Title Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks
Authors Semih Cayci, Atilla Eryilmaz
Abstract We study the problem of serving randomly arriving and delay-sensitive traffic over a multi-channel communication system with time-varying channel states and unknown statistics. This problem deviates from the classical exploration-exploitation setting in that the design and analysis must accommodate the dynamics of packet availability and urgency as well as the cost of each channel use at the time of decision. To that end, we have developed and investigated an index-based policy, UCB-Deadline, which performs dynamic channel allocation decisions that incorporate these traffic requirements and costs. Under symmetric channel conditions, we have proved that the UCB-Deadline policy can achieve bounded regret in the likely case where the cost of using a channel is not too high to prevent all transmissions, and logarithmic regret otherwise. In this case, we show that UCB-Deadline is order-optimal. We also perform numerical investigations to validate the theoretical findings and compare the performance of UCB-Deadline to another learning algorithm that we propose based on Thompson Sampling.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.10829v1
PDF http://arxiv.org/pdf/1811.10829v1.pdf
PWC https://paperswithcode.com/paper/optimal-learning-for-dynamic-coding-in
Repo
Framework
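
A hedged sketch of what an index-based policy of this kind looks like: each channel keeps an empirical success estimate plus a UCB exploration bonus minus a usage cost, and the highest-index channel serves the waiting deadline-constrained packet. The exact index form and cost model below are assumptions, not the UCB-Deadline policy itself.

```python
# Sketch of an index-based channel-allocation step in the spirit of UCB-Deadline.
# The UCB1-style bonus and the simple cost/arrival model are assumptions.
import math
import random

def ucb_index(successes, pulls, t, cost):
    if pulls == 0:
        return float("inf")                       # force initial exploration
    mean = successes / pulls
    bonus = math.sqrt(2.0 * math.log(t) / pulls)  # standard UCB1 exploration bonus
    return mean + bonus - cost

def simulate(channel_probs, costs, horizon=5000, arrival_prob=0.7):
    k = len(channel_probs)
    successes, pulls, delivered = [0] * k, [0] * k, 0
    for t in range(1, horizon + 1):
        if random.random() > arrival_prob:        # no deadline-constrained packet now
            continue
        best = max(range(k), key=lambda c: ucb_index(successes[c], pulls[c], t, costs[c]))
        pulls[best] += 1
        if random.random() < channel_probs[best]:
            successes[best] += 1
            delivered += 1
    return delivered, pulls

random.seed(0)
print(simulate(channel_probs=[0.9, 0.6, 0.3], costs=[0.05, 0.02, 0.01]))
```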

Robust Landmark Detection for Alignment of Mouse Brain Section Images

Title Robust Landmark Detection for Alignment of Mouse Brain Section Images
Authors Yuncong Chen, David Kleinfeld, Martyn Goulding, Yoav Freund
Abstract Brightfield and fluorescent imaging of whole brain sections are fundamental tools of research in mouse brain study. As sectioning and imaging become more efficient, there is an increasing need to automate the post-processing of sections for alignment and three-dimensional visualization. There is a further need to facilitate the development of a digital atlas, i.e. a brain-wide map annotated with cell type and tract tracing data, which would allow the automatic registration of image stacks to a common coordinate system. Currently, registration of slices requires manual identification of landmarks. In this work we describe the first steps in developing a semi-automated system to construct a histology atlas of mouse brainstem that combines atlas-guided annotation, landmark-based registration and atlas generation in an iterative framework. We describe an unsupervised approach for identifying and matching region and boundary landmarks, based on modelling texture. Experiments show that the detected landmarks correspond well with brain structures, and matching is robust under distortion. These results will serve as the basis for registration and atlas building.
Tasks
Published 2018-03-09
URL http://arxiv.org/abs/1803.03420v1
PDF http://arxiv.org/pdf/1803.03420v1.pdf
PWC https://paperswithcode.com/paper/robust-landmark-detection-for-alignment-of
Repo
Framework

Semi-supervised mp-MRI Data Synthesis with StitchLayer and Auxiliary Distance Maximization

Title Semi-supervised mp-MRI Data Synthesis with StitchLayer and Auxiliary Distance Maximization
Authors Zhiwei Wang, Yi Lin, Kwang-Ting Cheng, Xin Yang
Abstract In this paper, we address the problem of synthesizing multi-parameter magnetic resonance imaging (mp-MRI) data, i.e. Apparent Diffusion Coefficient (ADC) maps and T2-weighted (T2w) images, containing clinically significant (CS) prostate cancer (PCa), via semi-supervised adversarial learning. Specifically, our synthesizer generates mp-MRI data in a sequential manner: first generating ADC maps from 128-d latent vectors, then translating them to T2w images. The synthesizer is trained in a semi-supervised manner. In the supervised training process, a limited amount of paired ADC-T2w images and the corresponding ADC encodings are provided, and the synthesizer learns the paired relationship by explicitly minimizing the reconstruction losses between synthetic and real images. To avoid overfitting the limited ADC encodings, an unlimited amount of random latent vectors and unpaired ADC-T2w images are utilized in the unsupervised training process for learning the marginal image distributions of real images. To improve the robustness of synthesis, we decompose the difficult task of generating full-size images into several simpler tasks which generate sub-images only. A StitchLayer is then employed to fuse the sub-images together in an interlaced manner into a full-size image. To enforce that the synthetic images indeed contain distinguishable CS PCa lesions, we propose to also maximize an auxiliary Jensen-Shannon divergence (JSD) distance between CS and non-CS images. Experimental results show that our method can effectively synthesize a large variety of mp-MRI images which contain meaningful CS PCa lesions, display good visual quality and have the correct paired relationship. Compared to state-of-the-art synthesis methods, our method achieves a significant improvement in terms of both visual and quantitative evaluation metrics.
Tasks Synthesizing Multi-Parameter Magnetic Resonance Imaging (Mp-Mri) Data
Published 2018-12-17
URL http://arxiv.org/abs/1812.06625v1
PDF http://arxiv.org/pdf/1812.06625v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-mp-mri-data-synthesis-with
Repo
Framework
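
One way to read the StitchLayer's interlaced fusion is as a pixel-shuffle weave: neighbouring output pixels are drawn from different sub-images. The sketch below shows that interpretation; it is an assumption about the layer, not the authors' code.

```python
# Sketch of the interlacing idea behind a StitchLayer: four half-resolution
# sub-images are woven into one full-size image so that each 2x2 output block
# draws one pixel from each sub-image. Implemented with pixel_shuffle as an
# interpretation of the stitch, not the paper's implementation.
import torch
import torch.nn.functional as F

def stitch(sub_images):
    # sub_images: (batch, 4, H, W) -> full image: (batch, 1, 2H, 2W)
    return F.pixel_shuffle(sub_images, upscale_factor=2)

subs = torch.arange(4 * 4 * 4, dtype=torch.float32).view(1, 4, 4, 4)
full = stitch(subs)
print(full.shape)          # torch.Size([1, 1, 8, 8])
print(full[0, 0, :2, :2])  # 2x2 block with one pixel from each of the four sub-images
```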

Graph reduction with spectral and cut guarantees

Title Graph reduction with spectral and cut guarantees
Authors Andreas Loukas
Abstract Can one reduce the size of a graph without significantly altering its basic properties? The graph reduction problem is hereby approached from the perspective of restricted spectral approximation, a modification of the spectral similarity measure used for graph sparsification. This choice is motivated by the observation that restricted approximation carries strong spectral and cut guarantees, and that it implies approximation results for unsupervised learning problems relying on spectral embeddings. The paper then focuses on coarsening—the most common type of graph reduction. Sufficient conditions are derived for a small graph to approximate a larger one in the sense of restricted similarity. These findings give rise to nearly-linear algorithms that, compared to both standard and advanced graph reduction methods, find coarse graphs of improved quality, often by a large margin, without sacrificing speed.
Tasks
Published 2018-08-31
URL http://arxiv.org/abs/1808.10650v2
PDF http://arxiv.org/pdf/1808.10650v2.pdf
PWC https://paperswithcode.com/paper/graph-reduction-with-spectral-and-cut
Repo
Framework
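
A toy sketch of the coarsening-and-spectrum comparison: contract node pairs into coarse nodes and compare the low end of the Laplacian spectrum before and after. The pairing rule and the projected Laplacian are simplified stand-ins for restricted spectral approximation, not the paper's coarsening algorithm.

```python
# Toy sketch: coarsen a small graph by contracting consecutive node pairs and
# compare the low Laplacian eigenvalues of the original and coarse graphs.
import numpy as np

def laplacian(adj):
    return np.diag(adj.sum(axis=1)) - adj

rng = np.random.default_rng(0)
n = 8
adj = rng.integers(0, 2, size=(n, n))
adj = np.triu(adj, 1)
adj = adj + adj.T                      # symmetric 0/1 adjacency, no self-loops
L = laplacian(adj.astype(float))

# Partition matrix P: nodes 2i and 2i+1 are merged; rows are normalized so the
# coarse Laplacian P @ L @ P.T is an orthogonal projection of the original one.
P = np.zeros((n // 2, n))
for i in range(n // 2):
    P[i, 2 * i] = P[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)
L_coarse = P @ L @ P.T

eig_full = np.sort(np.linalg.eigvalsh(L))[: n // 2]
eig_coarse = np.sort(np.linalg.eigvalsh(L_coarse))
print("original low eigenvalues:", np.round(eig_full, 3))
print("coarse eigenvalues:      ", np.round(eig_coarse, 3))
```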

An algebraic-geometric approach for linear regression without correspondences

Title An algebraic-geometric approach for linear regression without correspondences
Authors Manolis C. Tsakiris, Liangzu Peng, Aldo Conca, Laurent Kneip, Yuanming Shi, Hayoung Choi
Abstract Linear regression without correspondences is the problem of performing a linear regression fit to a dataset for which the correspondences between the independent samples and the observations are unknown. Such a problem naturally arises in diverse domains such as computer vision, data mining, communications and biology. In its simplest form, it is tantamount to solving a linear system of equations, for which the entries of the right hand side vector have been permuted. This type of data corruption renders the linear regression task considerably harder, even in the absence of other corruptions, such as noise, outliers or missing entries. Existing methods are either applicable only to noiseless data or they are very sensitive to initialization or they work only for partially shuffled data. In this paper we address these issues via an algebraic geometric approach, which uses symmetric polynomials to extract permutation-invariant constraints that the parameters $\xi^* \in \Re^n$ of the linear regression model must satisfy. This naturally leads to a polynomial system of $n$ equations in $n$ unknowns, which contains $\xi^*$ in its root locus. Using the machinery of algebraic geometry we prove that as long as the independent samples are generic, this polynomial system is always consistent with at most $n!$ complex roots, regardless of any type of corruption inflicted on the observations. The algorithmic implication of this fact is that one can always solve this polynomial system and use its most suitable root as initialization to the Expectation Maximization algorithm. To the best of our knowledge, the resulting method is the first working solution for small values of $n$ able to handle thousands of fully shuffled noisy observations in milliseconds.
Tasks
Published 2018-10-12
URL https://arxiv.org/abs/1810.05440v2
PDF https://arxiv.org/pdf/1810.05440v2.pdf
PWC https://paperswithcode.com/paper/an-algebraic-geometric-approach-to-shuffled
Repo
Framework
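
The permutation-invariant constraints can be illustrated with power sums: since shuffling the observations does not change sum_i y_i^k, the true parameters satisfy sum_i (a_i^T x)^k = sum_i y_i^k for every k. The sketch below builds this n-by-n system for a small noiseless instance and solves it numerically with random restarts, a simplified stand-in for the paper's algebraic-geometric machinery.

```python
# Sketch: power-sum constraints for linear regression without correspondences.
# The true parameter vector is one of the (at most n!) roots of this system;
# the paper uses the most suitable root to initialize Expectation Maximization.
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(1)
n, m = 3, 50
A = rng.normal(size=(m, n))
x_true = rng.normal(size=n)
y = rng.permutation(A @ x_true)          # observations with unknown correspondences

def power_sum_residuals(x):
    pred = A @ x
    return [np.sum(pred ** k) - np.sum(y ** k) for k in range(1, n + 1)]

best, best_err = None, np.inf
for _ in range(20):                       # random restarts; keep the best root found
    sol = fsolve(power_sum_residuals, rng.normal(size=n))
    err = np.linalg.norm(power_sum_residuals(sol))
    if err < best_err:
        best, best_err = sol, err
print("root found:", np.round(best, 3), " true parameters:", np.round(x_true, 3))
```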

Attentive Aspect Modeling for Review-aware Recommendation

Title Attentive Aspect Modeling for Review-aware Recommendation
Authors Xinyu Guan, Zhiyong Cheng, Xiangnan He, Yongfeng Zhang, Zhibo Zhu, Qinke Peng, Tat-Seng Chua
Abstract In recent years, many studies extract aspects from user reviews and integrate them with ratings to improve recommendation performance. The common aspects mentioned in a user’s reviews and a product’s reviews indicate indirect connections between the user and the product. However, these aspect-based methods suffer from two problems. First, the common aspects are usually very sparse, which is caused by the sparsity of user-product interactions and the diversity of individual users’ vocabularies. Second, a user’s interest in aspects may differ across products, whereas existing methods usually assume it to be static. In this paper, we propose an Attentive Aspect-based Recommendation Model (AARM) to tackle these challenges. For the first problem, to enrich the aspect connections between user and product, AARM models the interactions between synonymous and similar aspects in addition to common aspects. For the second problem, a neural attention network which simultaneously considers user, product and aspect information is constructed to capture a user’s attention towards aspects when examining different products. Extensive quantitative and qualitative experiments show that AARM can effectively alleviate the two aforementioned problems and significantly outperforms several state-of-the-art recommendation methods on the top-N recommendation task.
Tasks
Published 2018-11-11
URL http://arxiv.org/abs/1811.04375v3
PDF http://arxiv.org/pdf/1811.04375v3.pdf
PWC https://paperswithcode.com/paper/attentive-aspect-modeling-for-review-aware
Repo
Framework
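
The attention component can be sketched as scoring each shared aspect from user, product and aspect embeddings and pooling with the softmaxed scores. Embedding sizes and the scoring MLP are placeholders, not AARM's exact network.

```python
# Sketch of aspect attention: score each aspect shared between a user's and a
# product's reviews from the three embeddings, then pool an aspect-aware vector.
import torch
import torch.nn as nn

class AspectAttention(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, user, item, aspects):       # user/item: (dim,), aspects: (n, dim)
        n = aspects.size(0)
        joint = torch.cat([user.expand(n, -1), item.expand(n, -1), aspects], dim=-1)
        weights = torch.softmax(self.score(joint).squeeze(-1), dim=0)   # (n,)
        return weights, (weights.unsqueeze(-1) * aspects).sum(dim=0)    # attended vector

att = AspectAttention()
weights, vec = att(torch.randn(32), torch.randn(32), torch.randn(5, 32))
print(weights.shape, vec.shape)   # torch.Size([5]) torch.Size([32])
```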