January 30, 2020

3307 words 16 mins read

Paper Group ANR 439

A Framework for Predicting Impactability of Healthcare Interventions Using Machine Learning Methods, Administrative Claims, Sociodemographic and App Generated Data

Title A Framework for Predicting Impactability of Healthcare Interventions Using Machine Learning Methods, Administrative Claims, Sociodemographic and App Generated Data
Authors Heather Mattie, Patrick Reidy, Patrik Bachtiger, Emily Lindemer, Mohammad Jouni, Trishan Panch
Abstract It is not clear how to target patients who are most likely to benefit from digital care management programs ex-ante, a shortcoming of current risk-score-based approaches. This study focuses on defining impactability by identifying those patients most likely to benefit from technology-enabled care management, delivered through a digital health platform that includes a mobile app and a clinician web dashboard. Anonymized insurance claims data from a commercially insured population across several U.S. states were combined with inferred sociodemographic data and data derived from the patient-held mobile application itself. Our approach involves the creation of two models and a comparative analysis of their methodologies and performance. We first train a cost prediction model to calculate the differences between predicted (without intervention) and actual (with onboarding onto the digital health platform) healthcare expenditure for patients (N = 1,242). This enables the classification of a patient as impactable if the difference between predicted and actual costs meets a predetermined threshold. A random forest machine learning model is then trained to categorize new patients as impactable versus not impactable, reaching an overall accuracy of 71.9%. We then tune the model through grid search to identify the hyperparameter settings that deliver optimal performance. A roadmap is proposed to iteratively improve the performance of the model. As the number of newly onboarded patients and the length of use continue to increase, the accuracy of predicting impactability will improve commensurately, and more advanced machine learning techniques such as deep learning will become relevant. This approach is generalizable to analyzing the impactability of any intervention and is a key component of realising closed-loop feedback systems for continuous improvement in healthcare.
Tasks
Published 2019-04-19
URL https://arxiv.org/abs/1905.00751v2
PDF https://arxiv.org/pdf/1905.00751v2.pdf
PWC https://paperswithcode.com/paper/190500751
Repo
Framework
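
To make the two-stage setup concrete, here is a minimal sketch in Python using scikit-learn with synthetic placeholder data. The cost model, feature set, and threshold are assumptions for illustration only, not the authors' implementation; only the random forest plus grid search step is named in the abstract.

```python
# Hypothetical two-stage impactability sketch (not the authors' code):
# stage 1 predicts counterfactual cost, stage 2 classifies impactability.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
n, d = 1242, 20                       # cohort size from the abstract, feature count assumed
X = rng.normal(size=(n, d))           # claims + sociodemographic + app features (placeholder)
actual_cost = rng.gamma(2.0, 1000.0, size=n)

# Stage 1: cost model standing in for the "without intervention" prediction.
cost_model = GradientBoostingRegressor().fit(X, actual_cost + rng.normal(0, 500, n))
predicted_cost = cost_model.predict(X)

# Label a patient impactable if predicted-minus-actual savings exceed a threshold.
threshold = 500.0                     # assumed; the paper uses a predetermined value
impactable = (predicted_cost - actual_cost) >= threshold

# Stage 2: random forest classifier tuned by grid search, as described in the abstract.
X_tr, X_te, y_tr, y_te = train_test_split(X, impactable, test_size=0.25, random_state=0)
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    cv=5,
)
grid.fit(X_tr, y_tr)
print("held-out accuracy:", grid.score(X_te, y_te))
```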

Improving Voice Separation by Incorporating End-to-end Speech Recognition

Title Improving Voice Separation by Incorporating End-to-end Speech Recognition
Authors Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Parthasaarathy Sudarsanam, Sriram Ganapathy, Yuki Mitsufuji
Abstract Despite recent advances in voice separation methods, many challenges remain in realistic scenarios such as noisy recordings and limited available data. In this work, we propose to explicitly incorporate the phonetic and linguistic nature of speech by taking a transfer learning approach using an end-to-end automatic speech recognition (E2EASR) system. The voice separation is conditioned on deep features extracted from E2EASR to capture the long-term dependence of phonetic aspects. Experimental results on speech separation and enhancement tasks on the AVSpeech dataset show that the proposed method significantly improves the signal-to-distortion ratio over the baseline model and even outperforms an audio-visual model that utilizes visual information from lip movements.
Tasks End-To-End Speech Recognition, Speech Recognition, Speech Separation, Transfer Learning
Published 2019-11-29
URL https://arxiv.org/abs/1911.12928v1
PDF https://arxiv.org/pdf/1911.12928v1.pdf
PWC https://paperswithcode.com/paper/improving-voice-separation-by-incorporating
Repo
Framework
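
The conditioning idea can be sketched as follows: a mask-estimating separator consumes mixture spectrogram frames concatenated with frame-level deep features from an ASR encoder. The architecture, feature dimensions, and layer choices below are placeholder assumptions, not the paper's network.

```python
# Minimal sketch of conditioning a mask-based separator on ASR-derived features.
import torch
import torch.nn as nn

class ConditionedSeparator(nn.Module):
    def __init__(self, n_freq=257, asr_dim=512, hidden=256):
        super().__init__()
        # Mixture spectrogram frames are concatenated with frame-level E2E-ASR deep
        # features and fed to a recurrent mask estimator.
        self.rnn = nn.LSTM(n_freq + asr_dim, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.mask = nn.Linear(2 * hidden, n_freq)

    def forward(self, mix_spec, asr_feats):
        # mix_spec: (batch, time, n_freq), asr_feats: (batch, time, asr_dim)
        x = torch.cat([mix_spec, asr_feats], dim=-1)
        h, _ = self.rnn(x)
        m = torch.sigmoid(self.mask(h))        # time-frequency mask in [0, 1]
        return m * mix_spec                    # estimated target speech magnitude

sep = ConditionedSeparator()
out = sep(torch.rand(2, 100, 257), torch.rand(2, 100, 512))
print(out.shape)  # torch.Size([2, 100, 257])
```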

Competence-based Curriculum Learning for Neural Machine Translation

Title Competence-based Curriculum Learning for Neural Machine Translation
Authors Emmanouil Antonios Platanios, Otilia Stretcu, Graham Neubig, Barnabas Poczos, Tom M. Mitchell
Abstract Current state-of-the-art NMT systems use large neural networks that are not only slow to train, but also often require many heuristics and optimization tricks, such as specialized learning rate schedules and large batch sizes. This is undesirable as it requires extensive hyperparameter tuning. In this paper, we propose a curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance. Our framework consists of a principled way of deciding which training samples are shown to the model at different times during training, based on the estimated difficulty of a sample and the current competence of the model. Filtering training samples in this manner prevents the model from getting stuck in bad local optima, making it converge faster and reach a better solution than the common approach of uniformly sampling training examples. Furthermore, the proposed method can be easily applied to existing NMT models by simply modifying their input data pipelines. We show that our framework can help improve the training time and the performance of both recurrent neural network models and Transformers, achieving up to a 70% decrease in training time, while at the same time obtaining accuracy improvements of up to 2.2 BLEU.
Tasks Machine Translation
Published 2019-03-23
URL http://arxiv.org/abs/1903.09848v2
PDF http://arxiv.org/pdf/1903.09848v2.pdf
PWC https://paperswithcode.com/paper/competence-based-curriculum-learning-for
Repo
Framework
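
A small sketch of the curriculum mechanism: each training example gets a difficulty score mapped to its empirical CDF, and batches are sampled only from examples whose difficulty does not exceed the model's current competence. The square-root competence schedule and the length-based difficulty follow the paper's description, but the constants below are assumptions.

```python
# Sketch of competence-based sampling for curriculum learning.
import numpy as np

rng = np.random.default_rng(0)
sentences = [["tok"] * rng.integers(3, 60) for _ in range(10000)]  # toy corpus

# Difficulty = sentence length, converted to its empirical CDF value in [0, 1].
lengths = np.array([len(s) for s in sentences])
difficulty_cdf = lengths.argsort().argsort() / (len(lengths) - 1)

def competence(t, T=20000, c0=0.01):
    """Square-root competence schedule: starts at c0, reaches 1.0 at step T."""
    return min(1.0, np.sqrt(t * (1 - c0 ** 2) / T + c0 ** 2))

def sample_batch(t, batch_size=64):
    c = competence(t)
    eligible = np.flatnonzero(difficulty_cdf <= c)   # only "easy enough" examples
    return rng.choice(eligible, size=batch_size, replace=True)

print(len(sample_batch(0)), competence(0), competence(20000))
```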

Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine

Title Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine
Authors Mark-Oliver Stehr, Minyoung Kim, Carolyn L. Talcott, Merrill Knapp, Akos Vertes
Abstract In spite of the rapidly increasing number of applications of machine learning in various domains, a principled and systematic approach to incorporating domain knowledge into the engineering process is still lacking, and ad hoc solutions that are difficult to validate remain the norm in practice, which is of growing concern not only in mission-critical applications. In this note, we introduce Probabilistic Approximate Logic (PALO) as a logic based on the notion of mean approximate probability, designed to overcome conceptual and computational difficulties inherent to strictly probabilistic logics. The logic is approximate in several dimensions. Logical independence assumptions are used to obtain approximate probabilities, but by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained. To enable efficient computational inference, the logic has a continuous semantics that reflects only a subset of the structural properties of classical logic, but this imprecision can be partly compensated by richer theories obtained through classical inference or other means. Computational inference, which refers to the construction of models and the validation of logical properties, is based on Stochastic Gradient Descent (SGD) and Markov Chain Monte Carlo (MCMC) techniques, and is hence another dimension in which approximations are involved. We also present the Logical Imagination Engine (LIME), a prototypical implementation of PALO based on TensorFlow. Although not limited to the biological domain, we illustrate its operation in a substantial bioinformatics machine learning application concerned with network synthesis and analysis in a recent DARPA project.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.11321v1
PDF https://arxiv.org/pdf/1907.11321v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-approximate-logic-and-its
Repo
Framework
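
A toy rendering of the "mean approximate probability" idea: compound-formula probabilities are approximated under independence assumptions and averaged over many ground instances to obtain a mean with a confidence interval. This is not the PALO semantics or the LIME implementation, just an illustration of the averaging step.

```python
# Toy illustration of averaging independence-based formula probabilities.
import numpy as np

rng = np.random.default_rng(0)

def p_and(p, q):      # independence approximation for conjunction
    return p * q

def p_or(p, q):       # noisy-OR style approximation for disjunction
    return 1.0 - (1.0 - p) * (1.0 - q)

# Atomic probabilities for 1000 ground instances of the formula (A and B) or C.
pA = rng.uniform(0.6, 0.9, 1000)
pB = rng.uniform(0.5, 0.8, 1000)
pC = rng.uniform(0.1, 0.3, 1000)
instance_probs = p_or(p_and(pA, pB), pC)

mean = instance_probs.mean()
ci = 1.96 * instance_probs.std(ddof=1) / np.sqrt(len(instance_probs))
print(f"mean approximate probability: {mean:.3f} +/- {ci:.3f}")
```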

Ranking sentences from product description & bullets for better search

Title Ranking sentences from product description & bullets for better search
Authors Prateek Verma, Aliasgar Kutiyanawala, Ke Shen
Abstract Products in an e-commerce catalog contain information-rich fields like the description and bullets that can be useful for extracting entities (attributes) using NER-based systems. However, these fields are often verbose and contain a lot of information that is not relevant from a search perspective. Treating each sentence within these fields equally can lead to poor full-text matches and introduce problems in extracting attributes to develop ontologies, semantic search, etc. To address this issue, we describe two methods based on extractive summarization with reinforcement learning that leverage information in product titles and search click-through logs to rank sentences from bullets, descriptions, etc. Finally, we compare the accuracy of these two models.
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06330v1
PDF https://arxiv.org/pdf/1907.06330v1.pdf
PWC https://paperswithcode.com/paper/ranking-sentences-from-product-description
Repo
Framework
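
As a hedged, much simpler baseline for the same goal (not the paper's reinforcement-learning summarizer), one can rank sentences by TF-IDF cosine similarity to the product title; the paper additionally leverages search click-through logs.

```python
# Rank description/bullet sentences by similarity to the product title (toy baseline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

title = "stainless steel insulated water bottle 32 oz"
sentences = [
    "Keeps drinks cold for 24 hours and hot for 12 hours.",
    "Our company was founded in 1987 by two college friends.",
    "Double-wall vacuum insulated stainless steel construction.",
    "Perfect gift for birthdays and holidays.",
]

vec = TfidfVectorizer().fit([title] + sentences)
sims = cosine_similarity(vec.transform([title]), vec.transform(sentences))[0]
for score, sent in sorted(zip(sims, sentences), reverse=True):
    print(f"{score:.3f}  {sent}")
```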

Incremental and Decremental Fuzzy Bounded Twin Support Vector Machine

Title Incremental and Decremental Fuzzy Bounded Twin Support Vector Machine
Authors Alexandre Reeberg de Mello, Marcelo Ricardo Stemmer, Alessandro Lameiras Koerich
Abstract In this paper we present an incremental variant of the Twin Support Vector Machine (TWSVM), called the Fuzzy Bounded Twin Support Vector Machine (FBTWSVM), to deal with large datasets and learning from data streams. We combine the TWSVM with a fuzzy membership function, so that each input has a different contribution to each hyperplane in a binary classifier. To solve the pair of quadratic programming problems (QPPs) we use a dual coordinate descent algorithm with a shrinking strategy, and to obtain a robust classification with fast training we propose the use of a Fourier Gaussian approximation function with our linear FBTWSVM. Inspired by the shrinking technique, the incremental algorithm re-utilizes part of the training method with some heuristics, while the decremental procedure is based on a scored window. The FBTWSVM is also extended to multi-class problems by combining binary classifiers using a Directed Acyclic Graph (DAG) approach. Moreover, we analyze the theoretical properties of the proposed approach and its extension, and the experimental results on benchmark datasets indicate that the FBTWSVM has a fast training and retraining process while maintaining robust classification performance.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09613v2
PDF https://arxiv.org/pdf/1907.09613v2.pdf
PWC https://paperswithcode.com/paper/incremental-and-decremental-fuzzy-bounded
Repo
Framework
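
One common way fuzzy SVM variants assign memberships is by distance to the class centroid, so outliers contribute less to each hyperplane. The sketch below uses that scheme as an assumed illustration; the exact membership function in FBTWSVM may differ.

```python
# Assumed centroid-distance fuzzy membership, a common choice in fuzzy SVM variants.
import numpy as np

def fuzzy_membership(X, y, delta=1e-6):
    """Return a weight in (0, 1] per sample, smaller for points far from their class mean."""
    m = np.empty(len(y), dtype=float)
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        center = X[idx].mean(axis=0)
        dist = np.linalg.norm(X[idx] - center, axis=1)
        radius = dist.max() + delta
        m[idx] = 1.0 - dist / radius          # 1 at the centroid, ~0 at the class boundary
    return m

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(fuzzy_membership(X, y)[:5])
```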

Noisy and Incomplete Boolean Matrix Factorization via Expectation Maximization

Title Noisy and Incomplete Boolean Matrix Factorization via Expectation Maximization
Authors Lifan Liang, Songjian Lu
Abstract A probabilistic approach to Boolean matrix factorization can provide solutions robust against noise and missing values with linear computational complexity. However, the assumption about latent factors can be problematic in real-world applications. This study proposed a new probabilistic algorithm free of assumptions about latent factors, while retaining the advantages of previous algorithms. Real data experiments showed that our algorithm compared favourably with current state-of-the-art probabilistic algorithms.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12766v1
PDF https://arxiv.org/pdf/1905.12766v1.pdf
PWC https://paperswithcode.com/paper/noisy-and-incomplete-boolean-matrix
Repo
Framework
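
The problem setting can be sketched as follows: the observed matrix is (approximately) the Boolean product of binary factors, with flipped and missing entries, and reconstruction quality is scored only on observed cells. The EM-based probabilistic inference of the factors is not reproduced here.

```python
# Sketch of the noisy, incomplete Boolean matrix factorization setting.
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 80, 60, 5
U = (rng.random((n, r)) < 0.3).astype(int)   # binary latent factors
V = (rng.random((r, m)) < 0.3).astype(int)
X = (U @ V) > 0                              # Boolean matrix product

noise = rng.random(X.shape) < 0.05           # flip 5% of entries (observation noise)
observed = rng.random(X.shape) < 0.8         # the remaining ~20% of entries are missing
X_noisy = np.where(noise, ~X, X)

def masked_reconstruction_accuracy(X_hat, X_obs, mask):
    """Fraction of observed entries that a candidate reconstruction gets right."""
    return (X_hat[mask] == X_obs[mask]).mean()

# Using the true factors as a stand-in for factors inferred by the EM algorithm:
print(masked_reconstruction_accuracy((U @ V) > 0, X_noisy, observed))
```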

Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization

Title Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization
Authors Seonho Park, Seung Hyun Jung, Panos M. Pardalos
Abstract We focus on minimizing nonconvex finite-sum functions that typically arise in machine learning problems. In an attempt to solve this problem, the adaptive cubic regularized Newton method has shown strong global convergence guarantees and the ability to escape from strict saddle points. This method uses a trust-region-like scheme to determine whether an iteration is successful, and updates only when it is successful. In this paper, we suggest an algorithm combining negative curvature with the adaptive cubic regularized Newton method to update even at unsuccessful iterations. We call this new method Stochastic Adaptive cubic regularization with Negative Curvature (SANC). Unlike the previous method, in order to attain stochastic gradient and Hessian estimators, the SANC algorithm uses independent sets of data points of consistent size over all iterations. This makes the SANC algorithm more practical for solving large-scale machine learning problems. To the best of our knowledge, this is the first approach that combines the negative curvature method with the adaptive cubic regularized Newton method. Finally, we provide experimental results, including neural network problems, supporting the efficiency of our method.
Tasks
Published 2019-06-27
URL https://arxiv.org/abs/1906.11417v1
PDF https://arxiv.org/pdf/1906.11417v1.pdf
PWC https://paperswithcode.com/paper/combining-stochastic-adaptive-cubic
Repo
Framework
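
A deterministic toy illustration of the two ingredients SANC combines: the cubic-regularized local model and a negative-curvature step taken along the eigenvector of the most negative Hessian eigenvalue. The stochastic sub-sampling of gradients and Hessians, and the adaptive update of the regularization parameter, are omitted.

```python
# Toy cubic-regularized model and negative-curvature step (not the full SANC algorithm).
import numpy as np

def cubic_model(s, g, H, sigma):
    """m(s) = g's + 1/2 s'Hs + (sigma/3) ||s||^3, the cubic-regularized local model."""
    return g @ s + 0.5 * s @ H @ s + (sigma / 3.0) * np.linalg.norm(s) ** 3

def negative_curvature_step(g, H, alpha=0.5):
    """Step along the eigenvector of the most negative eigenvalue, if one exists."""
    eigvals, eigvecs = np.linalg.eigh(H)
    if eigvals[0] >= 0:
        return None                      # no negative curvature available
    d = eigvecs[:, 0]
    if g @ d > 0:                        # orient the direction to be a descent direction
        d = -d
    return alpha * abs(eigvals[0]) * d   # longer steps for stronger negative curvature

g = np.array([0.5, -1.0])
H = np.array([[1.0, 0.0], [0.0, -2.0]])  # indefinite Hessian: a strict saddle point
s = negative_curvature_step(g, H)
print("negative-curvature step:", s)
print("cubic model value at that step:", cubic_model(s, g, H, sigma=1.0))
```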

Multi-channel Speech Separation Using Deep Embedding Model with Multilayer Bootstrap Networks

Title Multi-channel Speech Separation Using Deep Embedding Model with Multilayer Bootstrap Networks
Authors Ziye Yang, Xiao-Lei Zhang
Abstract Recently, deep clustering (DPCL) based speaker-independent speech separation has drawn much attention, since it needs little speaker prior information. However, it still has much room for improvement, particularly in reverberant environments. If the training and test environments mismatch, which is a common case, the embedding vectors produced by DPCL may contain much noise and many small variations. To deal with this problem, we propose a variant of DPCL, named DPCL++, by applying a recent unsupervised deep learning method, multilayer bootstrap networks (MBN), to further reduce the noise and small variations of the embedding vectors in an unsupervised way at the test stage, which facilitates k-means in producing a good result. MBN builds a gradually narrowed network from the bottom up via a stack of k-centroids clustering ensembles, where the k-centroids clusterings are trained independently by random sampling and one-nearest-neighbor optimization. To further improve the robustness of DPCL++ in reverberant environments, we take spatial features as part of its input. Experimental results demonstrate the effectiveness of the proposed method.
Tasks Speech Separation
Published 2019-10-24
URL https://arxiv.org/abs/1910.10912v1
PDF https://arxiv.org/pdf/1910.10912v1.pdf
PWC https://paperswithcode.com/paper/multi-channel-speech-separation-using-deep
Repo
Framework
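
A sketch of one MBN layer: an ensemble of k-centroids "clusterings", each built by randomly sampling k points as centroids and encoding every sample as a one-hot indicator of its nearest centroid, with layers stacked at decreasing k. Details such as the exact resampling scheme and the DPCL front-end are simplified assumptions.

```python
# Simplified multilayer bootstrap network (MBN) layer.
import numpy as np

rng = np.random.default_rng(0)

def mbn_layer(X, k, n_clusterings=20):
    codes = []
    for _ in range(n_clusterings):
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        # one-nearest-centroid assignment, encoded as a one-hot indicator
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        codes.append(np.eye(k)[d.argmin(axis=1)])
    return np.hstack(codes)             # sparse high-dimensional representation

X = rng.normal(size=(200, 40))          # e.g. DPCL embedding vectors for T-F bins
H1 = mbn_layer(X, k=50)
H2 = mbn_layer(H1, k=25)                # gradually narrowed network, bottom-up
print(X.shape, H1.shape, H2.shape)      # (200, 40) (200, 1000) (200, 500)
```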

Learn to Estimate Labels Uncertainty for Quality Assurance

Title Learn to Estimate Labels Uncertainty for Quality Assurance
Authors Agnieszka Tomczack, Nassir Navab, Shadi Albarqouni
Abstract Deep learning sets the state of the art in many challenging tasks, showing outstanding performance in a broad range of applications. Despite its success, it still lacks robustness, hindering its adoption in medical applications. Modeling uncertainty, through Bayesian inference and Monte-Carlo dropout, has been successfully introduced for better understanding the underlying deep learning models. However, another important source of uncertainty, arising from inter-observer variability, has not been thoroughly addressed in the literature. In this paper, we introduce labels uncertainty, which better suits medical applications, and show that modeling such uncertainty together with epistemic uncertainty is of high interest for quality control and referral systems.
Tasks Bayesian Inference
Published 2019-09-17
URL https://arxiv.org/abs/1909.08058v1
PDF https://arxiv.org/pdf/1909.08058v1.pdf
PWC https://paperswithcode.com/paper/learn-to-estimate-labels-uncertainty-for
Repo
Framework
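
For the epistemic part, Monte-Carlo dropout can be sketched in a few lines: keep dropout active at inference time, run several stochastic forward passes, and report the mean and spread. The paper's label-uncertainty modelling is separate and not shown here.

```python
# Monte-Carlo dropout for an epistemic uncertainty estimate per prediction.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 2))

def mc_dropout_predict(model, x, T=50):
    model.train()                        # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])
    return probs.mean(0), probs.std(0)   # predictive mean and epistemic spread

x = torch.randn(8, 16)
mean, std = mc_dropout_predict(model, x)
print(mean.shape, std.shape)             # torch.Size([8, 2]) torch.Size([8, 2])
```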

Prediction of Porosity and Permeability Alteration based on Machine Learning Algorithms

Title Prediction of Porosity and Permeability Alteration based on Machine Learning Algorithms
Authors Andrei Erofeev, Denis Orlov, Alexey Ryzhov, Dmitry Koroteev
Abstract The objective of this work is to study the applicability of various machine learning algorithms for predicting rock properties that geoscientists usually determine through special laboratory analysis. We demonstrate that these special properties can be predicted based only on routine core analysis (RCA) data. To validate the approach, core samples from a reservoir with soluble rock matrix components (salts) were tested within 100+ laboratory experiments. The challenge of the experiments was to characterize the rate of salts in cores and the alteration of porosity and permeability after reservoir desalination due to drilling mud or water injection. For these three measured characteristics, we developed the relevant predictive models, which were based on the results of RCA and data on coring depth and the top and bottom depths of productive horizons. To select the most accurate machine learning algorithm, a comparative analysis was performed. It was shown that different algorithms work better in different models. However, a neural network with two hidden layers demonstrated the best predictive ability and generalizability for all three rock characteristics jointly. The other algorithms, such as Support Vector Machine and Linear Regression, also worked well on the dataset, but only in particular cases. Overall, the applied approach allows predicting the alteration of porosity and permeability during desalination in porous rocks and also evaluating salt concentration without direct measurements in a laboratory. This work also shows that the developed approaches could be applied to the prediction of other rock properties (residual brine and oil saturations, relative permeability, capillary pressure, and others) whose laboratory measurements are time-consuming and expensive.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06525v1
PDF http://arxiv.org/pdf/1902.06525v1.pdf
PWC https://paperswithcode.com/paper/prediction-of-porosity-and-permeability
Repo
Framework
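
A minimal sketch of the model comparison on RCA-style tabular features, using scikit-learn with synthetic placeholder data; feature names, dimensions, and hyperparameters are assumptions.

```python
# Compare a two-hidden-layer neural network against SVM and linear regression.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))                               # RCA features + depths (placeholder)
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=300)     # e.g. porosity alteration

models = {
    "two-hidden-layer NN": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
    "SVM": SVR(kernel="rbf"),
    "linear regression": LinearRegression(),
}
for name, est in models.items():
    score = cross_val_score(make_pipeline(StandardScaler(), est), X, y, cv=5, scoring="r2")
    print(f"{name:22s} R^2 = {score.mean():.3f}")
```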

On orthogonal projections for dimension reduction and applications in augmented target loss functions for learning problems

Title On orthogonal projections for dimension reduction and applications in augmented target loss functions for learning problems
Authors Anna Breger, Jose Ignacio Orlando, Pavol Harar, Monika Dörfler, Sophie Klimscha, Christoph Grechenig, Bianca S. Gerendas, Ursula Schmidt-Erfurth, Martin Ehler
Abstract The use of orthogonal projections on high-dimensional input and target data in learning frameworks is studied. First, we investigate the relations between two standard objectives in dimension reduction, preservation of variance and preservation of pairwise relative distances. Investigations of their asymptotic correlation as well as numerical experiments show that a projection usually does not satisfy both objectives at once. In a standard classification problem we determine projections on the input data that balance the objectives and compare the subsequent results. Next, we extend our application of orthogonal projections to deep learning tasks and introduce a general framework of augmented target loss functions. These loss functions integrate additional information via transformations and projections of the target data. In two supervised learning problems, clinical image segmentation and music information classification, the application of our proposed augmented target loss functions increases accuracy.
Tasks Dimensionality Reduction, Semantic Segmentation
Published 2019-01-22
URL https://arxiv.org/abs/1901.07598v4
PDF https://arxiv.org/pdf/1901.07598v4.pdf
PWC https://paperswithcode.com/paper/on-orthogonal-projections-for-dimension
Repo
Framework
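
The augmented target loss can be sketched as a weighted sum of a loss on the raw targets and a loss on transformed (here, orthogonally projected) targets. The random projection and the weighting below are placeholder assumptions; in the paper the transformations encode task-specific target information.

```python
# Augmented target loss: L = d1(y_hat, y) + lambda * d2(T y_hat, T y).
import torch

torch.manual_seed(0)
d, k, lam = 32, 8, 0.5
Q, _ = torch.linalg.qr(torch.randn(d, d))
T = Q[:k]                                    # k x d projection with orthonormal rows

def augmented_target_loss(y_hat, y):
    base = torch.nn.functional.mse_loss(y_hat, y)          # loss on raw targets
    proj = torch.nn.functional.mse_loss(y_hat @ T.t(), y @ T.t())  # loss on projected targets
    return base + lam * proj

y, y_hat = torch.randn(16, d), torch.randn(16, d)
print(augmented_target_loss(y_hat, y))
```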

Support vector machines on the D-Wave quantum annealer

Title Support vector machines on the D-Wave quantum annealer
Authors Dennis Willsch, Madita Willsch, Hans De Raedt, Kristel Michielsen
Abstract Kernel-based support vector machines (SVMs) are supervised machine learning algorithms for classification and regression problems. We introduce a method to train SVMs on a D-Wave 2000Q quantum annealer and study its performance in comparison to SVMs trained on conventional computers. The method is applied to both synthetic data and real data obtained from biology experiments. We find that the quantum annealer produces an ensemble of different solutions that often generalizes better to unseen data than the single global minimum of an SVM trained on a conventional computer, especially in cases where only limited training data is available. For cases with more training data than currently fits on the quantum annealer, we show that a combination of classifiers for subsets of the data almost always produces stronger joint classifiers than the conventional SVM for the same parameters.
Tasks
Published 2019-06-14
URL https://arxiv.org/abs/1906.06283v2
PDF https://arxiv.org/pdf/1906.06283v2.pdf
PWC https://paperswithcode.com/paper/support-vector-machines-on-the-d-wave-quantum
Repo
Framework
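
A classical sketch of the subset-combination experiment: train ordinary kernel SVMs on disjoint subsets of the training data and average their decision functions to form a joint classifier. The quantum-annealer training itself (the QUBO formulation solved on the D-Wave 2000Q) is not shown.

```python
# Ensemble of kernel SVMs trained on disjoint data subsets (classical analogue).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

subsets = np.array_split(np.arange(len(X_train)), 4)        # disjoint training subsets
clfs = [SVC(kernel="rbf", gamma="scale").fit(X_train[idx], y_train[idx]) for idx in subsets]

# Joint classifier: average the signed decision values of the subset classifiers.
joint_score = np.mean([c.decision_function(X_test) for c in clfs], axis=0)
joint_pred = (joint_score > 0).astype(int)
print("joint accuracy:", (joint_pred == y_test).mean())
```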

Knowledge-aware Complementary Product Representation Learning

Title Knowledge-aware Complementary Product Representation Learning
Authors Da Xu, Chuanwei Ruan, Jason Cho, Evren Korpeoglu, Sushant Kumar, Kannan Achan
Abstract Learning product representations that reflect complementary relationships plays a central role in e-commerce recommender systems. In the absence of the product relationship graph that existing methods rely on, there is a need to detect complementary relationships directly from noisy and sparse customer purchase activities. Furthermore, unlike simple relationships such as similarity, complementariness is asymmetric and non-transitive. Standard usage of representation learning emphasizes only one set of embeddings, which is problematic for modelling such properties of complementariness. We propose using knowledge-aware learning with dual product embeddings to solve the above challenges. We encode contextual knowledge into the product representation by multi-task learning to alleviate the sparsity issue. By explicitly modelling with user bias terms, we separate the noise of customer-specific preferences from the complementariness. Furthermore, we adopt a dual embedding framework to capture the intrinsic properties of complementariness and provide a geometric interpretation motivated by classic separating hyperplane theory. Finally, we propose a Bayesian network structure that unifies all the components and subsumes several popular models as special cases. The proposed method compares favourably to state-of-the-art methods in downstream classification and recommendation tasks. We also develop an implementation that scales efficiently to a dataset with millions of items and customers.
Tasks Multi-Task Learning, Recommendation Systems, Representation Learning
Published 2019-03-16
URL https://arxiv.org/abs/1904.12574v3
PDF https://arxiv.org/pdf/1904.12574v3.pdf
PWC https://paperswithcode.com/paper/190412574
Repo
Framework
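
The dual-embedding idea can be sketched with two embedding tables per product, a "query" role and a "complement" role, plus a user bias term, giving an asymmetric score for an ordered pair of products. Names, dimensions, and the loss below are assumptions, not the paper's exact model.

```python
# Dual product embeddings with a user bias for asymmetric complementariness scores.
import torch
import torch.nn as nn

class DualEmbeddingComplement(nn.Module):
    def __init__(self, n_items, n_users, dim=64):
        super().__init__()
        self.query = nn.Embedding(n_items, dim)        # product as the "anchor" purchase
        self.complement = nn.Embedding(n_items, dim)   # product as a candidate complement
        self.user_bias = nn.Embedding(n_users, 1)      # absorbs user-specific preference noise

    def forward(self, i, j, u):
        s = (self.query(i) * self.complement(j)).sum(-1) + self.user_bias(u).squeeze(-1)
        return s   # asymmetric: score(i, j, u) != score(j, i, u) in general

model = DualEmbeddingComplement(n_items=1000, n_users=500)
i, j, u = torch.tensor([3, 7]), torch.tensor([42, 9]), torch.tensor([0, 1])
label = torch.tensor([1.0, 0.0])                       # observed co-purchase vs. negative sample
loss = nn.functional.binary_cross_entropy_with_logits(model(i, j, u), label)
print(loss.item())
```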

Autoencoding Undirected Molecular Graphs With Neural Networks

Title Autoencoding Undirected Molecular Graphs With Neural Networks
Authors Jeppe Johan Waarkjær Olsen, Peter Ebert Christensen, Martin Hangaard Hansen, Alexander Rosenberg Johansen
Abstract Discrete structure rules for validating molecular structures are usually limited to fulfillment of the octet rule or similar simple deterministic heuristics. We propose a model, inspired by language modeling from natural language processing, with the ability to learn from a collection of undirected molecular graphs, enabling the fitting of any underlying structure rule present in the collection. We introduce an adaptation of the popular Transformer model that can learn relationships between atoms and bonds. To our knowledge, this Transformer adaptation is the first model trained to solve the unsupervised task of recovering partially observed molecules. In this work, we assess how different degrees of information impact performance with respect to fitting the QM9 dataset, which conforms to the octet rule, and fitting the ZINC dataset, which contains hypervalent molecules and ions requiring the model to learn a more complex structure rule. More specifically, we test a full discrete graph with bond order information, a full discrete graph with only connectivity, a bag-of-neighbors, a bag-of-atoms, and count-based unigram statistics. These results provide encouraging evidence that neural networks, even when only connectivity is available, can learn arbitrary molecular structure rules specific to a dataset, as the Transformer adaptation surpasses a strong octet-rule baseline on the ZINC dataset.
Tasks Language Modelling
Published 2019-11-26
URL https://arxiv.org/abs/2001.03517v2
PDF https://arxiv.org/pdf/2001.03517v2.pdf
PWC https://paperswithcode.com/paper/autoencoding-undirected-molecular-graphs-with
Repo
Framework
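
The unsupervised recovery task can be sketched as masked-token prediction over atom sequences with a Transformer encoder. Tokenization, the absence of positional encodings, and all hyperparameters below are placeholder assumptions, not the paper's adaptation.

```python
# Masked-atom recovery with a Transformer encoder (toy sketch).
import torch
import torch.nn as nn

VOCAB = ["<pad>", "<mask>", "C", "N", "O", "F", "H"]     # toy atom vocabulary
PAD, MASK = 0, 1

class MaskedAtomModel(nn.Module):
    def __init__(self, vocab_size=len(VOCAB), d_model=128, nhead=4, layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=PAD)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # Positional/structural encodings are omitted here for brevity.
        h = self.encoder(self.embed(tokens))
        return self.out(h)                               # per-position atom logits

model = MaskedAtomModel()
tokens = torch.tensor([[2, 2, 4, 3, 6, 6]])              # e.g. C C O N H H
masked = tokens.clone()
masked[0, 2] = MASK                                      # hide one atom
logits = model(masked)
loss = nn.functional.cross_entropy(logits[0, 2:3], tokens[0, 2:3])
print("recovery loss on masked position:", loss.item())
```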