October 19, 2019

Paper Group ANR 110

Efficient Text Classification Using Tree-structured Multi-linear Principal Component Analysis


Title	Efficient Text Classification Using Tree-structured Multi-linear Principal Component Analysis
Authors	Yuanhang Su, Yuzhong Huang, C. -C. Jay Kuo
Abstract	A novel text data dimension reduction technique, called the tree-structured multi-linear principal component analysis (TMPCA), is proposed in this work. Being different from traditional text dimension reduction methods that deal with the word-level representation, the TMPCA technique reduces the dimension of input sequences and sentences to simplify the following text classification tasks. It is shown mathematically and experimentally that the TMPCA tool demands much lower complexity (and, hence, less computing power) than the ordinary principal component analysis (PCA). Furthermore, it is demonstrated by experimental results that the support vector machine (SVM) method applied to the TMPCA-processed data achieves commensurable or better performance than the state-of-the-art recurrent neural network (RNN) approach.
Tasks	Dimensionality Reduction, Text Classification
Published	2018-01-20
URL	http://arxiv.org/abs/1801.06607v2
PDF	http://arxiv.org/pdf/1801.06607v2.pdf
PWC	https://paperswithcode.com/paper/efficient-text-classification-using-tree
Repo
Framework

Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations


Title	Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations
Authors	Allard Hendriksen, Rianne de Heide, Peter Grünwald
Abstract	It is often claimed that Bayesian methods, in particular Bayes factor methods for hypothesis testing, can deal with optional stopping. We first give an overview, using elementary probability theory, of three different mathematical meanings that various authors give to this claim: (1) stopping rule independence, (2) posterior calibration and (3) (semi-) frequentist robustness to optional stopping. We then prove theorems to the effect that these claims do indeed hold in a general measure-theoretic setting. For claims of type (2) and (3), such results are new. By allowing for non-integrable measures based on improper priors, we obtain particularly strong results for the practically important case of models with nuisance parameters satisfying a group invariance (such as location or scale). We also discuss the practical relevance of (1)-(3), and conclude that whether Bayes factor methods actually perform well under optional stopping crucially depends on details of models, priors and the goal of the analysis.
Tasks	Calibration
Published	2018-07-24
URL	https://arxiv.org/abs/1807.09077v2
PDF	https://arxiv.org/pdf/1807.09077v2.pdf
PWC	https://paperswithcode.com/paper/optional-stopping-with-bayes-factors-a
Repo
Framework
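
As a rough illustration of meaning (3), the sketch below simulates a coin-flip test under the null hypothesis and stops as soon as the Bayes factor reaches a threshold. The test itself (a fair coin against a uniform prior on the bias), the threshold of 20, the cap of 200 flips, and all function names are assumptions chosen for illustration; none of them come from the paper. For a simple null hypothesis, the probability of ever reaching a threshold of 1/α is at most α, so the simulated rate should stay at or below roughly 0.05.

```python
import math
import random

def log_bayes_factor(heads, flips):
    """Log Bayes factor of H1 (uniform prior on the bias) against H0 (fair coin)."""
    # The marginal likelihood under H1 is the Beta function B(heads + 1, tails + 1).
    log_m1 = (math.lgamma(heads + 1) + math.lgamma(flips - heads + 1)
              - math.lgamma(flips + 2))
    log_m0 = flips * math.log(0.5)
    return log_m1 - log_m0

def stops_early(rng, max_flips=200, threshold=20.0):
    """Flip a fair coin; report whether the Bayes factor ever reaches the threshold."""
    heads = 0
    for flips in range(1, max_flips + 1):
        heads += rng.random() < 0.5
        if log_bayes_factor(heads, flips) >= math.log(threshold):
            return True
    return False

def false_alarm_rate(runs=2000, seed=0):
    """Fraction of null-hypothesis runs in which optional stopping 'finds' evidence."""
    rng = random.Random(seed)
    return sum(stops_early(rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    print(false_alarm_rate())
```

This only probes one simple model; the abstract's caution that behavior under optional stopping depends on the model, the prior and the goal of the analysis still applies.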

On the Statistical Challenges of Echo State Networks and Some Potential Remedies


Title	On the Statistical Challenges of Echo State Networks and Some Potential Remedies
Authors	Qiuyi Wu, Ernest Fokoue, Dhireesha Kudithipudi
Abstract	Echo state networks are powerful recurrent neural networks. However, they are often unstable and shaky, making the process of finding a good ESN for a specific dataset quite hard. Obtaining a superb accuracy by using the Echo State Network is a challenging task. We create, develop and implement a family of predictably optimal robust and stable ensemble of Echo State Networks via regularizing the training and perturbing the input. Furthermore, several distributions of weights have been tried based on the shape to see if the shape of the distribution has an impact on reducing the error. We found ESN can track in short term for most datasets, but it collapses in the long run. Short-term tracking with large size reservoir enables ESN to perform strikingly with superior prediction. Based on this scenario, we go a further step to aggregate many ESNs into an ensemble to lower the variance and stabilize the system by stochastic replications and bootstrapping of input data.
Tasks
Published	2018-02-20
URL	http://arxiv.org/abs/1802.07369v1
PDF	http://arxiv.org/pdf/1802.07369v1.pdf
PWC	https://paperswithcode.com/paper/on-the-statistical-challenges-of-echo-state
Repo
Framework

HeteroMed: Heterogeneous Information Network for Medical Diagnosis


Title	HeteroMed: Heterogeneous Information Network for Medical Diagnosis
Authors	Anahita Hosseini, Ting Chen, Wenjun Wu, Yizhou Sun, Majid Sarrafzadeh
Abstract	With the recent availability of Electronic Health Records (EHR) and great opportunities they offer for advancing medical informatics, there has been growing interest in mining EHR for improving quality of care. Disease diagnosis, due to its sensitive nature, huge costs of error, and complexity, has become an increasingly important focus of research in past years. Existing studies model EHR by capturing co-occurrence of clinical events to learn their latent embeddings. However, relations among clinical events carry various semantics and contribute differently to disease diagnosis, which gives precedence to a more advanced modeling of heterogeneous data types and relations in EHR data than existing solutions. To address these issues, we represent how high-dimensional EHR data and its rich relationships can be suitably translated into HeteroMed, a heterogeneous information network for robust medical diagnosis. Our modeling approach allows for straightforward handling of missing values and heterogeneity of data. HeteroMed exploits metapaths to capture higher level and semantically important relations contributing to disease diagnosis. Furthermore, it employs a joint embedding framework to tailor clinical event representations to the disease diagnosis goal. To the best of our knowledge, this is the first study to use Heterogeneous Information Network for modeling clinical data and disease diagnosis. Experimental results of our study show superior performance of HeteroMed compared to prior methods in prediction of exact diagnosis codes and general disease cohorts. Moreover, HeteroMed outperforms baseline models in capturing similarities of clinical events which are examined qualitatively through case studies.
Tasks	Medical Diagnosis
Published	2018-04-22
URL	http://arxiv.org/abs/1804.08052v1
PDF	http://arxiv.org/pdf/1804.08052v1.pdf
PWC	https://paperswithcode.com/paper/heteromed-heterogeneous-information-network
Repo
Framework

On Fast Leverage Score Sampling and Optimal Learning


Title	On Fast Leverage Score Sampling and Optimal Learning
Authors	Alessandro Rudi, Daniele Calandriello, Luigi Carratino, Lorenzo Rosasco
Abstract	Leverage score sampling provides an appealing way to perform approximate computations for large matrices. Indeed, it allows one to derive faithful approximations with a complexity adapted to the problem at hand. Yet, performing leverage score sampling is a challenge in its own right requiring further approximations. In this paper, we study the problem of leverage score sampling for positive definite matrices defined by a kernel. Our contribution is twofold. First we provide a novel algorithm for leverage score sampling and second, we exploit the proposed method in statistical learning by deriving a novel solver for kernel ridge regression. Our main technical contribution is showing that the proposed algorithms are currently the most efficient and accurate for these problems.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13258v2
PDF	http://arxiv.org/pdf/1810.13258v2.pdf
PWC	https://paperswithcode.com/paper/on-fast-leverage-score-sampling-and-optimal
Repo
Framework
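
To make the quantity being sampled concrete, the sketch below computes exact ridge leverage scores for a tiny kernel matrix, assuming the common definition as the diagonal of K (K + λnI)^-1. This is the slow, exact computation that becomes impractical for large matrices; it is not the fast approximate algorithm the paper proposes, and the function names and the example matrix are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ridge_leverage_scores(kernel, lam):
    """Diagonal of K (K + lam * n * I)^-1 for a positive semidefinite K."""
    n = len(kernel)
    c = lam * n
    shifted = [[kernel[i][j] + (c if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    inv = invert(shifted)
    # K (K + cI)^-1 = I - c (K + cI)^-1, so only the inverse's diagonal is needed.
    return [1.0 - c * inv[i][i] for i in range(n)]

# Hypothetical kernel: points 0 and 1 are near-duplicates, point 2 is unrelated.
example_kernel = [[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]]
example_scores = ridge_leverage_scores(example_kernel, lam=0.1)
```

In this example the unrelated point receives a higher score than either near-duplicate, which is the behavior that makes sampling in proportion to the scores attractive: redundant points are less likely to be kept.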

Core Conflictual Relationship: Text Mining to Discover What and When


Title	Core Conflictual Relationship: Text Mining to Discover What and When
Authors	Fionn Murtagh, Giuseppe Iurato
Abstract	Following detailed presentation of the Core Conflictual Relationship Theme (CCRT), there is the objective of relevant methods for what has been described as verbalization and visualization of data. Such is also termed data mining and text mining, and knowledge discovery in data. The Correspondence Analysis methodology, also termed Geometric Data Analysis, is shown in a case study to be comprehensive and revealing. Computational efficiency depends on how the analysis process is structured. For both illustrative and revealing aspects of the case study here, relatively extensive dream reports are used. This Geometric Data Analysis confirms the validity of CCRT method.
Tasks
Published	2018-05-28
URL	http://arxiv.org/abs/1805.11140v1
PDF	http://arxiv.org/pdf/1805.11140v1.pdf
PWC	https://paperswithcode.com/paper/core-conflictual-relationship-text-mining-to
Repo
Framework

Deep Graph Laplacian Regularization for Robust Denoising of Real Images


Title	Deep Graph Laplacian Regularization for Robust Denoising of Real Images
Authors	Jin Zeng, Jiahao Pang, Wenxiu Sun, Gene Cheung
Abstract	Recent developments in deep learning have revolutionized the paradigm of image restoration. However, its applications on real image denoising are still limited, due to its sensitivity to training data and the complex nature of real image noise. In this work, we combine the robustness merit of model-based approaches and the learning power of data-driven approaches for real image denoising. Specifically, by integrating graph Laplacian regularization as a trainable module into a deep learning framework, we are less susceptible to overfitting than pure CNN-based approaches, achieving higher robustness to small datasets and cross-domain denoising. First, a sparse neighborhood graph is built from the output of a convolutional neural network (CNN). Then the image is restored by solving an unconstrained quadratic programming problem, using a corresponding graph Laplacian regularizer as a prior term. The proposed restoration pipeline is fully differentiable and hence can be end-to-end trained. Experimental results demonstrate that our work is less prone to overfitting given small training data. It is also endowed with strong cross-domain generalization power, outperforming the state-of-the-art approaches by a remarkable margin.
Tasks	Denoising, Domain Generalization, Image Denoising, Image Restoration
Published	2018-07-31
URL	https://arxiv.org/abs/1807.11637v3
PDF	https://arxiv.org/pdf/1807.11637v3.pdf
PWC	https://paperswithcode.com/paper/deep-graph-laplacian-regularization-for
Repo
Framework
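
The quadratic programming step can be sketched in isolation. Assuming the objective is ||x - y||^2 + mu * x^T L x for a noisy signal y and graph Laplacian L, setting the gradient to zero gives the linear system (I + mu * L) x = y. The example below solves that system for a 1-D signal on a fixed path graph with unit edge weights; the path graph, the value of mu, and the function names are illustrative assumptions, whereas the paper builds the graph from CNN output and trains the whole pipeline end to end.

```python
def denoise_path_graph(y, mu):
    """Solve (I + mu * L) x = y, where L is the Laplacian of a unit-weight path graph."""
    n = len(y)
    if n == 1:
        return [float(y[0])]
    # Tridiagonal system: sub-diagonal a, diagonal b, super-diagonal c.
    a = [0.0] + [-mu] * (n - 1)
    c = [-mu] * (n - 1) + [0.0]
    # Node degree is 1 at the two ends of the path and 2 elsewhere.
    b = [1.0 + mu * (1 if i in (0, n - 1) else 2) for i in range(n)]
    # Thomas algorithm: forward sweep, then back substitution.
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = y[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (y[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def quadratic_variation(signal):
    """x^T L x for the path graph: sum of squared differences between neighbors."""
    return sum((p - q) ** 2 for p, q in zip(signal, signal[1:]))
```

A constant signal passes through unchanged because the Laplacian maps it to zero, while an oscillating signal comes out with smaller quadratic variation; larger values of mu smooth more aggressively.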

A survey of automatic de-identification of longitudinal clinical narratives


Title	A survey of automatic de-identification of longitudinal clinical narratives
Authors	Vithya Yogarajan, Michael Mayo, Bernhard Pfahringer
Abstract	Use of medical data, also known as electronic health records, in research helps develop and advance medical science. However, protecting patient confidentiality and identity while using medical data for analysis is crucial. Medical data can be in the form of tabular structures (i.e. tables), free-form narratives, and images. This study focuses on medical data in the free form longitudinal text. De-identification of electronic health records provides the opportunity to use such data for research without it affecting patient privacy, and avoids the need for individual patient consent. In recent years there is increasing interest in developing an accurate, robust and adaptable automatic de-identification system for electronic health records. This is mainly due to the dilemma between the availability of an abundance of health data, and the inability to use such data in research due to legal and ethical restrictions. De-identification tracks in competitions such as the 2014 i2b2 UTHealth and the 2016 CEGS N-GRID shared tasks have provided a great platform to advance this area. The primary reasons for this include the open source nature of the dataset and the fact that raw psychiatric data were used for 2016 competitions. This study focuses on noticeable trend changes in the techniques used in the development of automatic de-identification for longitudinal clinical narratives. More specifically, the shift from using conditional random fields (CRF) based systems only or rules (regular expressions, dictionary or combinations) based systems only, to hybrid models (combining CRF and rules), and more recently to deep learning based systems. We review the literature and results that arose from the 2014 and the 2016 competitions and discuss the outcomes of these systems. We also provide a list of research questions that emerged from this survey.
Tasks
Published	2018-10-16
URL	http://arxiv.org/abs/1810.06765v1
PDF	http://arxiv.org/pdf/1810.06765v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-of-automatic-de-identification-of
Repo
Framework

Band gap prediction for large organic crystal structures with machine learning


Title	Band gap prediction for large organic crystal structures with machine learning
Authors	Bart Olsthoorn, R. Matthias Geilhufe, Stanislav S. Borysov, Alexander V. Balatsky
Abstract	Machine-learning models are capable of capturing the structure-property relationship from a dataset of computationally demanding ab initio calculations. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of calculated electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 82 atoms per unit cell, makes this database a challenging platform for machine learning applications. In this paper, the focus is on predicting the band gap, which represents one of the basic properties of a crystalline material. With this aim, a consistent dataset of 12 500 crystal structures and their corresponding DFT band gaps is released, freely available for download at https://omdb.mathub.io/dataset. An ensemble of two state-of-the-art models reaches a mean absolute error (MAE) of 0.388 eV, which corresponds to a percentage error of 13% for an average band gap of 3.05 eV. Finally, the trained models are employed to predict the band gap for 260 092 materials contained within the Crystallography Open Database (COD) and made available online so that the predictions can be obtained for any arbitrary crystal structure uploaded by a user.
Tasks	Band Gap
Published	2018-10-30
URL	https://arxiv.org/abs/1810.12814v4
PDF	https://arxiv.org/pdf/1810.12814v4.pdf
PWC	https://paperswithcode.com/paper/band-gap-prediction-for-large-organic-crystal
Repo
Framework
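
The 13% figure in the abstract follows from dividing the reported MAE by the reported average band gap. A quick check, using only the two numbers quoted above (the variable names are illustrative):

```python
mae_ev = 0.388        # reported mean absolute error, in eV
mean_gap_ev = 3.05    # reported average band gap, in eV

relative_error = mae_ev / mean_gap_ev
print(f"{relative_error:.1%}")  # prints 12.7%, which rounds to the quoted 13%
```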

Building Corpora for Single-Channel Speech Separation Across Multiple Domains


Title	Building Corpora for Single-Channel Speech Separation Across Multiple Domains
Authors	Matthew Maciejewski, Gregory Sell, Leibny Paola Garcia-Perera, Shinji Watanabe, Sanjeev Khudanpur
Abstract	To date, the bulk of research on single-channel speech separation has been conducted using clean, near-field, read speech, which is not representative of many modern applications. In this work, we develop a procedure for constructing high-quality synthetic overlap datasets, necessary for most deep learning-based separation frameworks. We produced datasets that are more representative of realistic applications using the CHiME-5 and Mixer 6 corpora and evaluate standard methods on this data to demonstrate the shortcomings of current source-separation performance. We also demonstrate the value of a wide variety of data in training robust models that generalize well to multiple conditions.
Tasks	Speech Separation
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02641v1
PDF	http://arxiv.org/pdf/1811.02641v1.pdf
PWC	https://paperswithcode.com/paper/building-corpora-for-single-channel-speech
Repo
Framework

Gradient Descent Finds Global Minima of Deep Neural Networks


Title	Gradient Descent Finds Global Minima of Deep Neural Networks
Authors	Simon S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Xiyu Zhai
Abstract	Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current paper proves gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). Our analysis relies on the particular structure of the Gram matrix induced by the neural network architecture. This structure allows us to show the Gram matrix is stable throughout the training process and this stability implies the global optimality of the gradient descent algorithm. We further extend our analysis to deep residual convolutional neural networks and obtain a similar convergence result.
Tasks
Published	2018-11-09
URL	https://arxiv.org/abs/1811.03804v4
PDF	https://arxiv.org/pdf/1811.03804v4.pdf
PWC	https://paperswithcode.com/paper/gradient-descent-finds-global-minima-of-deep
Repo
Framework

Probabilistic Model of Object Detection Based on Convolutional Neural Network


Title	Probabilistic Model of Object Detection Based on Convolutional Neural Network
Authors	Fang-Qi Li, Xu-Die Ren, Hao-Nan Guo
Abstract	The combination of a CNN detector and a search framework forms the basis for local object/pattern detection. To handle the waste of regional information and the defective compromise between efficiency and accuracy, this paper proposes a probabilistic model with a powerful search framework. By mapping an image into a probabilistic distribution of objects, this new model gives more informative outputs with less computation. The setting and analytic traits are elaborated in this paper, followed by a series of experiments carried out on FDDB, which show that the proposed model is sound, efficient and analytic.
Tasks	Object Detection
Published	2018-08-16
URL	http://arxiv.org/abs/1808.08272v1
PDF	http://arxiv.org/pdf/1808.08272v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-model-of-object-detection-based
Repo
Framework

Evaluating Conditional Cash Transfer Policies with Machine Learning Methods


Title	Evaluating Conditional Cash Transfer Policies with Machine Learning Methods
Authors	Tzai-Shuen Chen
Abstract	This paper presents an out-of-sample prediction comparison between major machine learning models and the structural econometric model. Over the past decade, machine learning has established itself as a powerful tool in many prediction applications, but this approach is still not widely adopted in empirical economic studies. To evaluate the benefits of this approach, I use the most common machine learning algorithms, CART, C4.5, LASSO, random forest, and adaboost, to construct prediction models for a cash transfer experiment conducted by the Progresa program in Mexico, and I compare the prediction results with those of a previous structural econometric study. Two prediction tasks are performed in this paper: the out-of-sample forecast and the long-term within-sample simulation. For the out-of-sample forecast, both the mean absolute error and the root mean square error of the school attendance rates found by all machine learning models are smaller than those found by the structural model. Random forest and adaboost have the highest accuracy for the individual outcomes of all subgroups. For the long-term within-sample simulation, the structural model has better performance than do all of the machine learning models. The poor within-sample fitness of the machine learning model results from the inaccuracy of the income and pregnancy prediction models. The result shows that the machine learning model performs better than does the structural model when there are many data to learn; however, when the data are limited, the structural model offers a more sensible prediction. The findings of this paper show promise for adopting machine learning in economic policy analyses in the era of big data.
Tasks
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06401v1
PDF	http://arxiv.org/pdf/1803.06401v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-conditional-cash-transfer-policies
Repo
Framework
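
The two error measures used for the out-of-sample comparison can be sketched as follows. The attendance rates in the example are made-up numbers for illustration only and do not come from the Progresa data or from the paper.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_square_error(actual, predicted):
    """Square root of the average squared error; penalizes large misses more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical school attendance rates for four subgroups.
actual_rates = [0.9, 0.8, 0.7, 0.6]
predicted_rates = [0.9, 0.8, 0.7, 0.2]

example_mae = mean_absolute_error(actual_rates, predicted_rates)
example_rmse = root_mean_square_error(actual_rates, predicted_rates)
```

Here a single large miss yields an MAE of about 0.1 but an RMSE of about 0.2. RMSE is never smaller than MAE, so reporting both, as the paper does, shows whether a model's errors are spread evenly or concentrated in a few subgroups.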

The Graph-based Broad Behavior-Aware Recommendation System for Interactive News


Title	The Graph-based Broad Behavior-Aware Recommendation System for Interactive News
Authors	Mingyuan Ma, Sen Na, Cong Xu, Xin Fan
Abstract	In this paper, we propose a heuristic recommendation system for interactive news, called the graph-based broad behavior-aware network (G-BBAN). Different from most existing work, our network considers six behaviors that may potentially be conducted by users, including unclick, click, like, follow, comment, and share. Further, we introduce the core and coritivity concept from graph theory into the system to measure the concentration degree of interests of each user, which we show can help to improve the performance even further if it’s considered. There are three critical steps in our recommendation system. First, we build a structured user-dependent interaction behavior graph for multi-level and multi-category data as a preprocessing step. This graph constructs the data sources and knowledge information which will be used in G-BBAN through representation learning. Second, for each user node on the graph, we calculate its core and coritivity and then add the pair as a new feature associated with this user. According to the definition of core and coritivity, this user-dependent feature provides useful insights into the concentration degree of his/her interests and affects the trade-off between accuracy and diversity of the personalized recommendation. Last, we represent item (news) information by entity semantics and environment semantics; design a multi-channel convolutional neural network called G-CNN to learn the semantic information and an attention-based LSTM to learn user’s behavior representation; combine with previous concentration feature and input into another two fully connected layers to finish the classification task. The whole network consists of the final G-BBAN. Through comparing with baselines and several variants of itself, our proposed method shows superior performance in extensive experiments.
Tasks	Representation Learning
Published	2018-11-30
URL	http://arxiv.org/abs/1812.00002v1
PDF	http://arxiv.org/pdf/1812.00002v1.pdf
PWC	https://paperswithcode.com/paper/the-graph-based-broad-behavior-aware
Repo
Framework

Deep Multi-Spectral Registration Using Invariant Descriptor Learning


Title	Deep Multi-Spectral Registration Using Invariant Descriptor Learning
Authors	Nati Ofir, Shai Silberstein, Hila Levi, Dani Rozenbaum, Yosi Keller, Sharon Duvdevani Bar
Abstract	In this paper, we introduce a novel deep-learning method to align cross-spectral images. Our approach relies on a learned descriptor which is invariant to different spectra. Multi-modal images of the same scene capture different signals and therefore their registration is challenging and it is not solved by classic approaches. To that end, we developed a feature-based approach that solves the visible (VIS) to Near-Infra-Red (NIR) registration problem. Our algorithm detects corners by Harris and matches them by a patch-metric learned on top of CIFAR-10 network descriptor. As our experiments demonstrate, we achieve a high-quality alignment of cross-spectral images with a sub-pixel accuracy. Compared to other existing methods, our approach is more accurate in the task of VIS to NIR registration.
Tasks
Published	2018-01-16
URL	http://arxiv.org/abs/1801.05171v6
PDF	http://arxiv.org/pdf/1801.05171v6.pdf
PWC	https://paperswithcode.com/paper/deep-multi-spectral-registration-using
Repo
Framework