Paper Group ANR 606
Toward Filament Segmentation Using Deep Neural Networks. Online Learning in Planar Pushing with Combined Prediction Model. NLVR2 Visual Bias Analysis. Convergence to minima for the continuous version of Backtracking Gradient Descent. Local Geometric Indexing of High Resolution Data for Facial Reconstruction from Sparse Markers. AI and Holistic Revi …
Toward Filament Segmentation Using Deep Neural Networks
Title | Toward Filament Segmentation Using Deep Neural Networks |
Authors | Azim Ahmadzadeh, Sushant S. Mahajan, Dustin J. Kempton, Rafal A. Angryk, Shihao Ji |
Abstract | We use a well-known deep neural network framework, called Mask R-CNN, for identification of solar filaments in full-disk H-alpha images from Big Bear Solar Observatory (BBSO). The image data, collected from BBSO’s archive, are integrated with the spatiotemporal metadata of filaments retrieved from the Heliophysics Events Knowledgebase (HEK) system. This integrated data is then treated as the ground-truth in the training process of the model. The available spatial metadata are the output of a currently running filament-detection module developed and maintained by the Feature Finding Team, an international consortium selected by NASA. Despite the known challenges in the identification and characterization of filaments by the existing module, which in turn are inherited into any other module that intends to learn from such outputs, Mask R-CNN shows promising results. Trained and validated on two years’ worth of BBSO data, this model is then tested on the three following years. Our case-by-case and overall analyses show that Mask R-CNN can clearly compete with the existing module and in some cases even perform better. Several cases of false positives and false negatives that are correctly segmented by this model are also shown. The overall advantages of using the proposed model are two-fold: First, deep neural networks’ performance generally improves as more annotated data or better annotations are provided. Second, such a model can be scaled up to detect other solar events, as well as a single multi-purpose module. The results presented in this study introduce a proof of concept of the benefits of employing deep neural networks for detection of solar events, and in particular, filaments. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1912.02743v1 |
https://arxiv.org/pdf/1912.02743v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-filament-segmentation-using-deep |
Repo | |
Framework | |
Online Learning in Planar Pushing with Combined Prediction Model
Title | Online Learning in Planar Pushing with Combined Prediction Model |
Authors | Huidong Gao, Yi Ouyang, Masayoshi Tomizuka |
Abstract | Pushing is a useful robotic capability for positioning and reorienting objects. The ability to accurately predict the effect of pushes can enable efficient trajectory planning and complicated object manipulation. Physical prediction models for planar pushing have long been established, but their assumptions and requirements usually don’t hold in most practical settings. Data-driven approaches can provide accurate predictions for offline data, but they often have generalizability issues. In this paper, we propose a combined prediction model and an online learning framework for planar push prediction. The combined model consists of a neural network module and analytical components with a low-dimensional parameter. We train the neural network offline using pre-collected pushing data. In online situations, the low-dimensional analytical parameter is learned directly from online pushes to quickly adapt to new environments. We test our combined model and learning framework on real pushing experiments. Our experimental results show that our model is able to quickly adapt to new environments while achieving final prediction performance similar to that of pure neural network models. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08181v1 |
https://arxiv.org/pdf/1910.08181v1.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-in-planar-pushing-with |
Repo | |
Framework | |
NLVR2 Visual Bias Analysis
Title | NLVR2 Visual Bias Analysis |
Authors | Alane Suhr, Yoav Artzi |
Abstract | NLVR2 (Suhr et al., 2019) was designed to be robust to language bias through a data collection process that resulted in each natural language sentence appearing with both true and false labels. The process did not provide a similar measure of control for visual bias. This technical report analyzes the potential for visual bias in NLVR2. We show that some amount of visual bias likely exists. Finally, we identify a subset of the test data that allows testing model performance in a way that is robust to such potential biases. We show that the performance of existing models (Li et al., 2019; Tan and Bansal, 2019) is relatively robust to this potential bias. We propose to add the evaluation on this subset of the data to the NLVR2 evaluation protocol, and update the official release to include it. A notebook including an implementation of the code used to replicate this analysis is available at http://nlvr.ai/NLVR2BiasAnalysis.html. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10411v1 |
https://arxiv.org/pdf/1909.10411v1.pdf | |
PWC | https://paperswithcode.com/paper/190910411 |
Repo | |
Framework | |
Convergence to minima for the continuous version of Backtracking Gradient Descent
Title | Convergence to minima for the continuous version of Backtracking Gradient Descent |
Authors | Tuyen Trung Truong |
Abstract | The main result of this paper is: Theorem. Let $f:\mathbb{R}^k\rightarrow \mathbb{R}$ be a $C^{1}$ function, so that $\nabla f$ is locally Lipschitz continuous. Assume moreover that $f$ is $C^2$ near its generalised saddle points. Fix real numbers $\delta_0>0$ and $0<\alpha <1$. Then there is a smooth function $h:\mathbb{R}^k\rightarrow (0,\delta_0]$ so that the map $H:\mathbb{R}^k\rightarrow \mathbb{R}^k$ defined by $H(x)=x-h(x)\nabla f(x)$ has the following properties: (i) For all $x\in \mathbb{R}^k$, we have $f(H(x))-f(x)\leq -\alpha h(x)\lVert\nabla f(x)\rVert^2$. (ii) For every $x_0\in \mathbb{R}^k$, the sequence $x_{n+1}=H(x_n)$ either satisfies $\lim_{n\rightarrow\infty}\lVert x_{n+1}-x_n\rVert=0$ or $\lim_{n\rightarrow\infty}\lVert x_n\rVert=\infty$. Each cluster point of $\{x_n\}$ is a critical point of $f$. If moreover $f$ has at most countably many critical points, then $\{x_n\}$ either converges to a critical point of $f$ or $\lim_{n\rightarrow\infty}\lVert x_n\rVert=\infty$. (iii) There is a set $\mathcal{E}_1\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_1$, the sequence $x_{n+1}=H(x_n)$, if it converges, cannot converge to a generalised saddle point. (iv) There is a set $\mathcal{E}_2\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_2$, any cluster point of the sequence $x_{n+1}=H(x_n)$ is not a saddle point, and more generally cannot be an isolated generalised saddle point. Some other results are proven. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04221v2 |
https://arxiv.org/pdf/1911.04221v2.pdf | |
PWC | https://paperswithcode.com/paper/convergence-to-minima-for-the-continuous |
Repo | |
Framework | |
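The inequality in item (i) of the abstract above is the Armijo sufficient-decrease condition. As a rough illustration only, the following Python sketch implements the standard discrete backtracking line search that enforces this condition at each step. It is not the paper's construction: the paper builds a smooth step-size function $h$, whereas this sketch simply halves a trial step until the condition holds. The function name `backtracking_gd` and the parameters `beta`, `tol`, and `max_iter` are hypothetical choices made for this example.

```python
def backtracking_gd(f, grad, x0, delta0=1.0, alpha=0.5, beta=0.5,
                    tol=1e-8, max_iter=1000):
    """Gradient descent with Armijo backtracking (illustrative sketch).

    f      : callable taking a list of floats, returning a float
    grad   : callable returning the gradient as a list of floats
    delta0 : largest step size tried at each iteration
    alpha  : Armijo constant, 0 < alpha < 1
    beta   : shrink factor for the trial step (an assumption of this sketch)
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g2 = sum(gi * gi for gi in g)          # squared gradient norm
        if g2 ** 0.5 < tol:                    # stop near a critical point
            break
        fx = f(x)
        h = delta0
        # Shrink h until f(x - h*g) - f(x) <= -alpha * h * ||g||^2.
        while h > 1e-16 and \
                f([xi - h * gi for xi, gi in zip(x, g)]) - fx > -alpha * h * g2:
            h *= beta
        x = [xi - h * gi for xi, gi in zip(x, g)]
    return x


# Example: a convex quadratic with minimiser (1, -3).
def quad(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def quad_grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min = backtracking_gd(quad, quad_grad, [0.0, 0.0])
```

On this quadratic the iterates reach the minimiser; for non-convex functions the sketch only guarantees the per-step decrease, and the saddle-avoidance statements in the abstract concern the paper's own construction, not this simplified loop.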
Local Geometric Indexing of High Resolution Data for Facial Reconstruction from Sparse Markers
Title | Local Geometric Indexing of High Resolution Data for Facial Reconstruction from Sparse Markers |
Authors | Matthew Cong, Lana Lan, Ronald Fedkiw |
Abstract | When considering sparse motion capture marker data, one typically struggles to balance its overfitting via a high dimensional blendshape system versus underfitting caused by smoothness constraints. With the current trend towards using more and more data, our aim is not to fit the motion capture markers with a parameterized (blendshape) model or to smoothly interpolate a surface through the marker positions, but rather to find an instance in the high resolution dataset that contains local geometry to fit each marker. Just as is true for typical machine learning applications, this approach benefits from a plethora of data, and thus we also consider augmenting the dataset via specially designed physical simulations that target the high resolution dataset such that the simulation output lies on the same so-called manifold as the data targeted. |
Tasks | Motion Capture, Physical Simulations |
Published | 2019-03-01 |
URL | https://arxiv.org/abs/1903.00119v2 |
https://arxiv.org/pdf/1903.00119v2.pdf | |
PWC | https://paperswithcode.com/paper/local-geometric-indexing-of-high-resolution |
Repo | |
Framework | |
AI and Holistic Review: Informing Human Reading in College Admissions
Title | AI and Holistic Review: Informing Human Reading in College Admissions |
Authors | AJ Alvero, Noah Arthurs, anthony lising antonio, Benjamin W. Domingue, Ben Gebre-Medhin, Sonia Giebel, Mitchell L. Stevens |
Abstract | College admissions in the United States is carried out by a human-centered method of evaluation known as holistic review, which typically involves reading original narrative essays submitted by each applicant. The legitimacy and fairness of holistic review, which gives human readers significant discretion over determining each applicant’s fitness for admission, has been repeatedly challenged in courtrooms and the public sphere. Using a unique corpus of 283,676 application essays submitted to a large, selective, state university system between 2015 and 2016, we assess the extent to which applicant demographic characteristics can be inferred from application essays. We find a relatively interpretable classifier (logistic regression) was able to predict gender and household income with high levels of accuracy. Findings suggest that data auditing might be useful in informing holistic review, and perhaps other evaluative systems, by checking potential bias in human or computational readings. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.09318v1 |
https://arxiv.org/pdf/1912.09318v1.pdf | |
PWC | https://paperswithcode.com/paper/ai-and-holistic-review-informing-human |
Repo | |
Framework | |
Early Predictions for Medical Crowdfunding: A Deep Learning Approach Using Diverse Inputs
Title | Early Predictions for Medical Crowdfunding: A Deep Learning Approach Using Diverse Inputs |
Authors | Tong Wang, Fujie Jin, Yu Hu, Yuan Cheng |
Abstract | Medical crowdfunding is a popular channel for people needing financial help paying medical bills to collect donations from large numbers of people. However, large heterogeneity exists in donations across cases, and fundraisers face significant uncertainty in whether their crowdfunding campaigns can meet fundraising goals. Therefore, it is important to provide early warnings for fundraisers if such a channel will eventually fail. In this study, we aim to develop novel algorithms to provide accurate and timely predictions of fundraising performance, to better inform fundraisers. In particular, we propose a new approach to combine time-series features and time-invariant features in the deep learning model, to process diverse sources of input data. Compared with baseline models, our model achieves better accuracy and requires a shorter observation window of the time-varying features from the campaign launch to provide robust predictions with high confidence. To extract interpretable insights, we further conduct a multivariate time-series clustering analysis and identify four typical temporal donation patterns. This demonstrates the heterogeneity in the features and how they relate to the fundraising outcome. The prediction model and the interpretable insights can be applied to assist fundraisers with better promoting their fundraising campaigns and can potentially help crowdfunding platforms to provide more timely feedback to all fundraisers. Our proposed framework is also generalizable to other fields where diverse structured and unstructured data are valuable for predictions. |
Tasks | Time Series, Time Series Clustering |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.05702v1 |
https://arxiv.org/pdf/1911.05702v1.pdf | |
PWC | https://paperswithcode.com/paper/early-predictions-for-medical-crowdfunding-a |
Repo | |
Framework | |
Fusion vectors: Embedding Graph Fusions for Efficient Unsupervised Rank Aggregation
Title | Fusion vectors: Embedding Graph Fusions for Efficient Unsupervised Rank Aggregation |
Authors | Icaro Cavalcante Dourado, Ricardo da Silva Torres |
Abstract | The vast increase in the amount and complexity of digital content has led to a wide interest in ad-hoc retrieval systems in recent years. Complementarily, the existence of heterogeneous data sources and retrieval models has stimulated the proliferation of increasingly ingenious and effective rank aggregation functions. Although recently proposed rank aggregation functions are promising with respect to effectiveness, existing proposals in the area usually overlook efficiency aspects. We propose an innovative rank aggregation function that is unsupervised, intrinsically multimodal, and targeted for fast retrieval and top effectiveness performance. We introduce the concepts of embedding and indexing of graph-based rank-aggregation representation models, and their application for search tasks. Embedding formulations are also proposed for graph-based rank representations. We introduce the concept of fusion vectors, a late-fusion representation of objects based on ranks, from which an intrinsically rank-aggregation retrieval model is defined. Next, we present an approach for fast retrieval based on fusion vectors, thus promoting an efficient rank aggregation system. Our method presents top effectiveness performance among state-of-the-art related work, while bringing novel aspects of multimodality and effectiveness. Consistent speedups are achieved against the recent baselines in all datasets considered. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06011v2 |
https://arxiv.org/pdf/1906.06011v2.pdf | |
PWC | https://paperswithcode.com/paper/fusion-vectors-embedding-graph-fusions-for |
Repo | |
Framework | |
Hybrid Mortality Prediction using Multiple Source Systems
Title | Hybrid Mortality Prediction using Multiple Source Systems |
Authors | Isaac Mativo, Yelena Yesha, Michael Grasso, Tim Oates, Qian Zhu |
Abstract | The use of artificial intelligence in clinical care to improve decision support systems is increasing. This is not surprising since, by its very nature, the practice of medicine consists of making decisions based on observations from different systems both inside and outside the human body. In this paper, we combine three general systems (ICU, diabetes, and comorbidities) and use them to make patient clinical predictions. We use an artificial intelligence approach to show that we can improve mortality prediction of hospitalized diabetic patients. We do this by utilizing a machine learning approach to select clinical input features that are more likely to predict mortality. We then use these features to create a hybrid mortality prediction model and compare our results to non-artificial intelligence models. For simplicity, we limit our input features to patient comorbidities and features derived from a well-known mortality measure, the Sequential Organ Failure Assessment (SOFA). |
Tasks | Mortality Prediction |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1905.00752v1 |
http://arxiv.org/pdf/1905.00752v1.pdf | |
PWC | https://paperswithcode.com/paper/190500752 |
Repo | |
Framework | |
Neocortical plasticity: an unsupervised cake but no free lunch
Title | Neocortical plasticity: an unsupervised cake but no free lunch |
Authors | Eilif B. Muller, Philippe Beaudoin |
Abstract | The fields of artificial intelligence and neuroscience have a long history of fertile bi-directional interactions. On the one hand, important inspiration for the development of artificial intelligence systems has come from the study of natural systems of intelligence, the mammalian neocortex in particular. On the other, important inspiration for models and theories of the brain has emerged from artificial intelligence research. A central question at the intersection of these two areas is concerned with the processes by which neocortex learns, and the extent to which they are analogous to the back-propagation training algorithm of deep networks. Matching the data efficiency, transfer and generalization properties of neocortical learning remains an area of active research in the field of deep learning. Recent advances in our understanding of neuronal, synaptic and dendritic physiology of the neocortex suggest new approaches for unsupervised representation learning, perhaps through a new class of objective functions, which could act alongside or in lieu of back-propagation. Such local learning rules have implicit rather than explicit objectives with respect to the training data, facilitating domain adaptation and generalization. Incorporating them into deep networks for representation learning could better leverage unlabelled datasets to offer significant improvements in data efficiency of downstream supervised readout learning, and reduce susceptibility to adversarial perturbations, at the cost of a more restricted domain of applicability. |
Tasks | Domain Adaptation, Representation Learning, Unsupervised Representation Learning |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.08584v1 |
https://arxiv.org/pdf/1911.08584v1.pdf | |
PWC | https://paperswithcode.com/paper/neocortical-plasticity-an-unsupervised-cake |
Repo | |
Framework | |
Construction of efficient detectors for character information recognition
Title | Construction of efficient detectors for character information recognition |
Authors | A. A. Telnykh, I. V. Nuidel, Yu. R. Samorodova |
Abstract | We have developed and tested in numerical experiments a universal approach to searching for objects of a given type in captured video images (for example, people’s faces, vehicles, special characters, numbers and letters, etc.). The novelty and versatility of this approach consist in a unique combination of well-known methods ranging from creating detectors to making decisions independent of the type of recognition objects. The efficiencies of various types of basic features used for image coding, including the Haar features, the LBP features, and the modified Census transformation, are compared. A combination of the modified methods is used for constructing 11 types of detectors of the number of railway carriages and for recognizing digits from zero to nine. The efficiency of the constructed detectors is studied. |
Tasks | |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04634v1 |
https://arxiv.org/pdf/1908.04634v1.pdf | |
PWC | https://paperswithcode.com/paper/construction-of-efficient-detectors-for |
Repo | |
Framework | |
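The abstract above compares several basic image features, among them LBP (local binary pattern) features. As a minimal sketch of what a basic 3x3 LBP code looks like, assuming a grayscale image stored as a list of rows, the Python below thresholds the eight neighbours of a pixel against the centre value. The bit ordering (clockwise from the top-left neighbour) and the function names `lbp_code` and `lbp_image` are assumptions of this example, not details taken from the paper, which uses its own modified variants.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch (list of 3 rows of 3 values).

    A neighbour contributes a 1 bit when it is >= the centre pixel.
    Bit order is clockwise starting at the top-left neighbour; this
    ordering is a convention chosen for the sketch.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """Compute LBP codes for every interior pixel of a 2-D grayscale image."""
    height, width = len(img), len(img[0])
    return [
        [lbp_code([row[c - 1:c + 2] for row in img[r - 1:r + 2]])
         for c in range(1, width - 1)]
        for r in range(1, height - 1)
    ]


example = [[5, 9, 1],
           [4, 6, 7],
           [2, 3, 8]]
codes = lbp_image(example)   # one interior pixel, so a 1x1 result
```

A detector would typically histogram such codes over image cells and feed the histograms to a classifier; that stage is omitted here.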
Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games
Title | Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games |
Authors | Jiequn Han, Ruimeng Hu |
Abstract | We propose a deep neural network-based algorithm to identify the Markovian Nash equilibrium of general large $N$-player stochastic differential games. Following the idea of fictitious play, we recast the $N$-player game into $N$ decoupled decision problems (one for each player) and solve them iteratively. The individual decision problem is characterized by a semilinear Hamilton-Jacobi-Bellman equation, to solve which we employ the recently developed deep BSDE method. The resulting algorithm can solve large $N$-player games for which conventional numerical methods would suffer from the curse of dimensionality. Multiple numerical examples involving identical or heterogeneous agents, with risk-neutral or risk-sensitive objectives, are tested to validate the accuracy of the proposed algorithm in large group games. Even for a fifty-player game with the presence of common noise, the proposed algorithm still finds the approximate Nash equilibrium accurately, which, to the best of our knowledge, is difficult to achieve by other numerical algorithms. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01809v1 |
https://arxiv.org/pdf/1912.01809v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-fictitious-play-for-finding-markovian |
Repo | |
Framework | |
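The abstract above builds on the idea of fictitious play: each player repeatedly best-responds to the other players' past behaviour. The sketch below shows only the classical form of that idea, for a two-player zero-sum matrix game, in plain Python. It is a deliberately simpler setting than the paper's, with no stochastic differential game, no Hamilton-Jacobi-Bellman equation, and no deep BSDE solver. The function name `fictitious_play`, the initial counts, and the round count are hypothetical choices.

```python
def fictitious_play(payoff, rounds=5000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative. Each round, both
    players best-respond to the opponent's empirical action counts.
    Returns the empirical action frequencies of both players.
    """
    n, m = len(payoff), len(payoff[0])
    row_counts = [0] * n
    col_counts = [0] * m
    row_counts[0] = 1          # arbitrary initial beliefs (an assumption)
    col_counts[0] = 1
    for _ in range(rounds):
        # Expected payoff of each row action against column's history.
        row_vals = [sum(payoff[i][j] * col_counts[j] for j in range(m))
                    for i in range(n)]
        # Row player's expected payoff for each column action; the
        # column player wants this as small as possible.
        col_vals = [sum(payoff[i][j] * row_counts[i] for i in range(n))
                    for j in range(m)]
        best_row = max(range(n), key=lambda i: row_vals[i])
        best_col = min(range(m), key=lambda j: col_vals[j])
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


# Matching pennies: the unique Nash equilibrium mixes 50/50 for both players.
pennies = [[1, -1],
           [-1, 1]]
row_freq, col_freq = fictitious_play(pennies)
```

In this toy game the empirical frequencies drift toward the mixed equilibrium; the paper's contribution lies in replacing the per-player best-response step with a neural-network solver for a high-dimensional control problem.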
Multi-stage domain adversarial style reconstruction for cytopathological image stain normalization
Title | Multi-stage domain adversarial style reconstruction for cytopathological image stain normalization |
Authors | Xihao Chen, Jingya Yu, Li Chen, Shaoqun Zeng, Xiuli Liu, Shenghua Cheng |
Abstract | The different stain styles of cytopathological images have a negative effect on the generalization ability of automated image analysis algorithms. This article proposes a new framework that normalizes the stain style for cytopathological images through a stain removal module and a multi-stage domain adversarial style reconstruction module. We convert colorful images into grayscale images with a color-encoding mask. Using the mask, reconstructed images retain their basic color without red and blue mixing, which is important for cytopathological image interpretation. The style reconstruction module consists of per-pixel regression with intra-domain adversarial learning, inter-domain adversarial learning, and optional task-based refining. Per-pixel regression with intra-domain adversarial learning establishes the generative network from the decolorized input to the reconstructed output. The inter-domain adversarial learning further reduces the difference in stain style. The generation network can be optimized by combining it with the task network. Experimental results show that the proposed techniques help to optimize the generation network. The average accuracy increases from 75.41% to 84.79% after the intra-domain adversarial learning, and to 87.00% after inter-domain adversarial learning. Under the guidance of the task network, the average accuracy rate reaches 89.58%. The proposed method achieves unsupervised stain normalization of cytopathological images, while preserving the cell structure, texture structure, and cell color properties of the image. This method overcomes the problem of generalizing the task models between different stain styles of cytopathological images. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05184v1 |
https://arxiv.org/pdf/1909.05184v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-stage-domain-adversarial-style |
Repo | |
Framework | |
Error bounds for deep ReLU networks using the Kolmogorov–Arnold superposition theorem
Title | Error bounds for deep ReLU networks using the Kolmogorov–Arnold superposition theorem |
Authors | Hadrien Montanelli, Haizhao Yang |
Abstract | We prove a theorem concerning the approximation of multivariate continuous functions by deep ReLU networks, for which the curse of dimensionality is lessened. Our theorem is based on the Kolmogorov–Arnold superposition theorem, and on the approximation of the inner and outer functions that appear in the superposition by very deep ReLU networks. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11945v1 |
https://arxiv.org/pdf/1906.11945v1.pdf | |
PWC | https://paperswithcode.com/paper/error-bounds-for-deep-relu-networks-using-the |
Repo | |
Framework | |
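For readers unfamiliar with the "inner and outer functions" mentioned in the abstract above, the classical Kolmogorov–Arnold superposition theorem can be stated as follows. This is the textbook form of the theorem; the paper may work with a variant, and the symbols $\Phi_q$ and $\phi_{q,p}$ are notation chosen here.

```latex
% Classical Kolmogorov--Arnold superposition theorem (textbook form).
% For every continuous f : [0,1]^n -> R there exist continuous univariate
% outer functions \Phi_q and inner functions \phi_{q,p} such that
\[
  f(x_1,\dots,x_n)
  \;=\;
  \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
  \qquad
  \Phi_q : \mathbb{R} \to \mathbb{R},
  \quad
  \phi_{q,p} : [0,1] \to \mathbb{R}.
\]
% The inner functions \phi_{q,p} can be chosen independently of f;
% only the outer functions \Phi_q depend on f.
```

The representation reduces a function of $n$ variables to sums and compositions of univariate functions, which is why approximating the inner and outer functions by ReLU networks yields an approximation of $f$ itself.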
A Study of Annotation and Alignment Accuracy for Performance Comparison in Complex Orchestral Music
Title | A Study of Annotation and Alignment Accuracy for Performance Comparison in Complex Orchestral Music |
Authors | Thassilo Gadermaier, Gerhard Widmer |
Abstract | Quantitative analysis of commonalities and differences between recorded music performances is an increasingly common task in computational musicology. A typical scenario involves manual annotation of different recordings of the same piece along the time dimension, for comparative analysis of, e.g., the musical tempo, or for mapping other performance-related information between performances. This can be done by manually annotating one reference performance, and then automatically synchronizing other performances, using audio-to-audio alignment algorithms. In this paper we address several questions related to those tasks. First, we analyze different annotations of the same musical piece, quantifying timing deviations between the respective human annotators. A statistical evaluation of the marker time stamps will provide (a) an estimate of the expected timing precision of human annotations and (b) a ground truth for subsequent automatic alignment experiments. We then carry out a systematic evaluation of different audio features for audio-to-audio alignment, quantifying the degree of alignment accuracy that can be achieved, and relate this to the results from the annotation study. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07394v1 |
https://arxiv.org/pdf/1910.07394v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-of-annotation-and-alignment-accuracy |
Repo | |
Framework | |
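The abstract above refers to audio-to-audio alignment algorithms without naming one. Dynamic time warping (DTW) is a common choice for this kind of task, so the following Python sketch shows a minimal DTW over two one-dimensional feature sequences. Using DTW here is an assumption of this example, as are the function name `dtw_path` and the absolute-difference local cost; real systems align multi-dimensional audio features such as chroma vectors.

```python
def dtw_path(a, b):
    """Align two 1-D sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) meaning a[i] is matched to b[j]. The local cost is the
    absolute difference, chosen for simplicity in this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],   # match both
                                 cost[i - 1][j],       # advance a only
                                 cost[i][j - 1])       # advance b only
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# The second sequence holds the middle value one frame longer.
total, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
```

Given annotated marker times in a reference performance, the recovered path can be used to transfer those markers to another recording, which is the kind of mapping whose accuracy the paper evaluates against human annotations.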