Paper Group ANR 66
Dual Neural Network Architecture for Determining Epistemic and Aleatoric Uncertainties. Solving Traveltime Tomography with Deep Learning. An Efficient Hardware-Oriented Dropout Algorithm. Embedded Bayesian Network Classifiers. Human-Machine Collaboration for Fast Land Cover Mapping. Age Progression and Regression with Spatial Attention Modules. Gro …
Dual Neural Network Architecture for Determining Epistemic and Aleatoric Uncertainties
Title | Dual Neural Network Architecture for Determining Epistemic and Aleatoric Uncertainties |
Authors | Augustin Prado, Ravinath Kausik, Lalitha Venkataramanan |
Abstract | Deep learning techniques have been shown to be extremely effective for various classification and regression problems, but quantifying the uncertainty of their predictions and separating it into epistemic and aleatoric fractions is still considered challenging. In oil and gas exploration projects, tools consisting of seismic, sonic, magnetic resonance, resistivity, dielectric and/or nuclear sensors are sent downhole through boreholes to probe the earth’s rock and fluid properties. The measurements from these tools are used to build reservoir models that are subsequently used for estimation and optimization of hydrocarbon production. Machine learning algorithms are often used to estimate the rock and fluid properties from the measured downhole data. Quantifying uncertainties of these properties is crucial for rock and fluid evaluation and subsequent reservoir optimization and production decisions. These machine learning algorithms are often trained on a “ground-truth” or core database. During the inference phase, which involves application of these algorithms to field data, it is critical that the machine learning algorithm flag data from new geologies that the model was not trained upon as out of distribution. It is also highly important to be sensitive to heteroscedastic aleatoric noise in the feature space arising from the combination of tool and geological conditions. Understanding the sources of the uncertainty and reducing them is key to designing intelligent tools and applications such as automated log interpretation answer products for exploration and field development. In this paper we describe a methodology consisting of a system of dual networks, comprising a combination of a Bayesian Neural Network (BNN) and an Artificial Neural Network (ANN), addressing this challenge for geophysical applications. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.06153v1 |
https://arxiv.org/pdf/1910.06153v1.pdf | |
PWC | https://paperswithcode.com/paper/dual-neural-network-architecture-for |
Repo | |
Framework | |
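
The split between epistemic and aleatoric uncertainty mentioned in the abstract is often illustrated with the law of total variance. The sketch below is not the authors' dual-network method. It assumes a hypothetical set of stochastic forward passes (for example, samples from a BNN), each returning a predicted mean and a predicted noise variance for the same input; the function name and the input values are made up for illustration.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance via the law of total variance.

    means, variances: predicted mean and noise variance from each
    stochastic forward pass for one input (hypothetical values).
    """
    aleatoric = fmean(variances)   # average predicted data-noise variance
    epistemic = pvariance(means)   # spread of the predicted means
    return epistemic, aleatoric, epistemic + aleatoric


epistemic, aleatoric, total = decompose_uncertainty(
    means=[1.0, 1.2, 0.8, 1.0],
    variances=[0.10, 0.12, 0.08, 0.10],
)
```

Under this reading, a large epistemic term relative to the training data would suggest an out-of-distribution input, while a large aleatoric term would point to noisy measurements.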
Solving Traveltime Tomography with Deep Learning
Title | Solving Traveltime Tomography with Deep Learning |
Authors | Yuwei Fan, Lexing Ying |
Abstract | This paper introduces a neural network approach for solving two-dimensional traveltime tomography (TT) problems based on the eikonal equation. The mathematical problem of TT is to recover the slowness field of a medium based on the boundary measurement of the traveltimes of waves going through the medium. This inverse map is high-dimensional and nonlinear. For the circular tomography geometry, a perturbative analysis shows that the forward map can be approximated by a vectorized convolution operator in the angular direction. Motivated by this and filtered back-projection, we propose an effective neural network architecture for the inverse map using the recently proposed BCR-Net, with weights learned from training datasets. Numerical results demonstrate the efficiency of the proposed neural networks. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11636v1 |
https://arxiv.org/pdf/1911.11636v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-traveltime-tomography-with-deep |
Repo | |
Framework | |
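
For reference, the eikonal equation underlying TT can be written as follows. The symbols are assumptions chosen here for illustration ($T$ for the traveltime from a source at $x_s$, $m$ for the slowness field, $\Omega$ for the domain) and may differ from the paper's notation:

$$
|\nabla T(x)| = m(x), \quad x \in \Omega, \qquad T(x_s) = 0.
$$

The forward map takes $m$ to the traveltimes $T$ recorded on the boundary $\partial\Omega$; the inverse problem described in the abstract recovers $m$ from those boundary measurements.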
An Efficient Hardware-Oriented Dropout Algorithm
Title | An Efficient Hardware-Oriented Dropout Algorithm |
Authors | Yoeng Jye Yeoh, Takashi Morie, Hakaru Tamukoh |
Abstract | This paper proposes a hardware-oriented dropout algorithm, which is efficient for field programmable gate array (FPGA) implementation. In deep neural networks (DNNs), overfitting occurs when networks are overtrained and adapt too well to training data. Consequently, they fail in predicting unseen data used as test data. Dropout is a common technique that is often applied in DNNs to overcome this problem. In general, implementing such training algorithms of DNNs in embedded systems is difficult due to power and memory constraints. Training DNNs is power-, time-, and memory-intensive; however, embedded systems require low power consumption and real-time processing. An FPGA is suitable for embedded systems for its parallel processing characteristic and low operating power; however, due to its limited memory and different architecture, it is difficult to apply general neural network algorithms. Therefore, we propose a hardware-oriented dropout algorithm that can effectively utilize the characteristics of an FPGA with less memory required. Software program verification demonstrates that the performance of the proposed method is identical to that of conventional dropout, and hardware synthesis demonstrates that it results in significant resource reduction. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.05941v1 |
https://arxiv.org/pdf/1911.05941v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-hardware-oriented-dropout |
Repo | |
Framework | |
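
For context, the conventional dropout that the abstract uses as its baseline can be sketched as below. This is the standard "inverted dropout" formulation, not the hardware-oriented variant proposed in the paper; the function name and parameters are illustrative assumptions.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Conventional (inverted) dropout on a list of activations."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # Keep each unit with probability keep_prob and rescale the survivors
    # so that the expected activation is unchanged; zero the rest.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # seeded generator for reproducibility
out = dropout([1.0] * 8, drop_prob=0.5, rng=rng)
```

Note that this version draws one random number per unit, which is the kind of per-unit cost a hardware-oriented design would presumably try to reduce.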
Embedded Bayesian Network Classifiers
Title | Embedded Bayesian Network Classifiers |
Authors | David Heckerman, Chris Meek |
Abstract | Low-dimensional probability models for local distribution functions in a Bayesian network include decision trees, decision graphs, and causal independence models. We describe a new probability model for discrete Bayesian networks, which we call an embedded Bayesian network classifier or EBNC. The model for a node $Y$ given parents $\bf X$ is obtained from a (usually different) Bayesian network for $Y$ and $\bf X$ in which $\bf X$ need not be the parents of $Y$. We show that an EBNC is a special case of a softmax polynomial regression model. Also, we show how to identify a non-redundant set of parameters for an EBNC, and describe an asymptotic approximation for learning the structure of Bayesian networks that contain EBNCs. Unlike the decision tree, decision graph, and causal independence models, we are unaware of a semantic justification for the use of these models. Experiments are needed to determine whether the models presented in this paper are useful in practice. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09715v1 |
https://arxiv.org/pdf/1910.09715v1.pdf | |
PWC | https://paperswithcode.com/paper/embedded-bayesian-network-classifiers |
Repo | |
Framework | |
Human-Machine Collaboration for Fast Land Cover Mapping
Title | Human-Machine Collaboration for Fast Land Cover Mapping |
Authors | Caleb Robinson, Anthony Ortiz, Kolya Malkin, Blake Elias, Andi Peng, Dan Morris, Bistra Dilkina, Nebojsa Jojic |
Abstract | We propose incorporating human labelers in a model fine-tuning system that provides immediate user feedback. In our framework, human labelers can interactively query model predictions on unlabeled data, choose which data to label, and see the resulting effect on the model’s predictions. This bi-directional feedback loop allows humans to learn how the model responds to new data. Our hypothesis is that this rich feedback allows human labelers to create mental models that enable them to better choose which biases to introduce to the model. We compare human-selected points to points selected using standard active learning methods. We further investigate how the fine-tuning methodology impacts the human labelers’ performance. We implement this framework for fine-tuning high-resolution land cover segmentation models. Specifically, we fine-tune a deep neural network – trained to segment high-resolution aerial imagery into different land cover classes in Maryland, USA – to a new spatial area in New York, USA. The tight loop turns the algorithm and the human operator into a hybrid system that can produce land cover maps of a large area much more efficiently than the traditional workflows. Our framework has applications in geospatial machine learning settings where there is a practically limitless supply of unlabeled data, of which only a small fraction can feasibly be labeled through human efforts. |
Tasks | Active Learning |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04176v3 |
https://arxiv.org/pdf/1906.04176v3.pdf | |
PWC | https://paperswithcode.com/paper/human-machine-collaboration-for-fast-land |
Repo | |
Framework | |
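
The abstract compares human-selected points with points chosen by standard active learning methods. One common baseline of that kind is entropy-based uncertainty sampling, sketched below; whether the paper uses this exact criterion is an assumption, and the function names and probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(predictions, k):
    """Return the indices of the k predictions with the highest entropy."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]),
                    reverse=True)
    return ranked[:k]


# Hypothetical per-pixel class probabilities from a segmentation model.
preds = [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33], [0.70, 0.20, 0.10]]
picked = select_most_uncertain(preds, k=2)
```

Here the near-uniform prediction is ranked first, so it would be the first point sent to a labeler.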
Age Progression and Regression with Spatial Attention Modules
Title | Age Progression and Regression with Spatial Attention Modules |
Authors | Qi Li, Yunfan Liu, Zhenan Sun |
Abstract | Age progression and regression refers to aesthetically rendering a given face image to present effects of face aging and rejuvenation, respectively. Although numerous studies have been conducted on this topic, there are two major problems: 1) multiple models are usually trained to simulate different age mappings, and 2) the photo-realism of generated face images is heavily influenced by the variation of training images in terms of pose, illumination, and background. To address these issues, in this paper, we propose a framework based on conditional Generative Adversarial Networks (cGANs) to achieve age progression and regression simultaneously. Particularly, since face aging and rejuvenation are largely different in terms of image translation patterns, we model these two processes using two separate generators, each dedicated to one age changing process. In addition, we exploit spatial attention mechanisms to limit image modifications to regions closely related to age changes, so that images with high visual fidelity could be synthesized for in-the-wild cases. Experiments on multiple datasets demonstrate the ability of our model in synthesizing lifelike face images at desired ages with personalized features well preserved, and keeping age-irrelevant regions unchanged. |
Tasks | |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02133v2 |
https://arxiv.org/pdf/1903.02133v2.pdf | |
PWC | https://paperswithcode.com/paper/age-progression-and-regression-with-spatial |
Repo | |
Framework | |
Grounding learning of modifier dynamics: An application to color naming
Title | Grounding learning of modifier dynamics: An application to color naming |
Authors | Xudong Han, Philip Schulz, Trevor Cohn |
Abstract | Grounding is crucial for natural language understanding. An important subtask is to understand modified color expressions, such as ‘dirty blue’. We present a model of color modifiers that, compared with previous additive models in RGB space, learns more complex transformations. In addition, we present a model that operates in the HSV color space. We show that certain adjectives are better modeled in that space. To account for all modifiers, we train a hard ensemble model that selects a color space depending on the modifier color pair. Experimental results show significant and consistent improvements compared to the state-of-the-art baseline model. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07586v1 |
https://arxiv.org/pdf/1909.07586v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-learning-of-modifier-dynamics-an |
Repo | |
Framework | |
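
To make the difference between the two color spaces concrete, the sketch below applies a modifier once as an additive offset in RGB and once as a scaling of saturation and value in HSV. It is an illustration only: the offsets and scale factors are hand-picked assumptions rather than learned parameters, and the paper's models learn more complex transformations than these.

```python
import colorsys


def clip01(x):
    """Clamp a channel value to the [0, 1] range."""
    return min(1.0, max(0.0, x))


def modify_rgb_additive(rgb, offset):
    """Additive modifier in RGB space: shift each channel, then clamp."""
    return tuple(clip01(c + d) for c, d in zip(rgb, offset))


def modify_hsv(rgb, sat_scale, val_scale):
    """Modifier in HSV space: keep the hue, rescale saturation and value."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, clip01(s * sat_scale), clip01(v * val_scale))


blue = (0.0, 0.0, 1.0)
# Hypothetical parameters standing in for a learned 'dirty' modifier.
dirty_blue_rgb = modify_rgb_additive(blue, (0.2, 0.2, -0.3))
dirty_blue_hsv = modify_hsv(blue, sat_scale=0.6, val_scale=0.7)
```

The HSV version changes how vivid and how bright the color is without touching its hue, which is one plausible reason some adjectives are easier to model in that space.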
Evaluating Discourse in Structured Text Representations
Title | Evaluating Discourse in Structured Text Representations |
Authors | Elisa Ferracane, Greg Durrett, Junyi Jessy Li, Katrin Erk |
Abstract | Discourse structure is integral to understanding a text and is helpful in many NLP tasks. Learning latent representations of discourse is an attractive alternative to acquiring expensive labeled discourse data. Liu and Lapata (2018) propose a structured attention mechanism for text classification that derives a tree over a text, akin to an RST discourse tree. We examine this model in detail, and evaluate on additional discourse-relevant tasks and datasets, in order to assess whether the structured attention improves performance on the end task and whether it captures a text’s discourse structure. We find the learned latent trees have little to no structure and instead focus on lexical cues; even after obtaining more structured trees with proposed model modifications, the trees are still far from capturing discourse structure when compared to discourse dependency trees from an existing discourse parser. Finally, ablation studies show the structured attention provides little benefit, sometimes even hurting performance. |
Tasks | Text Classification |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01472v2 |
https://arxiv.org/pdf/1906.01472v2.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-discourse-in-structured-text |
Repo | |
Framework | |
Predictive Precompute with Recurrent Neural Networks
Title | Predictive Precompute with Recurrent Neural Networks |
Authors | Hanson Wang, Zehui Wang, Yuanyuan Ma |
Abstract | In both mobile and web applications, speeding up user interface response times can often lead to significant improvements in user engagement. A common technique to improve responsiveness is to precompute data ahead of time for specific activities. However, simply precomputing data for all user and activity combinations is prohibitive at scale due to both network constraints and server-side computational costs. It is therefore important to accurately predict per-user application usage in order to minimize wasted precomputation (“predictive precompute”). In this paper, we describe the novel application of recurrent neural networks (RNNs) for predictive precompute. We compare their performance with traditional machine learning models, and share findings from their large-scale production use at Facebook. We demonstrate that RNN models improve prediction accuracy, eliminate most feature engineering steps, and reduce the computational cost of serving predictions by an order of magnitude. |
Tasks | Feature Engineering |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06779v2 |
https://arxiv.org/pdf/1912.06779v2.pdf | |
PWC | https://paperswithcode.com/paper/predictive-precompute-with-recurrent-neural |
Repo | |
Framework | |
Online Learned Continual Compression with Adaptive Quantization Modules
Title | Online Learned Continual Compression with Adaptive Quantization Modules |
Authors | Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Joelle Pineau |
Abstract | We introduce and study the problem of Online Continual Compression, where one attempts to simultaneously learn to compress and store a representative dataset from a non-i.i.d. data stream, while only observing each sample once. A naive application of auto-encoders in this setting encounters a major challenge: representations derived from earlier encoder states must be usable by later decoder states. We show how to use discrete auto-encoders to effectively address this challenge and introduce Adaptive Quantization Modules (AQM) to control variation in the compression ability of the module at any given stage of learning. This enables selecting an appropriate compression for incoming samples, while taking into account overall memory constraints and current progress of the learned compression. Unlike previous methods, our approach does not require any pretraining, even on challenging datasets. We show that using AQM to replace standard episodic memory in continual learning settings leads to significant gains on continual learning benchmarks. Furthermore, we demonstrate this approach with larger images, LiDAR, and reinforcement learning agents. |
Tasks | Continual Learning, Quantization |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08019v2 |
https://arxiv.org/pdf/1911.08019v2.pdf | |
PWC | https://paperswithcode.com/paper/online-learned-continual-compression-with-1 |
Repo | |
Framework | |
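
Discrete auto-encoders are commonly built on a vector-quantization step that maps each encoder output to its nearest codebook entry, so that only an integer index needs to be stored. The sketch below shows that lookup in isolation. It assumes a nearest-neighbour codebook under squared Euclidean distance, which may not match the AQM design in detail; the codebook values are hypothetical.

```python
def quantize(vector, codebook):
    """Return (index, codeword) of the nearest codebook entry (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)),
                key=lambda i: sq_dist(vector, codebook[i]))
    return index, codebook[index]


# Hypothetical 2-D codebook and encoder output.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
index, codeword = quantize((0.9, 0.8), codebook)
```

Storing `index` instead of the vector is what yields compression; a later decoder can still read the stored sample as long as it can interpret the same codebook indices.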
The Spatially-Conscious Machine Learning Model
Title | The Spatially-Conscious Machine Learning Model |
Authors | Timothy J. Kiely, Nathaniel D. Bastian |
Abstract | Successfully predicting gentrification could have many social and commercial applications; however, real estate sales are difficult to predict because they belong to a chaotic system comprised of intrinsic and extrinsic characteristics, perceived value, and market speculation. Using New York City real estate as our subject, we combine modern techniques of data science and machine learning with traditional spatial analysis to create robust real estate prediction models for both classification and regression tasks. We compare several cutting edge machine learning algorithms across spatial, semi-spatial and non-spatial feature engineering techniques, and we empirically show that spatially-conscious machine learning models outperform non-spatial models when married with advanced prediction techniques such as feed-forward artificial neural networks and gradient boosting machine models. |
Tasks | Feature Engineering |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00562v1 |
http://arxiv.org/pdf/1902.00562v1.pdf | |
PWC | https://paperswithcode.com/paper/the-spatially-conscious-machine-learning |
Repo | |
Framework | |
Cosine similarity-based adversarial process
Title | Cosine similarity-based adversarial process |
Authors | Hee-Soo Heo, Jee-weon Jung, Hye-jin Shim, IL-Ho Yang, Ha-Jin Yu |
Abstract | An adversarial process between two deep neural networks is a promising approach to train a robust model. In this paper, we propose an adversarial process using cosine similarity, whereas conventional adversarial processes are based on inverted categorical cross entropy (CCE). When used for training an identification model, the adversarial process induces the competition of two discriminative models: one for a primary task such as speaker identification or image recognition, the other for a subsidiary task such as channel identification or domain identification. In particular, the adversarial process degrades the performance of the subsidiary model by eliminating the subsidiary information in the input, which is assumed to degrade the performance of the primary model. The conventional adversarial processes maximize the CCE of the subsidiary model to degrade the performance. We have studied a framework for training robust discriminative models by eliminating channel or domain information (subsidiary information) by applying such an adversarial process. However, we found through experiments that using the process of maximizing the CCE does not guarantee the performance degradation of the subsidiary model. In the proposed adversarial process using cosine similarity, on the contrary, the performance of the subsidiary model can be degraded more efficiently by searching feature space orthogonal to the subsidiary model. The experiments on speaker identification and image recognition show that the proposed process finds features that make the outputs of the subsidiary models independent of the input, and that the performance of the primary models improves. |
Tasks | Speaker Identification |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00542v1 |
https://arxiv.org/pdf/1907.00542v1.pdf | |
PWC | https://paperswithcode.com/paper/cosine-similarity-based-adversarial-process |
Repo | |
Framework | |
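
The quantity at the heart of the proposed process is cosine similarity, which is zero for orthogonal vectors. The sketch below only computes that measure; how the paper turns it into a training loss is not specified in the abstract, so the interpretation in the comments is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Assumed reading: a feature orthogonal to the subsidiary model's weight
# vector (similarity 0) carries no information along that direction.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
aligned = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```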
Forecasting residential gas demand: machine learning approaches and seasonal role of temperature forecasts
Title | Forecasting residential gas demand: machine learning approaches and seasonal role of temperature forecasts |
Authors | Andrea Marziali, Emanuele Fabbiani, Giuseppe De Nicolao |
Abstract | Gas demand forecasting is a critical task for energy providers as it affects pipe reservation and stock planning. In this paper, the one-day-ahead forecasting of residential gas demand at country level is investigated by implementing and comparing five models: Ridge Regression, Gaussian Process (GP), k-Nearest Neighbour, Artificial Neural Network (ANN), and Torus Model. Italian demand data from 2007 to 2017 are used for training and testing the proposed algorithms. The choice of the relevant covariates and the most significant aspects of the pre-processing and feature extraction steps are discussed in depth, paying particular attention to the role of one-day-ahead temperature forecasts. Our best model, in terms of Root Mean Squared Error (RMSE), is the ANN, closely followed by the GP. If the Mean Absolute Error (MAE) is taken as an error measure, the GP becomes the best model, although by a narrow margin. A main novel contribution is the development of a model describing the propagation of temperature errors to gas forecasting errors that is successfully validated on experimental data. Being able to predict the quantitative impact of temperature forecasts on gas forecasts could be useful in order to assess potential improvement margins associated with more sophisticated weather forecasts. On the Italian data, it is shown that temperature forecast errors account for some 18% of the mean squared error of gas demand forecasts provided by ANN. |
Tasks | Feature Selection |
Published | 2019-01-04 |
URL | https://arxiv.org/abs/1901.02719v5 |
https://arxiv.org/pdf/1901.02719v5.pdf | |
PWC | https://paperswithcode.com/paper/short-term-forecasting-of-italian-residential |
Repo | |
Framework | |
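
The abstract notes that the best model changes depending on whether RMSE or MAE is used. The sketch below shows how that can happen, using made-up numbers rather than the paper's data: RMSE penalizes a single large error more heavily than MAE does.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand values and two hypothetical forecasts.
actual = [10.0, 10.0, 10.0, 10.0]
forecast_a = [10.0, 10.0, 10.0, 14.0]   # one large miss
forecast_b = [11.5, 11.5, 11.5, 11.5]   # consistently off by 1.5

scores = {
    "a": (rmse(actual, forecast_a), mae(actual, forecast_a)),
    "b": (rmse(actual, forecast_b), mae(actual, forecast_b)),
}
```

In this toy case forecast `a` wins on MAE while forecast `b` wins on RMSE, which mirrors the kind of ranking flip reported between the GP and the ANN.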
Sampling Limits for Electron Tomography with Sparsity-exploiting Reconstructions
Title | Sampling Limits for Electron Tomography with Sparsity-exploiting Reconstructions |
Authors | Yi Jiang, Elliot Padgett, Robert Hovden, David A. Muller |
Abstract | Electron tomography (ET) has become a standard technique for 3D characterization of materials at the nano-scale. Traditional reconstruction algorithms such as weighted back projection suffer from disruptive artifacts with insufficient projections. Popularized by compressed sensing, sparsity-exploiting algorithms have been applied to experimental ET data and show promise for improving reconstruction quality or reducing the total beam dose applied to a specimen. Nevertheless, theoretical bounds for these methods have been less explored in the context of ET applications. Here, we perform numerical simulations to investigate performance of $\ell_1$-norm and total-variation (TV) minimization under various imaging conditions. From 36,100 different simulated structures, our results show specimens with more complex structures generally require more projections for exact reconstruction. However, once sufficient data is acquired, dividing the beam dose over more projections provides no improvements - analogous to the traditional dose-fraction theorem. Moreover, a limited tilt range of $\pm 75^\circ$ or less can result in distorting artifacts in sparsity-exploiting reconstructions. The influence of optimization parameters on reconstructions is also discussed. |
Tasks | Electron Tomography |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02614v1 |
http://arxiv.org/pdf/1904.02614v1.pdf | |
PWC | https://paperswithcode.com/paper/sampling-limits-for-electron-tomography-with |
Repo | |
Framework | |
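
The two sparsity measures named in the abstract can be computed for a small 2D image as sketched below. The TV shown is the anisotropic variant (sum of absolute differences between neighbouring pixels), which is an assumption; the paper may use an isotropic definition. The reconstruction itself, which minimizes these quantities subject to agreement with the measured projections, is not shown.

```python
def l1_norm(image):
    """Sum of absolute pixel values of a 2D image (list of rows)."""
    return sum(abs(v) for row in image for v in row)


def total_variation(image):
    """Anisotropic TV: absolute differences between adjacent pixels."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:   # vertical neighbour
                tv += abs(image[i + 1][j] - image[i][j])
            if j + 1 < cols:   # horizontal neighbour
                tv += abs(image[i][j + 1] - image[i][j])
    return tv


# A single bright pixel: sparse in the pixel basis and in its gradient.
image = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
sparsity = (l1_norm(image), total_variation(image))
```

A piecewise-constant specimen has a low TV even when many of its pixels are non-zero, which is why TV minimization suits structures that are sparse in their gradient rather than in their pixel values.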
Anomaly Detection in Particulate Matter Sensor using Hypothesis Pruning Generative Adversarial Network
Title | Anomaly Detection in Particulate Matter Sensor using Hypothesis Pruning Generative Adversarial Network |
Authors | YeongHyeon Park, Won Seok Park, Yeong Beom Kim |
Abstract | The World Health Organization (WHO) provides guidelines for managing the Particulate Matter (PM) level because higher PM levels threaten human health. Managing the PM level first requires a procedure for measuring it. We use Tapered Element Oscillating Microbalance (TEOM)-based PM measuring sensors because they are more cost-effective than Beta Attenuation Monitor (BAM)-based sensors. However, TEOM-based sensors have a higher probability of malfunctioning than BAM-based sensors. In this paper, we refer to any malfunction as an anomaly, and we aim to detect anomalies for the maintenance of PM measuring sensors. To this end, we propose a novel architecture named Hypothesis Pruning Generative Adversarial Network (HP-GAN). We experimentally compare several anomaly detection architectures to confirm that ours performs better. |
Tasks | Anomaly Detection |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00583v2 |
https://arxiv.org/pdf/1912.00583v2.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-in-particulate-matter |
Repo | |
Framework | |