Paper Group ANR 1410
Optimal Uncertainty-guided Neural Network Training
Title | Optimal Uncertainty-guided Neural Network Training |
Authors | H M Dipu Kabir, Abbas Khosravi, Abdollah Kavousi-Fard, Saeid Nahavandi, Dipti Srinivasan |
Abstract | Neural network (NN)-based direct uncertainty quantification (UQ) methods have achieved state-of-the-art performance since the introduction of the first such method, the lower-upper-bound estimation (LUBE) method. However, currently available cost functions for uncertainty-guided NN training do not always converge, and not all converged NNs generate optimized prediction intervals (PIs). Moreover, several groups have proposed different quality criteria for PIs, which raises a question about their relative effectiveness. Most existing cost functions for uncertainty-guided NN training are not customizable, and the convergence of training is uncertain. Therefore, in this paper, we propose a highly customizable smooth cost function for developing NNs to construct optimal PIs. The optimized average width of PIs, PI-failure distances and the PI coverage probability (PICP) are computed for the test dataset. The performance of the proposed method is examined for the wind power generation and the electricity demand data. Results show that the proposed method reduces variation in the quality of PIs, accelerates the training, and improves convergence probability from 99.2% to 99.8%. |
Tasks | |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.12761v1, https://arxiv.org/pdf/1912.12761v1.pdf |
PWC | https://paperswithcode.com/paper/optimal-uncertainty-guided-neural-network |
Repo | |
Framework | |
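The abstract above evaluates prediction intervals by their coverage probability (PICP) and average width. As a minimal sketch of those two quantities under their usual definitions (the function names and toy numbers are illustrative assumptions, not taken from the paper, and the proposed cost function itself is not reproduced here):

```python
def picp(y_true, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)


def mean_pi_width(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical toy data, not results from the paper.
y_true = [10.0, 12.0, 9.0, 15.0]
lower = [8.0, 11.0, 9.5, 13.0]
upper = [11.0, 14.0, 11.0, 16.0]

coverage = picp(y_true, lower, upper)   # third target lies below its interval
width = mean_pi_width(lower, upper)
```

A good set of intervals keeps the coverage near the nominal level while keeping the mean width small; the two goals pull in opposite directions, which is why a combined cost function is needed.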
Solar Event Tracking with Deep Regression Networks: A Proof of Concept Evaluation
Title | Solar Event Tracking with Deep Regression Networks: A Proof of Concept Evaluation |
Authors | Toqi Tahamid Sarker, Juan M. Banda |
Abstract | With the advent of deep learning for computer vision tasks, the need for accurately labeled data in large volumes is vital for any application. The increasingly available large amounts of solar image data generated by the Solar Dynamics Observatory (SDO) mission make this domain particularly interesting for the development and testing of deep learning systems. The currently available labeled solar data is generated by the SDO mission’s Feature Finding Team’s (FFT) specialized detection modules. The major drawback of these modules is that detection and labeling are performed with a cadence of every 4 to 12 hours, depending on the module. Since SDO image data products are created every 10 seconds, there is a considerable gap between labeled observations and the continuous data stream. In order to address this shortcoming, we trained a deep regression network to track the movement of two solar phenomena: Active Region and Coronal Hole events. To the best of our knowledge, this is the first attempt at solar event tracking using a deep learning approach. Since it is impossible to fully evaluate the performance of the suggested event tracks with the original data (only partial ground truth is available), we demonstrate with several metrics the effectiveness of our approach. With the purpose of generating continuously labeled solar image data, we present this feasibility analysis showing the great promise of deep regression networks for this task. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08350v1, https://arxiv.org/pdf/1911.08350v1.pdf |
PWC | https://paperswithcode.com/paper/solar-event-tracking-with-deep-regression |
Repo | |
Framework | |
Using Mapping Languages for Building Legal Knowledge Graphs from XML Files
Title | Using Mapping Languages for Building Legal Knowledge Graphs from XML Files |
Authors | Ademar Crotti Junior, Fabrizio Orlandi, Declan O’Sullivan, Christian Dirschl, Quentin Reul |
Abstract | This paper presents our experience in building RDF knowledge graphs for an industrial use case in the legal domain. The information contained in legal information systems is often accessed through simple keyword interfaces and presented as a simple list of hits. In order to improve search accuracy, one may avail of knowledge graphs, where the semantics of the data can be made explicit. Significant research effort has been invested in the area of building knowledge graphs from semi-structured text documents, such as XML, with the prevailing approach being the use of mapping languages. In this paper, we present a semantic model for representing legal documents together with an industrial use case. We also present a set of use case requirements based on the proposed semantic model, which are used to compare and discuss the use of state-of-the-art mapping languages for building knowledge graphs for legal data. |
Tasks | Knowledge Graphs |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07673v1, https://arxiv.org/pdf/1911.07673v1.pdf |
PWC | https://paperswithcode.com/paper/using-mapping-languages-for-building-legal |
Repo | |
Framework | |
SNRA: A Spintronic Neuromorphic Reconfigurable Array for In-Circuit Training and Evaluation of Deep Belief Networks
Title | SNRA: A Spintronic Neuromorphic Reconfigurable Array for In-Circuit Training and Evaluation of Deep Belief Networks |
Authors | Ramtin Zand, Ronald F. DeMara |
Abstract | In this paper, a spintronic neuromorphic reconfigurable array (SNRA) is developed to fuse together power-efficient probabilistic and in-field programmable deterministic computing during both training and evaluation phases of restricted Boltzmann machines (RBMs). First, probabilistic spin logic devices are used to develop an RBM realization which is adapted to construct deep belief networks (DBNs) having one to three hidden layers of size 10 to 800 neurons each. Second, we design a hardware implementation for the contrastive divergence (CD) algorithm using a four-state finite state machine capable of unsupervised training in N+3 clocks where N denotes the number of neurons in each RBM. The functionality of our proposed CD hardware implementation is validated using ModelSim simulations. We synthesize the developed Verilog HDL implementation of our proposed test/train control circuitry for various DBN topologies where the maximal RBM dimensions yield resource utilization ranging from 51 to 2,421 lookup tables (LUTs). Next, we leverage spin Hall effect (SHE)-magnetic tunnel junction (MTJ) based non-volatile LUT circuits as an alternative for static random access memory (SRAM)-based LUTs storing the deterministic logic configuration to form a reconfigurable fabric. Finally, we compare the performance of our proposed SNRA with SRAM-based configurable fabrics focusing on the area and power consumption induced by the LUTs used to implement both CD and evaluation modes. The results obtained indicate more than 80% reduction in combined dynamic and static power dissipation, while achieving at least 50% reduction in device count. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02415v1, http://arxiv.org/pdf/1901.02415v1.pdf |
PWC | https://paperswithcode.com/paper/snra-a-spintronic-neuromorphic-reconfigurable |
Repo | |
Framework | |
Trainability of ReLU networks and Data-dependent Initialization
Title | Trainability of ReLU networks and Data-dependent Initialization |
Authors | Yeonjong Shin, George Em Karniadakis |
Abstract | In this paper, we study the trainability of rectified linear unit (ReLU) networks. A ReLU neuron is said to be dead if it only outputs a constant for any input. Two death states of neurons are introduced: tentative and permanent death. A network is then said to be trainable if the number of permanently dead neurons is sufficiently small for a learning task. We refer to the probability of a network being trainable as trainability. We show that a network being trainable is a necessary condition for successful training and that the trainability serves as an upper bound of successful training rates. In order to quantify the trainability, we study the probability distribution of the number of active neurons at the initialization. In many applications, over-specified or over-parameterized neural networks are successfully employed and shown to be trained effectively. With the notion of trainability, we show that over-parameterization is both a necessary and a sufficient condition for minimizing the training loss. Furthermore, we propose a data-dependent initialization method in the over-parameterized setting. Numerical examples are provided to demonstrate the effectiveness of the method and our theoretical findings. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09696v2, https://arxiv.org/pdf/1907.09696v2.pdf |
PWC | https://paperswithcode.com/paper/trainability-and-data-dependent |
Repo | |
Framework | |
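The abstract above defines a dead ReLU neuron as one that outputs a constant for any input. A rough sketch of how one might count such neurons in a single layer is given below. It uses a finite-sample proxy (a neuron counts as dead if its pre-activation is non-positive on every sampled input), which is an assumption of this sketch and weaker than the paper's definition; the function name and the toy layer are hypothetical.

```python
def count_dead_neurons(weights, biases, inputs):
    """Count neurons whose ReLU output is zero on every sampled input.

    weights: one weight vector per neuron; biases: one bias per neuron;
    inputs: a list of input vectors. This is a sample-based proxy only.
    """
    dead = 0
    for w, b in zip(weights, biases):
        pre_activations = (sum(wi * xi for wi, xi in zip(w, x)) + b for x in inputs)
        if all(z <= 0.0 for z in pre_activations):
            dead += 1
    return dead


# Hypothetical two-neuron layer evaluated on non-negative inputs.
weights = [[1.0, 0.0], [-1.0, -1.0]]
biases = [0.0, -0.5]
inputs = [[0.2, 0.3], [1.0, 0.0], [0.5, 0.5]]

n_dead = count_dead_neurons(weights, biases, inputs)
```

Here the second neuron has negative weights and a negative bias, so it never activates on non-negative inputs, while the first neuron does.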
Causality and deceit: Do androids watch action movies?
Title | Causality and deceit: Do androids watch action movies? |
Authors | Dusko Pavlovic, Temra Pavlovic |
Abstract | We seek causes through science, religion, and in everyday life. We get excited when a big rock causes a big splash, and we get scared when it tumbles without a cause. But our causal cognition is usually biased. The ‘why’ is influenced by the ‘who’. It is influenced by the ‘self’, and by ‘others’. We share rituals, we watch action movies, and we influence each other to believe in the same causes. The human mind is packed with subjectivity because shared cognitive biases bring us together. But they also make us vulnerable. An artificial mind is deemed to be more objective than the human mind. After many years of science-fiction fantasies about even-minded androids, they are now sold as personal or expert assistants, as brand advocates, as policy or candidate supporters, as network influencers. Artificial agents have been stunningly successful in disseminating artificial causal beliefs among humans. As malicious artificial agents continue to manipulate human cognitive biases, and deceive human communities into ostensive but expansive causal illusions, the hope for defending us has been vested in developing benevolent artificial agents, tasked with preventing and mitigating cognitive distortions inflicted upon us by their malicious cousins. Can the distortions of human causal cognition be corrected on a more solid foundation of artificial causal cognition? In the present paper, we study a simple model of causal cognition, viewed as a quest for causal models. We show that, under very mild and hard to avoid assumptions, there are always self-confirming causal models, which perpetrate self-deception, and seem to preclude a royal road to objectivity. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04383v1, https://arxiv.org/pdf/1910.04383v1.pdf |
PWC | https://paperswithcode.com/paper/causality-and-deceit-do-androids-watch-action |
Repo | |
Framework | |
TableNet: a multiplier-less implementation of neural networks for inferencing
Title | TableNet: a multiplier-less implementation of neural networks for inferencing |
Authors | Chai Wah Wu |
Abstract | We consider the use of look-up tables (LUTs) to simplify the hardware implementation of a deep learning network for inferencing after weights have been successfully trained. Using LUTs replaces the matrix multiply and add operations with a small number of LUTs and addition operations, resulting in a completely multiplier-less implementation. We compare the different tradeoffs of this approach in terms of accuracy versus LUT size and the number of operations and show that similar performance can be obtained with a memory footprint comparable to that of a full precision deep neural network, but without the use of any multipliers. We illustrate this with several architectures such as MLP and CNN. |
Tasks | |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10601v2, https://arxiv.org/pdf/1905.10601v2.pdf |
PWC | https://paperswithcode.com/paper/lutnet-speeding-up-deep-neural-network |
Repo | |
Framework | |
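To illustrate the general idea in the abstract above, the sketch below replaces each weight-times-input product with table lookups, shifts, and additions. It is not the paper's exact scheme: it assumes integer weights and 8-bit unsigned inputs split into two 4-bit halves, and the function names are hypothetical. Multiplications appear only when the tables are built offline, never at inference time.

```python
def build_tables(weights):
    """Precompute w * v for every 4-bit value v (done once, after training)."""
    return [[w * v for v in range(16)] for w in weights]


def lut_dot(tables, inputs):
    """Dot product with 8-bit unsigned inputs using lookups, shifts, and adds."""
    total = 0
    for table, x in zip(tables, inputs):
        low = x & 0x0F           # lower 4 bits of the input
        high = (x >> 4) & 0x0F   # upper 4 bits of the input
        # w * x = w * low + 16 * (w * high); the factor 16 is a left shift by 4.
        total += table[low] + (table[high] << 4)
    return total


# Hypothetical integer weights and 8-bit inputs.
weights = [3, -2, 5]
inputs = [200, 17, 255]
result = lut_dot(build_tables(weights), inputs)
```

Each table here has 16 entries per weight; splitting the input into smaller or larger bit groups trades table size against the number of lookups and additions, which is the kind of tradeoff the abstract refers to.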
Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis
Title | Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis |
Authors | Patrick Esser, Johannes Haux, Björn Ommer |
Abstract | Deep generative models come with the promise to learn an explainable representation for visual objects that allows image sampling, synthesis, and selective modification. The main challenge is to learn to properly model the independent latent characteristics of an object, especially its appearance and pose. We present a novel approach that learns disentangled representations of these characteristics and explains them individually. Training requires only pairs of images depicting the same object appearance, but no pose annotations. We propose an additional classifier that estimates the minimal amount of regularization required to enforce disentanglement. Thus both representations together can completely explain an image while being independent of each other. Previous methods based on adversarial approaches fail to enforce this independence, while methods based on variational approaches lead to uninformative representations. In experiments on diverse object categories, the approach successfully recombines pose and appearance to reconstruct and retarget novel synthesized images. We achieve significant improvements over state-of-the-art methods which utilize the same level of supervision, and reach performances comparable to those of pose-supervised approaches. However, we can handle the vast body of articulated object classes for which no pose models/annotations are available. |
Tasks | Image Generation |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10223v1, https://arxiv.org/pdf/1910.10223v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-robust-disentangling-of-latent-1 |
Repo | |
Framework | |
Just Add Functions: A Neural-Symbolic Language Model
Title | Just Add Functions: A Neural-Symbolic Language Model |
Authors | David Demeter, Doug Downey |
Abstract | Neural network language models (NNLMs) have achieved ever-improving accuracy due to more sophisticated architectures and increasing amounts of training data. However, the inductive bias of these models (formed by the distributional hypothesis of language), while ideally suited to modeling most running text, results in key limitations for today’s models. In particular, the models often struggle to learn certain spatial, temporal, or quantitative relationships, which are commonplace in text and are second-nature for human readers. Yet, in many cases, these relationships can be encoded with simple mathematical or logical expressions. How can we augment today’s neural models with such encodings? In this paper, we propose a general methodology to enhance the inductive bias of NNLMs by incorporating simple functions into a neural architecture to form a hierarchical neural-symbolic language model (NSLM). These functions explicitly encode symbolic deterministic relationships to form probability distributions over words. We explore the effectiveness of this approach on numbers and geographic locations, and show that NSLMs significantly reduce perplexity in small-corpus language modeling, and that the performance improvement persists for rare tokens even on much larger corpora. The approach is simple and general, and we discuss how it can be applied to other word classes beyond numbers and geography. |
Tasks | Language Modelling |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05421v1, https://arxiv.org/pdf/1912.05421v1.pdf |
PWC | https://paperswithcode.com/paper/just-add-functions-a-neural-symbolic-language |
Repo | |
Framework | |
The Value of Collaboration in Convex Machine Learning with Differential Privacy
Title | The Value of Collaboration in Convex Machine Learning with Differential Privacy |
Authors | Nan Wu, Farhad Farokhi, David Smith, Mohamed Ali Kaafar |
Abstract | In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially-private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, using the fitness cost, as a function of privacy budget and size of the distributed datasets to capture the trade-off between privacy and utility in machine learning. This way, we can predict the outcome of collaboration among privacy-aware data owners prior to executing potentially computationally-expensive machine learning algorithms. In particular, we show that the difference between the fitness of the trained machine learning model using differentially-private gradient queries and the fitness of the trained machine learning model in the absence of any privacy concerns is inversely proportional to the size of the training datasets squared and the privacy budget squared. We successfully validate the performance prediction with the actual performance of the proposed privacy-aware learning algorithms, applied to financial datasets for determining interest rates of loans using regression and for detecting credit card fraud using support vector machines. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09679v1, https://arxiv.org/pdf/1906.09679v1.pdf |
PWC | https://paperswithcode.com/paper/the-value-of-collaboration-in-convex-machine |
Repo | |
Framework | |
A/B Testing Measurement Framework for Recommendation Models Based on Expected Revenue
Title | A/B Testing Measurement Framework for Recommendation Models Based on Expected Revenue |
Authors | Meisam Hejazinia, Majid Hosseini, Bryant Sih |
Abstract | We provide a method to determine whether a new recommendation system improves the revenue per visit (RPV) compared to the status quo. We achieve our goal by splitting RPV into conversion rate and average order value (AOV). We use the two-part test suggested by Lachenbruch to determine if the data generating process in the new system is different. In cases where this test does not give us a definitive answer about the change in RPV, we propose two alternative tests to determine if RPV has changed. Both of these tests rely on the assumption that non-zero purchase values follow a log-normal distribution. We empirically validate this assumption using data collected at different points in time from Staples.com. On average, our method needs a smaller sample size than other methods. Furthermore, it does not require any subjective outlier removal. Finally, it characterizes the uncertainty around RPV by providing a confidence interval. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06390v1, https://arxiv.org/pdf/1906.06390v1.pdf |
PWC | https://paperswithcode.com/paper/ab-testing-measurement-framework-for |
Repo | |
Framework | |
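The abstract above splits revenue per visit (RPV) into conversion rate and average order value (AOV). A minimal sketch of that decomposition follows, assuming a visit converts exactly when its revenue is positive; the function name and the toy revenues are hypothetical, and the statistical tests from the paper are not reproduced.

```python
def rpv_decomposition(visit_revenues):
    """Return (conversion_rate, aov, rpv) for a list of per-visit revenues."""
    n_visits = len(visit_revenues)
    orders = [r for r in visit_revenues if r > 0]   # visits that ended in a purchase
    conversion_rate = len(orders) / n_visits
    aov = sum(orders) / len(orders) if orders else 0.0
    rpv = sum(visit_revenues) / n_visits
    return conversion_rate, aov, rpv


# Hypothetical data: 8 visits, 3 of which produced an order.
visits = [0.0, 0.0, 40.0, 0.0, 60.0, 0.0, 0.0, 20.0]
conversion_rate, aov, rpv = rpv_decomposition(visits)
```

By construction RPV equals conversion rate times AOV, so a change in RPV can come from either factor, which is why the two parts are tested separately.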
The Finite-Horizon Two-Armed Bandit Problem with Binary Responses: A Multidisciplinary Survey of the History, State of the Art, and Myths
Title | The Finite-Horizon Two-Armed Bandit Problem with Binary Responses: A Multidisciplinary Survey of the History, State of the Art, and Myths |
Authors | Peter Jacko |
Abstract | In this paper we consider the two-armed bandit problem, which often naturally appears per se or as a subproblem in some multi-armed generalizations, and serves as a starting point for introducing additional problem features. The consideration of binary responses is motivated by its widespread applicability and by being one of the most studied settings. We focus on the undiscounted finite-horizon objective, which is the most relevant in many applications. We attempt to unify the terminology, which differs across the disciplines that have considered this problem, and present a unified model cast in the Markov decision process framework, with subject responses modelled using the Bernoulli distribution, and the corresponding Beta distribution for Bayesian updating. We give an extensive account of the history and state of the art of approaches from several disciplines, including design of experiments, Bayesian decision theory, naive designs, reinforcement learning, biostatistics, and combination designs. We evaluate these designs, together with a few newly proposed ones, computationally and accurately (using a package newly written by the author in the Julia programming language) in order to compare their performance. We show that conclusions are different for moderate horizons (typical in practice) than for small horizons (typical in academic literature reporting computational results). We further list and clarify a number of myths about this problem, e.g., we show that, computationally, much larger problems can be designed to Bayes-optimality than what is commonly believed. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.10173v1, https://arxiv.org/pdf/1906.10173v1.pdf |
PWC | https://paperswithcode.com/paper/the-finite-horizon-two-armed-bandit-problem |
Repo | |
Framework | |
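The model in the abstract above (Bernoulli responses, Beta posteriors, undiscounted finite horizon, Markov decision process) can be sketched as a small backward recursion. The sketch below assumes uniform Beta(1, 1) priors on both arms, which is a choice made here for illustration; the function name is hypothetical, and this is written in Python rather than the author's Julia package.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def bayes_value(s1, f1, s2, f2, remaining):
    """Maximum expected number of successes over `remaining` pulls.

    Arm i has posterior Beta(1 + s_i, 1 + f_i): a uniform prior (assumed here)
    updated with s_i observed successes and f_i observed failures.
    """
    if remaining == 0:
        return 0.0
    p1 = (1 + s1) / (2 + s1 + f1)   # posterior mean success probability, arm 1
    p2 = (1 + s2) / (2 + s2 + f2)   # posterior mean success probability, arm 2
    pull1 = (p1 * (1 + bayes_value(s1 + 1, f1, s2, f2, remaining - 1))
             + (1 - p1) * bayes_value(s1, f1 + 1, s2, f2, remaining - 1))
    pull2 = (p2 * (1 + bayes_value(s1, f1, s2 + 1, f2, remaining - 1))
             + (1 - p2) * bayes_value(s1, f1, s2, f2 + 1, remaining - 1))
    return max(pull1, pull2)


value_h1 = bayes_value(0, 0, 0, 0, 1)
value_h2 = bayes_value(0, 0, 0, 0, 2)
```

The state is just the four success and failure counts, so the number of states grows polynomially with the horizon; memoization keeps small and moderate horizons cheap to solve exactly.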
Did you miss it? Automatic lung nodule detection combined with gaze information improves radiologists’ screening performance
Title | Did you miss it? Automatic lung nodule detection combined with gaze information improves radiologists’ screening performance |
Authors | Guilherme Aresta, Carlos Ferreira, João Pedrosa, Teresa Araújo, João Rebelo, Eduardo Negrão, Margarida Morgado, Filipe Alves, António Cunha, Isabel Ramos, Aurélio Campilho |
Abstract | Early diagnosis of lung cancer via computed tomography can significantly reduce the morbidity and mortality rates associated with the pathology. However, searching for lung nodules is a highly complex task, which affects the success of screening programs. Whilst computer-aided detection systems can be used as second observers, they may bias radiologists and introduce significant time overheads. With this in mind, this study assesses the potential of using gaze information for integrating automatic detection systems into clinical practice. For that purpose, 4 radiologists were asked to annotate 20 scans from a public dataset while being monitored by an eye tracker device, and an automatic lung nodule detection system was developed. Our results show that radiologists follow a similar search routine and tend to have lower fixation periods in regions where finding errors occur. The overall detection sensitivity of the specialists was 0.67$\pm$0.07, whereas the system achieved 0.69. Combining the annotations of one radiologist with the automatic system significantly improves the detection performance to levels similar to those of two annotators. Likewise, combining the findings of a radiologist with the detection algorithm only for low fixation regions still significantly improves the detection sensitivity without increasing the number of false-positives. The combination of the automatic system with the gaze information makes it possible to mitigate possible errors of the radiologist without some of the issues usually associated with automatic detection systems. |
Tasks | Lung Nodule Detection |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03986v1, https://arxiv.org/pdf/1910.03986v1.pdf |
PWC | https://paperswithcode.com/paper/did-you-miss-it-automatic-lung-nodule |
Repo | |
Framework | |
Early Estimation of User’s Intention of Tele-Operation Using Object Affordance and Hand Motion in a Dual First-Person Vision
Title | Early Estimation of User’s Intention of Tele-Operation Using Object Affordance and Hand Motion in a Dual First-Person Vision |
Authors | Motoki Kojima, Jun Miura |
Abstract | This paper describes a method of estimating the intention of a user’s motion in a robot tele-operation scenario. One of the issues in tele-operation is latency, which occurs for various reasons, such as slow robot motion and a narrow communication channel. An effective way of reducing the latency is to estimate the human intention of motions and to move the robot proactively. To enable a reliable early intention estimation, we use both hand motion and object affordances in a dual first-person vision (robot and user) with an HMD. Experimental results in an object pickup scenario show the effectiveness of the method. |
Tasks | |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02201v1, https://arxiv.org/pdf/1910.02201v1.pdf |
PWC | https://paperswithcode.com/paper/early-estimation-of-users-intention-of-tele |
Repo | |
Framework | |
SalSi: A new seismic attribute for salt dome detection
Title | SalSi: A new seismic attribute for salt dome detection |
Authors | Muhammad Amir Shafiq, Tariq Alshawi, Zhiling Long, Ghassan AlRegib |
Abstract | In this paper, we propose a saliency-based attribute, SalSi, to detect salt dome bodies within seismic volumes. SalSi is based on saliency theory and modeling of the human vision system (HVS). In this work, we aim to highlight the parts of the seismic volume that receive the highest attention from the human interpreter, and based on the salient features of a seismic image, we detect the salt domes. Experimental results show the effectiveness of SalSi on the real seismic dataset acquired from the North Sea, F3 block. Subjectively, we have used the ground truth and the output of different salt dome delineation algorithms to validate the results of SalSi. For the objective evaluation of results, we have used receiver operating characteristic (ROC) curves and the area under the curve (AUC) to demonstrate that SalSi is a promising and effective attribute for seismic interpretation. |
Tasks | Seismic Interpretation |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.02937v1, http://arxiv.org/pdf/1901.02937v1.pdf |
PWC | https://paperswithcode.com/paper/salsi-a-new-seismic-attribute-for-salt-dome |
Repo | |
Framework | |