Paper Group ANR 1528
Forecasting Wireless Demand with Extreme Values using Feature Embedding in Gaussian Processes
Title | Forecasting Wireless Demand with Extreme Values using Feature Embedding in Gaussian Processes |
Authors | Chengyao Sun, Weisi Guo |
Abstract | Wireless traffic prediction is a fundamental enabler of proactive network optimisation in beyond 5G. Forecasting extreme demand spikes and troughs due to traffic mobility is essential to avoiding outages and improving energy efficiency. Current state-of-the-art deep learning forecasting methods predominantly focus on overall forecast performance and do not offer probabilistic uncertainty quantification (UQ). Whilst Gaussian Process (GP) models have UQ capability, they are not able to predict extreme values very well. Here, we design a feature embedding (FE) kernel for a GP model to forecast traffic demand with extreme values. Using real 4G base station data, we compare our FE-GP performance against both conventional naive GPs and ARIMA models, and demonstrate the UQ output. For short-term extreme value prediction, we demonstrated a 32% reduction vs. S-ARIMA and 17% reduction vs. Naive-GP. For long-term average value prediction, we demonstrated a 21% reduction vs. S-ARIMA and 12% reduction vs. Naive-GP. The FE kernel also enabled us to create a flexible trade-off between overall forecast accuracy and peak-trough accuracy. The advantage over neural networks (e.g. CNN, LSTM) is that the probabilistic forecast uncertainty can inform us of the risk of predictions, as well as the full posterior distribution of the forecast. |
Tasks | Gaussian Processes, Traffic Prediction |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06744v2 |
https://arxiv.org/pdf/1905.06744v2.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-wireless-demand-with-extreme |
Repo | |
Framework | |
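The abstract above relies on a GP returning a predictive variance alongside its mean. The following is a minimal sketch of that idea in plain Python, not the authors' method: it uses a standard RBF kernel as a stand-in because the abstract does not specify the FE kernel, and the names `rbf`, `solve`, `gp_predict` and the `noise` parameter are illustrative assumptions.

```python
import math


def rbf(x1, x2, length_scale=1.0, variance=1.0):
    """Standard RBF kernel (a stand-in; the paper's FE kernel is not given here)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-2):
    """Posterior mean and variance of the latent function at x_new.

    `noise` is the assumed observation-noise variance added to the diagonal.
    """
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    alpha = solve(K, y_train)                      # K^{-1} y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                           # K^{-1} k_*
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)


# Toy data: variance is small near observations and grows far from them.
x_train, y_train = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
mean_near, var_near = gp_predict(x_train, y_train, 1.0)
mean_far, var_far = gp_predict(x_train, y_train, 10.0)
```

In this toy setting the variance at a point far from the training inputs returns to the prior variance, which is the kind of signal the abstract describes as informing the risk of a prediction.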
Intra-day Equity Price Prediction using Deep Learning as a Measure of Market Efficiency
Title | Intra-day Equity Price Prediction using Deep Learning as a Measure of Market Efficiency |
Authors | David Byrd, Tucker Hybinette Balch |
Abstract | In finance, the weak form of the Efficient Market Hypothesis asserts that historic stock price and volume data cannot inform predictions of future prices. In this paper we show that, to the contrary, future intra-day stock prices could be predicted effectively until 2009. We demonstrate this using two different profitable machine learning-based trading strategies. However, the effectiveness of both approaches diminishes over time, and neither of them is profitable after 2009. We present our implementation and results in detail for the period 2003-2017 and propose a novel idea: the use of such flexible machine learning methods as an objective measure of relative market efficiency. We conclude with a candidate explanation, comparing our returns over time with high-frequency trading volume, and suggest concrete steps for further investigation. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08168v1 |
https://arxiv.org/pdf/1908.08168v1.pdf | |
PWC | https://paperswithcode.com/paper/intra-day-equity-price-prediction-using-deep |
Repo | |
Framework | |
Validation of a deep learning mammography model in a population with low screening rates
Title | Validation of a deep learning mammography model in a population with low screening rates |
Authors | Kevin Wu, Eric Wu, Yaping Wu, Hongna Tan, Greg Sorensen, Meiyun Wang, Bill Lotter |
Abstract | A key promise of AI applications in healthcare is in increasing access to quality medical care in under-served populations and emerging markets. However, deep learning models are often only trained on data from advantaged populations that have the infrastructure and resources required for large-scale data collection. In this paper, we aim to empirically investigate the potential impact of such biases on breast cancer detection in mammograms. We specifically explore how a deep learning algorithm trained on screening mammograms from the US and UK generalizes to mammograms collected at a hospital in China, where screening is not widely implemented. For the evaluation, we use a top-scoring model developed for the Digital Mammography DREAM Challenge. Despite the change in institution and population composition, we find that the model generalizes well, exhibiting similar performance to that achieved in the DREAM Challenge, even when controlling for tumor size. We also illustrate a simple but effective method for filtering predictions based on model variance, which can be particularly useful for deployment in new settings. While there are many components in developing a clinically effective system, these results represent a promising step towards increasing access to life-saving screening mammography in populations where screening rates are currently low. |
Tasks | Breast Cancer Detection |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00364v1 |
https://arxiv.org/pdf/1911.00364v1.pdf | |
PWC | https://paperswithcode.com/paper/validation-of-a-deep-learning-mammography |
Repo | |
Framework | |
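The abstract above mentions filtering predictions based on model variance but does not say how the variance is obtained. The sketch below assumes each case has several scores (for example from ensemble members or repeated inference, both hypothetical here) and defers any case whose score variance exceeds a threshold; `filter_by_variance` and `max_variance` are illustrative names, not from the paper.

```python
from statistics import mean, pvariance


def filter_by_variance(scores_per_case, max_variance):
    """Split cases into confident predictions and deferred cases.

    scores_per_case: dict mapping a case id to a list of model scores
                     (assumed to come from repeated or ensembled predictions).
    max_variance:    hypothetical threshold on the population variance.
    """
    kept, deferred = {}, []
    for case_id, scores in scores_per_case.items():
        if pvariance(scores) <= max_variance:
            kept[case_id] = mean(scores)   # report the averaged score
        else:
            deferred.append(case_id)       # too unstable; route elsewhere
    return kept, deferred


scores = {
    "case_a": [0.90, 0.92, 0.88],   # scores agree closely
    "case_b": [0.10, 0.90, 0.50],   # scores disagree widely
}
kept, deferred = filter_by_variance(scores, max_variance=0.01)
```

Deferred cases could, for instance, be sent for human review; the threshold would need to be tuned on held-out data for the new setting.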
Two-phase flow regime prediction using LSTM based deep recurrent neural network
Title | Two-phase flow regime prediction using LSTM based deep recurrent neural network |
Authors | Zhuoran Dang, Mamoru Ishii |
Abstract | Long short-term memory (LSTM) and recurrent neural networks (RNNs) have achieved great success in time-series prediction. In this paper, a methodology of using an LSTM-based deep RNN for two-phase flow regime prediction is proposed, motivated by previous research on constructing deep RNNs. The method features fast response and accuracy. The built RNN networks are trained and tested with time-series void fraction data collected using an impedance void meter. The results show that the prediction accuracy depends on the depth of the network and the number of layer cells. However, deeper and larger networks consume more time in predicting. |
Tasks | Time Series, Time Series Prediction |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00291v1 |
http://arxiv.org/pdf/1904.00291v1.pdf | |
PWC | https://paperswithcode.com/paper/two-phase-flow-regime-prediction-using-lstm |
Repo | |
Framework | |
Hybrid Machine Learning Model of Extreme Learning Machine Radial basis function for Breast Cancer Detection and Diagnosis; a Multilayer Fuzzy Expert System
Title | Hybrid Machine Learning Model of Extreme Learning Machine Radial basis function for Breast Cancer Detection and Diagnosis; a Multilayer Fuzzy Expert System |
Authors | Sanaz Mojrian, Gergo Pinter, Javad Hassannataj Joloudari, Imre Felde, Narjes Nabipour, Laszlo Nadai, Amir Mosavi |
Abstract | Mammography is often used as the most common laboratory method for the detection of breast cancer, yet it is associated with high cost and many side effects. Machine learning prediction as an alternative method has shown promising results. This paper presents a method based on a multilayer fuzzy expert system for the detection of breast cancer using an extreme learning machine (ELM) classification model integrated with a radial basis function (RBF) kernel, called ELM-RBF, considering the Wisconsin dataset. The performance of the proposed model is further compared with a linear-SVM model. The proposed model outperforms the linear-SVM model with RMSE, R2, MAPE equal to 0.1719, 0.9374 and 0.0539, respectively. Furthermore, both models are studied in terms of criteria of accuracy, precision, sensitivity, specificity, validation, true positive rate (TPR), and false-negative rate (FNR). The ELM-RBF model for these criteria presents better performance compared to the SVM model. |
Tasks | Breast Cancer Detection |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13574v1 |
https://arxiv.org/pdf/1910.13574v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-machine-learning-model-of-extreme |
Repo | |
Framework | |
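The abstract above reports RMSE, R2 and MAPE. As a reference for how these three metrics are commonly defined, here is a small sketch in plain Python. It assumes MAPE is expressed as a fraction rather than a percentage and that no true value is zero; how the paper applies these metrics to its class labels is not stated in the abstract.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (requires nonzero y_true)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values chosen only to exercise the functions.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 2.0, 2.0]
metrics = (rmse(y_true, y_pred), r2(y_true, y_pred), mape(y_true, y_pred))
```

Lower RMSE and MAPE and higher R2 indicate a closer fit, which is the sense in which the abstract compares the two models.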
Data-driven Neural Architecture Learning For Financial Time-series Forecasting
Title | Data-driven Neural Architecture Learning For Financial Time-series Forecasting |
Authors | Dat Thanh Tran, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis |
Abstract | Forecasting based on financial time-series is a challenging task since most real-world data exhibit nonstationarity and nonlinear dependencies. In addition, different data modalities often embed different nonlinear relationships which are difficult to capture by human-designed models. To tackle the supervised learning task in financial time-series prediction, we propose the application of a recently formulated algorithm that adaptively learns a mapping function, realized by a heterogeneous neural architecture composed of Generalized Operational Perceptrons, given a set of labeled data. With a modified objective function, the proposed algorithm can accommodate the frequently observed imbalanced data distribution problem. Experiments on a large-scale Limit Order Book dataset demonstrate that the proposed algorithm outperforms related algorithms, including tensor-based methods which have access to a broader set of input information. |
Tasks | Time Series, Time Series Forecasting, Time Series Prediction |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.06751v1 |
http://arxiv.org/pdf/1903.06751v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-neural-architecture-learning-for |
Repo | |
Framework | |
Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search
Title | Accuracy vs. Efficiency: Achieving Both through FPGA-Implementation Aware Neural Architecture Search |
Authors | Weiwen Jiang, Xinyi Zhang, Edwin H. -M. Sha, Lei Yang, Qingfeng Zhuge, Yiyu Shi, Jingtong Hu |
Abstract | A fundamental question lies in almost every application of deep neural networks: what is the optimal neural architecture given a specific dataset? Recently, several Neural Architecture Search (NAS) frameworks have been developed that use reinforcement learning and evolutionary algorithms to search for the solution. However, most of them take a long time to find the optimal architecture due to the huge search space and the lengthy training process needed to evaluate each candidate. In addition, most of them aim at accuracy only and do not take into consideration the hardware that will be used to implement the architecture. This will potentially lead to excessive latencies beyond specifications, rendering the resulting architectures useless. To address both issues, in this paper we use Field Programmable Gate Arrays (FPGAs) as a vehicle to present a novel hardware-aware NAS framework, namely FNAS, which will provide an optimal neural architecture with latency guaranteed to meet the specification. In addition, with a performance abstraction model to analyze the latency of neural architectures without training, our framework can quickly prune architectures that do not satisfy the specification, leading to higher efficiency. Experimental results on common datasets such as ImageNet show that in the cases where the state-of-the-art generates architectures with latencies 7.81x longer than the specification, those from FNAS can meet the specs with less than 1% accuracy loss. Moreover, FNAS also achieves up to 11.13x speedup for the search process. To the best of the authors’ knowledge, this is the very first hardware-aware NAS. |
Tasks | Neural Architecture Search |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11211v1 |
http://arxiv.org/pdf/1901.11211v1.pdf | |
PWC | https://paperswithcode.com/paper/accuracy-vs-efficiency-achieving-both-through |
Repo | |
Framework | |
Deterministic Value-Policy Gradients
Title | Deterministic Value-Policy Gradients |
Authors | Qingpeng Cai, Ling Pan, Pingzhong Tang |
Abstract | Reinforcement learning algorithms such as the deep deterministic policy gradient algorithm (DDPG) have been widely used in continuous control tasks. However, the model-free DDPG algorithm suffers from high sample complexity. In this paper we consider the deterministic value gradients to improve the sample efficiency of deep reinforcement learning algorithms. Previous works consider deterministic value gradients with the finite horizon, but this is too myopic compared with the infinite horizon. We first give a theoretical guarantee of the existence of the value gradients in this infinite setting. Based on this theoretical guarantee, we propose a class of the deterministic value gradient algorithm (DVG) with infinite horizon, and different rollout steps of the analytical gradients by the learned model trade off between the variance of the value gradients and the model bias. Furthermore, to better combine the model-based deterministic value gradient estimators with the model-free deterministic policy gradient estimator, we propose the deterministic value-policy gradient (DVPG) algorithm. We finally conduct extensive experiments comparing DVPG with state-of-the-art methods on several standard continuous control benchmarks. Results demonstrate that DVPG substantially outperforms other baselines. |
Tasks | Continuous Control |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03939v2 |
https://arxiv.org/pdf/1909.03939v2.pdf | |
PWC | https://paperswithcode.com/paper/deterministic-value-policy-gradients |
Repo | |
Framework | |
A detailed study of recurrent neural networks used to model tasks in the cerebral cortex
Title | A detailed study of recurrent neural networks used to model tasks in the cerebral cortex |
Authors | C. Jarne, R. Laje |
Abstract | Recurrent Neural Networks (RNNs) are frequently used to model different aspects of brain regions. We studied the properties of RNNs trained to perform temporal and flow control tasks with temporal stimuli. We present the results regarding three aspects: inner configuration sets, memory capacity with the scale and immunity to induced damage on trained networks. Our results allow us to quantify different aspects of these physical models, which are normally used as black boxes and must be understood prior to modeling the biological response of the cerebral cortex. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01094v3 |
https://arxiv.org/pdf/1906.01094v3.pdf | |
PWC | https://paperswithcode.com/paper/a-detailed-study-of-recurrent-neural-networks |
Repo | |
Framework | |
Assessing Supply Chain Cyber Risks
Title | Assessing Supply Chain Cyber Risks |
Authors | Alberto Redondo, Alberto Torres-Barrán, David Ríos Insua, Jordi Domingo |
Abstract | Risk assessment is a major challenge for supply chain managers, as it potentially affects business factors such as service costs, supplier competition and customer expectations. The increasing interconnectivity between organisations has put into focus methods for supply chain cyber risk management. We introduce a general approach to support such activity taking into account various techniques of attacking an organisation and its suppliers, as well as the impacts of such attacks. Since data is lacking in many respects, we use structured expert judgment methods to facilitate its implementation. We couple a family of forecasting models to enrich risk monitoring. The approach may be used to set up risk alarms, negotiate service level agreements, rank suppliers and identify insurance needs, among other management possibilities. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11652v1 |
https://arxiv.org/pdf/1911.11652v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-supply-chain-cyber-risks |
Repo | |
Framework | |
Distill-to-Label: Weakly Supervised Instance Labeling Using Knowledge Distillation
Title | Distill-to-Label: Weakly Supervised Instance Labeling Using Knowledge Distillation |
Authors | Jayaraman J. Thiagarajan, Satyananda Kashyap, Alexandros Karagyris |
Abstract | Weakly supervised instance labeling using only image-level labels, in lieu of expensive fine-grained pixel annotations, is crucial in several applications including medical image analysis. In contrast to conventional instance segmentation scenarios in computer vision, the problems that we consider are characterized by a small number of training images and non-local patterns that lead to the diagnosis. In this paper, we explore the use of multiple instance learning (MIL) to design an instance label generator under this weakly supervised setting. Motivated by the observation that an MIL model can handle bags of varying sizes, we propose to repurpose an MIL model originally trained for bag-level classification to produce reliable predictions for single instances, i.e., bags of size $1$. To this end, we introduce a novel regularization strategy based on virtual adversarial training for improving MIL training, and subsequently develop a knowledge distillation technique for repurposing the trained MIL model. Using empirical studies on colon cancer and breast cancer detection from histopathological images, we show that the proposed approach produces high-quality instance-level prediction and significantly outperforms state-of-the-art MIL methods. |
Tasks | Breast Cancer Detection, Instance Segmentation, Multiple Instance Learning, Semantic Segmentation |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.12926v1 |
https://arxiv.org/pdf/1907.12926v1.pdf | |
PWC | https://paperswithcode.com/paper/distill-to-label-weakly-supervised-instance |
Repo | |
Framework | |
Fourier Transform Approach to Machine Learning I: Fourier Regression
Title | Fourier Transform Approach to Machine Learning I: Fourier Regression |
Authors | Soheil Mehrabkhani |
Abstract | We propose a supervised learning algorithm for machine learning applications. Contrary to model development in the classical methods, which treat training, validation, and test as separate steps, in the presented approach there is a unified training and evaluation procedure based on iterative band filtering using a fast Fourier transform. The presented approach does not apply the method of least squares, so the typical ill-conditioned matrices do not occur at all. The optimal model results from the convergence of the performance metric, which automatically prevents the usual underfitting and overfitting problems. The algorithm's capability is investigated for noisy data, and the obtained result demonstrates a reliable and powerful machine learning approach beyond the typical limits of the classical methods. |
Tasks | |
Published | 2019-03-31 |
URL | https://arxiv.org/abs/1904.00368v3 |
https://arxiv.org/pdf/1904.00368v3.pdf | |
PWC | https://paperswithcode.com/paper/fourier-transform-approach-to-machine |
Repo | |
Framework | |
BVS Corpus: A Multilingual Parallel Corpus of Biomedical Scientific Texts
Title | BVS Corpus: A Multilingual Parallel Corpus of Biomedical Scientific Texts |
Authors | Felipe Soares, Martin Krallinger |
Abstract | The BVS database (Health Virtual Library) is a centralized source of biomedical information for Latin America and the Caribbean, created in 1998 and coordinated by BIREME (Biblioteca Regional de Medicina) in agreement with the Pan American Health Organization (OPAS). Abstracts are available in English, Spanish, and Portuguese, with a subset in more than one language, thus being a possible source of parallel corpora. In this article, we present the development of parallel corpora from BVS in three languages: English, Portuguese, and Spanish. Sentences were automatically aligned using the Hunalign algorithm for EN/ES and EN/PT language pairs, and also for a subset of trilingual articles. We demonstrate the capabilities of our corpus by training a Neural Machine Translation (OpenNMT) system for each language pair, which outperformed related works on scientific biomedical articles. Sentence alignment was also manually evaluated, presenting an average of 96% correctly aligned sentences across all languages. Our parallel corpus is freely available, with complementary information regarding article metadata. |
Tasks | Machine Translation |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01712v1 |
https://arxiv.org/pdf/1905.01712v1.pdf | |
PWC | https://paperswithcode.com/paper/bvs-corpus-a-multilingual-parallel-corpus-of |
Repo | |
Framework | |
Theories of Parenting and their Application to Artificial Intelligence
Title | Theories of Parenting and their Application to Artificial Intelligence |
Authors | Sky Croeser, Peter Eckersley |
Abstract | As machine learning (ML) systems have advanced, they have acquired more power over humans’ lives, and questions about what values are embedded in them have become more complex and fraught. It is conceivable that in the coming decades, humans may succeed in creating artificial general intelligence (AGI) that thinks and acts with an open-endedness and autonomy comparable to that of humans. The implications would be profound for our species; they are now widely debated not just in science fiction and speculative research agendas but increasingly in serious technical and policy conversations. Much work is underway to try to weave ethics into advancing ML research. We think it useful to add the lens of parenting to these efforts, and specifically radical, queer theories of parenting that consciously set out to nurture agents whose experiences, objectives and understanding of the world will necessarily be very different from their parents’. We propose a spectrum of principles which might underpin such an effort; some are relevant to current ML research, while others will become more important if AGI becomes more likely. These principles may encourage new thinking about the development, design, training, and release into the world of increasingly autonomous agents. |
Tasks | |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.06281v1 |
http://arxiv.org/pdf/1903.06281v1.pdf | |
PWC | https://paperswithcode.com/paper/theories-of-parenting-and-their-application |
Repo | |
Framework | |
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
Title | Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment |
Authors | Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare |
Abstract | This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE). We study the use of different reward bonuses that incentivize exploration in reinforcement learning. We do so by fixing the learning algorithm used and focusing only on the impact of the different exploration bonuses on the agent’s performance. We use Rainbow, the state-of-the-art algorithm for value-based agents, and focus on some of the bonuses proposed in the last few years. We consider the impact these algorithms have on performance within the popular game Montezuma’s Revenge, which has gathered a lot of interest from the exploration community, across the set of seven games identified by Bellemare et al. (2016) as challenging for exploration, and on easier games where exploration is not an issue. We find that, in our setting, recently developed bonuses do not provide significantly improved performance on Montezuma’s Revenge or hard exploration games. We also find that existing bonus-based methods may negatively impact performance on games in which exploration is not an issue and may even perform worse than $\epsilon$-greedy exploration. |
Tasks | Atari Games, Montezuma’s Revenge |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02388v2 |
https://arxiv.org/pdf/1908.02388v2.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-bonus-based-exploration-methods |
Repo | |
Framework | |
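The abstract above contrasts $\epsilon$-greedy exploration with reward bonuses. The sketch below illustrates both ideas in plain Python. The count-based bonus `beta / sqrt(N + 1)` is one common form chosen here for illustration and is an assumption, not necessarily one of the bonuses evaluated in the paper; `epsilon_greedy`, `count_bonus`, `shaped_reward` and `beta` are hypothetical names.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def count_bonus(visit_count, beta=0.1):
    """Illustrative count-based bonus that shrinks as a state is revisited."""
    return beta / math.sqrt(visit_count + 1)


def shaped_reward(reward, visit_count, beta=0.1):
    """Environment reward plus the exploration bonus for the visited state."""
    return reward + count_bonus(visit_count, beta)


q = [0.1, 0.5, 0.2]
greedy_action = epsilon_greedy(q, epsilon=0.0)    # never explores
random_action = epsilon_greedy(q, epsilon=1.0)    # always explores
first_visit = shaped_reward(1.0, visit_count=0)   # largest bonus
fourth_visit = shaped_reward(1.0, visit_count=3)  # bonus has halved
```

With $\epsilon$-greedy the exploration is undirected, whereas a bonus changes the reward the learner optimizes, which is why the two can behave differently on easy and hard exploration games.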