Paper Group ANR 521
Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case
Title | Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case |
Authors | Neo Wu, Bradley Green, Xue Ben, Shawn O’Banion |
Abstract | In this paper, we present a new approach to time series forecasting. Time series data are prevalent in many scientific and engineering disciplines. Time series forecasting is a crucial task in modeling time series data, and is an important area of machine learning. In this work we developed a novel method that employs Transformer-based machine learning models to forecast time series data. This approach works by leveraging self-attention mechanisms to learn complex patterns and dynamics from time series data. Moreover, it is a generic framework and can be applied to univariate and multivariate time series data, as well as time series embeddings. Using influenza-like illness (ILI) forecasting as a case study, we show that the forecasting results produced by our approach are favorably comparable to the state-of-the-art. |
Tasks | Time Series, Time Series Forecasting |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08317v1 |
PDF | https://arxiv.org/pdf/2001.08317v1.pdf |
PWC | https://paperswithcode.com/paper/deep-transformer-models-for-time-series |
Repo | |
Framework | |
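The self-attention mechanism named in the abstract above can be illustrated with a minimal sketch. This is generic scaled dot-product self-attention over a short window of observations, not the authors' model: the function names, the toy series, the identity projection matrices, and the causal mask are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v, causal=True):
    """Scaled dot-product self-attention over a sequence x of shape (T, d).

    With causal=True, time step t attends only to steps 0..t, which is the
    usual choice when forecasting so that no future value leaks in.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    weights = []
    for t, q_t in enumerate(q):
        scores = [sum(a * b for a, b in zip(q_t, k_s)) / math.sqrt(d_k) for k_s in k]
        if causal:
            # Mask out future time steps before normalizing.
            scores = [s if j <= t else float("-inf") for j, s in enumerate(scores)]
        weights.append(softmax(scores))
    return matmul(weights, v), weights

# Hypothetical toy input: four weekly observations, each embedded as [value, bias].
series = [[0.1, 1.0], [0.4, 1.0], [0.35, 1.0], [0.8, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
out, weights = self_attention(series, identity, identity, identity)
```

Each row of `weights` sums to one and describes how much a time step draws on itself and on earlier steps. In a trained model the three projection matrices would be learned rather than fixed.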
Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions
Title | Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions |
Authors | Karsten Schweikert |
Abstract | In this paper, we propose an adaptive group lasso procedure to efficiently estimate structural breaks in cointegrating regressions. It is well-known that the group lasso estimator is not simultaneously estimation consistent and model selection consistent in structural break settings. Hence, we use a first step group lasso estimation of a diverging number of breakpoint candidates to produce weights for a second adaptive group lasso estimation. We prove that parameter changes are estimated consistently by group lasso and show that the number of estimated breaks is greater than the true number but still sufficiently close to it. Then, we use these results and prove that the adaptive group lasso has oracle properties if weights are obtained from our first step estimation. Simulation results show that the proposed estimator delivers the expected results. An economic application to the long-run US money demand function demonstrates the practical importance of this methodology. |
Tasks | Model Selection |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.07949v2 |
PDF | https://arxiv.org/pdf/2001.07949v2.pdf |
PWC | https://paperswithcode.com/paper/oracle-efficient-estimation-of-structural |
Repo | |
Framework | |
Workload Prediction of Business Processes – An Approach Based on Process Mining and Recurrent Neural Networks
Title | Workload Prediction of Business Processes – An Approach Based on Process Mining and Recurrent Neural Networks |
Authors | Fabrizio Albertetti, Hatem Ghorbel |
Abstract | Recent advances in the interconnectedness and digitization of industrial machines, known as Industry 4.0, pave the way for new analytical techniques. Indeed, the availability and the richness of production-related data enable new data-driven methods. In this paper, we propose a process mining approach augmented with artificial intelligence that (1) reconstructs the historical workload of a company and (2) predicts the workload using neural networks. Our method relies on logs, representing the history of business processes related to manufacturing. These logs are used to quantify the supply and demand and are fed into a recurrent neural network model to predict customer orders. The corresponding activities to fulfill these orders are then sampled from history with a replay mechanism, based on criteria such as trace frequency and activity similarity. An evaluation and illustration of the method is performed on the administrative processes of Heraeus Materials SA. The workload prediction on a one-year test set achieves a MAPE score of 19% for a one-week forecast. The case study suggests a reasonable accuracy and confirms that a good understanding of the historical workload combined with articulated predictions is of great help for supporting management decisions and can decrease costs with better resource planning on a medium-term level. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.11675v1 |
PDF | https://arxiv.org/pdf/2002.11675v1.pdf |
PWC | https://paperswithcode.com/paper/workload-prediction-of-business-processes-an |
Repo | |
Framework | |
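The abstract above reports its accuracy as a MAPE (mean absolute percentage error). As a reminder of how that metric is computed, here is a small sketch using the standard definition. The function name and the sample numbers are hypothetical and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Standard definition: 100 * mean(|actual - forecast| / |actual|).
    Undefined when any actual value is zero, so that case is rejected.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical weekly workloads (for example, order counts) and forecasts.
actual = [100, 200, 50]
forecast = [90, 220, 60]
score = mape(actual, forecast)  # relative errors 10%, 10%, 20%
```

Because the error is relative to the actual value, weeks with a small workload weigh as much as busy weeks, which is worth keeping in mind when reading a single aggregate score.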
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Title | Dynamic Inference: A New Approach Toward Efficient Video Action Recognition |
Authors | Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Yi Yang, Shilei Wen |
Abstract | Though action recognition in videos has achieved great success recently, it remains a challenging task due to the massive computational cost. Designing lightweight networks is a possible solution, but it may degrade the recognition performance. In this paper, we innovatively propose a general dynamic inference idea to improve inference efficiency by leveraging the variation in the distinguishability of different videos. The dynamic inference approach can be achieved from aspects of the network depth and the number of input video frames, or even in a joint input-wise and network depth-wise manner. In a nutshell, we treat input frames and network depth of the computational graph as a 2-dimensional grid, and several checkpoints are placed on this grid in advance with a prediction module. The inference is carried out progressively on the grid by following some predefined route; whenever the inference process comes across a checkpoint, an early prediction can be made depending on whether the early-stop criteria are met. For the proof-of-concept purpose, we instantiate three dynamic inference frameworks using two well-known backbone CNNs. In these instances, we overcome the drawback of limited temporal coverage resulting from an early prediction by a novel frame permutation scheme, and alleviate the conflict between progressive computation and video temporal relation modeling by introducing an online temporal shift module. Extensive experiments are conducted to thoroughly analyze the effectiveness of our ideas and to inspire future research efforts. Results on various datasets also evidence the superiority of our approach. |
Tasks | Action Recognition In Videos, Temporal Action Localization |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03342v1 |
PDF | https://arxiv.org/pdf/2002.03342v1.pdf |
PWC | https://paperswithcode.com/paper/dynamic-inference-a-new-approach-toward |
Repo | |
Framework | |
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis
Title | Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis |
Authors | Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King |
Abstract | We aim to characterize how different speakers contribute to the perceived output quality of multi-speaker Text-to-Speech (TTS) synthesis. We automatically rate the quality of TTS using a neural network (NN) trained on human mean opinion score (MOS) ratings. First, we train and evaluate our NN model on 13 different TTS and voice conversion (VC) systems from the ASVSpoof 2019 Logical Access (LA) Dataset. Since it is not known how best to represent speech for this task, we compare 8 different representations alongside MOSNet frame-based features. Our representations include image-based spectrogram features and x-vector embeddings that explicitly model different types of noise such as T60 reverberation time. Our NN predicts MOS with a high correlation to human judgments. We report prediction correlation and error. A key finding is the quality achieved for certain speakers seems consistent, regardless of the TTS or VC system. It is widely accepted that some speakers give higher quality than others for building a TTS system: our method provides an automatic way to identify such speakers. Finally, to see if our quality prediction models generalize, we predict quality scores for synthetic speech using a separate multi-speaker TTS system that was trained on LibriTTS data, and conduct our own MOS listening test to compare human ratings with our NN predictions. |
Tasks | Speech Synthesis, Text-To-Speech Synthesis, Voice Conversion |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12645v1 |
PDF | https://arxiv.org/pdf/2002.12645v1.pdf |
PWC | https://paperswithcode.com/paper/comparison-of-speech-representations-for |
Repo | |
Framework | |
A new approach for trading based on Long Short Term Memory technique
Title | A new approach for trading based on Long Short Term Memory technique |
Authors | Zineb Lanbouri, Saaid Achchab |
Abstract | Stock market prediction has always been crucial for stakeholders, traders and investors. We developed an ensemble Long Short Term Memory (LSTM) model that includes two time frequencies (annual and daily parameters) in order to predict the next-day Closing price (one step ahead). Based on a four-step approach, this methodology is a serial combination of two LSTM algorithms. The empirical experiment is applied to 417 NY stock exchange companies. Based on Open High Low Close metrics and other financial ratios, this approach proves that stock market prediction can be improved. |
Tasks | Stock Market Prediction |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.03333v1 |
PDF | https://arxiv.org/pdf/2001.03333v1.pdf |
PWC | https://paperswithcode.com/paper/a-new-approach-for-trading-based-on-long |
Repo | |
Framework | |
Modality Compensation Network: Cross-Modal Adaptation for Action Recognition
Title | Modality Compensation Network: Cross-Modal Adaptation for Action Recognition |
Authors | Sijie Song, Jiaying Liu, Yanghao Li, Zongming Guo |
Abstract | With the prevalence of RGB-D cameras, multi-modal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities and skeletons as the auxiliary modality. Our goal is to extract more discriminative features from source modalities, with the help of the auxiliary modality. Built on deep Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks, our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning, so that the network learns to compensate for the loss of skeletons at test time and even at training time. We explore multiple adaptation schemes to narrow the distance between source and auxiliary modal distributions from different levels, according to the alignment of source and auxiliary data in training. In addition, skeletons are only required in the training phase. Our model is able to improve the recognition performance with source data when testing. Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks. |
Tasks | Optical Flow Estimation, Representation Learning, Temporal Action Localization |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2001.11657v1 |
PDF | https://arxiv.org/pdf/2001.11657v1.pdf |
PWC | https://paperswithcode.com/paper/modality-compensation-network-cross-modal |
Repo | |
Framework | |
Efficient Structure-preserving Support Tensor Train Machine
Title | Efficient Structure-preserving Support Tensor Train Machine |
Authors | Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner |
Abstract | Deploying the multi-relational tensor structure of a high dimensional feature space more efficiently improves the performance of machine learning algorithms. One encounters the curse of dimensionality, and working with vectorized data fails to preserve the data structure. To mitigate the nonlinear relationship of tensor data more economically, we propose the Tensor Train Multi-way Multi-level Kernel (TT-MMK). This technique combines kernel filtering of the initial input data (Kernelized Tensor Train (KTT)), stable reparametrization of the KTT in the Canonical Polyadic (CP) format, and the Dual Structure-preserving Support Vector Machine (SVM) Kernel for revealing nonlinear relationships. We demonstrate numerically that the TT-MMK method is more reliable computationally, is less sensitive to tuning parameters, and gives higher prediction accuracy in the SVM classification compared to similar tensorised SVM methods. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05079v1 |
PDF | https://arxiv.org/pdf/2002.05079v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-structure-preserving-support-tensor |
Repo | |
Framework | |
Human Action Recognition and Assessment via Deep Neural Network Self-Organization
Title | Human Action Recognition and Assessment via Deep Neural Network Self-Organization |
Authors | German I. Parisi |
Abstract | The robust recognition and assessment of human actions are crucial in human-robot interaction (HRI) domains. While state-of-the-art models of action perception show remarkable results in large-scale action datasets, they mostly lack the flexibility, robustness, and scalability needed to operate in natural HRI scenarios which require the continuous acquisition of sensory information as well as the classification or assessment of human body patterns in real time. In this chapter, I introduce a set of hierarchical models for the learning and recognition of actions from depth maps and RGB images through the use of neural network self-organization. A particularity of these models is the use of growing self-organizing networks that quickly adapt to non-stationary distributions and implement dedicated mechanisms for continual learning from temporally correlated input. |
Tasks | Continual Learning, Temporal Action Localization |
Published | 2020-01-04 |
URL | https://arxiv.org/abs/2001.05837v2 |
PDF | https://arxiv.org/pdf/2001.05837v2.pdf |
PWC | https://paperswithcode.com/paper/human-action-recognition-and-assessment-via |
Repo | |
Framework | |
Fairness by Learning Orthogonal Disentangled Representations
Title | Fairness by Learning Orthogonal Disentangled Representations |
Authors | Mhd Hasan Sarhan, Nassir Navab, Abouzar Eslami, Shadi Albarqouni |
Abstract | Learning discriminative, powerful representations is a crucial step for machine learning systems. Introducing invariance against arbitrary nuisance or sensitive attributes while performing well on specific tasks is an important problem in representation learning. This is mostly approached by purging the sensitive information from learned representations. In this paper, we propose a novel disentanglement approach to the invariant representation problem. We disentangle the meaningful and sensitive representations by enforcing orthogonality constraints as a proxy for independence. We explicitly enforce the meaningful representation to be agnostic to sensitive information by entropy maximization. The proposed approach is evaluated on five publicly available datasets and compared with state-of-the-art methods for learning fairness and invariance, achieving state-of-the-art performance on three datasets and comparable performance on the rest. Further, we perform an ablative study to evaluate the effect of each component. |
Tasks | Representation Learning |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05707v2 |
PDF | https://arxiv.org/pdf/2003.05707v2.pdf |
PWC | https://paperswithcode.com/paper/fairness-by-learning-orthogonal-disentangled |
Repo | |
Framework | |
Security of Deep Learning based Lane Keeping System under Physical-World Adversarial Attack
Title | Security of Deep Learning based Lane Keeping System under Physical-World Adversarial Attack |
Authors | Takami Sato, Junjie Shen, Ningfei Wang, Yunhan Jack Jia, Xue Lin, Qi Alfred Chen |
Abstract | Lane-Keeping Assistance System (LKAS) is convenient and widely available today, but also extremely security and safety critical. In this work, we design and implement the first systematic approach to attack real-world DNN-based LKASes. We identify dirty road patches as a novel and domain-specific threat model for practicality and stealthiness. We formulate the attack as an optimization problem, and address the challenge from the inter-dependencies among attacks on consecutive camera frames. We evaluate our approach on a state-of-the-art LKAS and our preliminary results show that our attack can successfully cause it to drive off lane boundaries in as little as 1.3 seconds. |
Tasks | Adversarial Attack |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01782v1 |
PDF | https://arxiv.org/pdf/2003.01782v1.pdf |
PWC | https://paperswithcode.com/paper/security-of-deep-learning-based-lane-keeping |
Repo | |
Framework | |
Applying Tensor Decomposition to image for Robustness against Adversarial Attack
Title | Applying Tensor Decomposition to image for Robustness against Adversarial Attack |
Authors | Seungju Cho, Tae Joon Jun, Mingu Kang, Daeyoung Kim |
Abstract | Deep learning technology is advancing rapidly and shows dramatic performance in computer vision. However, deep learning based models turn out to be highly vulnerable to small perturbations known as adversarial attacks, which can easily fool a model. On the other hand, tensor decomposition is widely used for compressing tensor data, including data matrices and images. In this paper, we suggest using tensor decomposition to defend a model against adversarial examples. We verify that this idea is simple and effective at resisting adversarial attacks. In addition, the method rarely degrades the original performance on clean data. We experiment on MNIST, CIFAR10 and ImageNet data and show that our method is robust against state-of-the-art attack methods. |
Tasks | Adversarial Attack |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12913v2 |
PDF | https://arxiv.org/pdf/2002.12913v2.pdf |
PWC | https://paperswithcode.com/paper/applying-tensor-decomposition-to-image-for |
Repo | |
Framework | |
Adversarial Ranking Attack and Defense
Title | Adversarial Ranking Attack and Defense |
Authors | Mo Zhou, Zhenxing Niu, Le Wang, Qilin Zhang, Gang Hua |
Abstract | Deep Neural Network (DNN) classifiers are vulnerable to adversarial attack, where an imperceptible perturbation could result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities, and then a triplet-like objective function is designed to obtain the optimal perturbation. Conversely, a defense method is also proposed to improve the ranking system robustness, which can mitigate all the proposed attacks simultaneously. Our adversarial ranking attacks and defense are evaluated on datasets including MNIST, Fashion-MNIST, and Stanford-Online-Products. Experimental results demonstrate that a typical deep ranking system can be effectively compromised by our attacks. Meanwhile, the system robustness can be moderately improved with our defense. Furthermore, the transferable and universal properties of our adversary illustrate the possibility of realistic black-box attack. |
Tasks | Adversarial Attack |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11293v1 |
PDF | https://arxiv.org/pdf/2002.11293v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-ranking-attack-and-defense |
Repo | |
Framework | |
Weakly Supervised Segmentation of Cracks on Solar Cells using Normalized Lp Norm
Title | Weakly Supervised Segmentation of Cracks on Solar Cells using Normalized Lp Norm |
Authors | Martin Mayr, Mathis Hoffmann, Andreas Maier, Vincent Christlein |
Abstract | Photovoltaics is one of the most important renewable energy sources for dealing with steadily increasing worldwide energy consumption. This raises the demand for fast and scalable automatic quality management during production and operation. However, the detection and segmentation of cracks on electroluminescence (EL) images of mono- or polycrystalline solar modules is a challenging task. In this work, we propose a weakly supervised learning strategy that only uses image-level annotations to obtain a method that is capable of segmenting cracks on EL images of solar cells. We use a modified ResNet-50 to derive a segmentation from network activation maps. We use defect classification as a surrogate task to train the network. To this end, we apply normalized Lp normalization to aggregate the activation maps into single scores for classification. In addition, we provide a study of how different parameterizations of the normalized Lp layer affect the segmentation performance. This approach shows promising results for the given task. However, we think that the method has the potential to solve other weakly supervised segmentation problems as well. |
Tasks | |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11248v1 |
PDF | https://arxiv.org/pdf/2001.11248v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-segmentation-of-cracks-on |
Repo | |
Framework | |
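The abstract above aggregates activation maps into single scores with a normalized Lp norm. A rough sketch of one common form of such pooling follows, the generalized mean ((1/N) * sum |x_i|^p)^(1/p). This form is an assumption and may differ from the layer the authors actually use; the function name and the toy activation map are hypothetical.

```python
def normalized_lp_pool(activation_map, p):
    """Aggregate a 2-D activation map into one score with a normalized Lp norm.

    Assumed form: ((1/N) * sum(|x|^p)) ** (1/p), where N is the number of
    cells. p = 1 gives the mean absolute activation, and large p approaches
    the maximum activation.
    """
    values = [abs(v) for row in activation_map for v in row]
    if not values:
        raise ValueError("activation map must not be empty")
    if p <= 0:
        raise ValueError("p must be positive")
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

# Hypothetical 2x2 activation map with a single strongly activated cell.
toy_map = [[0.0, 0.0], [0.0, 4.0]]
mean_like = normalized_lp_pool(toy_map, 1)   # behaves like average pooling
rms_like = normalized_lp_pool(toy_map, 2)    # root mean square
max_like = normalized_lp_pool(toy_map, 50)   # close to max pooling
```

The parameter p therefore interpolates between average pooling and max pooling, which is one plausible reason the choice of p would influence how sharply a thin structure such as a crack stands out in the activation map.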
iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis
Title | iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis |
Authors | Nimisha Ghosh, Sayantan Saha, Rourab Paul |
Abstract | Data gathered from multiple sensors can be effectively fused for accurate monitoring of many engineering applications. In the last few years, one of the most sought-after applications for multi-sensor fusion has been fault diagnosis. Dempster-Shafer Theory of Evidence along with Dempster's Combination Rule is a very popular method for multi-sensor fusion which can be successfully applied to fault diagnosis. But if the information obtained from the different sensors shows high conflict, the classical Dempster's Combination Rule may produce counter-intuitive results. To overcome this shortcoming, this paper proposes an improved combination rule for multi-sensor data fusion. Numerical examples have been put forward to show the effectiveness of the proposed method. Comparative analysis has also been carried out with existing methods to show the superiority of the proposed method in multi-sensor fault diagnosis. |
Tasks | Sensor Fusion |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03639v1 |
PDF | https://arxiv.org/pdf/2002.03639v1.pdf |
PWC | https://paperswithcode.com/paper/idcr-improved-dempster-combination-rule-for |
Repo | |
Framework | |
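The counter-intuitive behaviour of the classical rule under high conflict, mentioned in the abstract above, is easy to reproduce. The sketch below implements the classical Dempster combination rule only, not the improved rule proposed in the paper. The fault labels F1, F2, F3 and the mass values are hypothetical and follow the well-known Zadeh-style example.

```python
def dempster_combine(m1, m2):
    """Classical Dempster combination of two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass. Returns
    the combined mass function and the conflict coefficient K, the total
    mass assigned to pairs of focal sets with empty intersection.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    normalizer = 1.0 - conflict
    if normalizer <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {h: m / normalizer for h, m in combined.items()}, conflict

# Hypothetical readings: sensor 1 is almost sure of fault F1, sensor 2 of fault F3.
F1, F2, F3 = frozenset({"F1"}), frozenset({"F2"}), frozenset({"F3"})
sensor_1 = {F1: 0.99, F2: 0.01}
sensor_2 = {F3: 0.99, F2: 0.01}
fused, conflict = dempster_combine(sensor_1, sensor_2)
```

Here the conflict coefficient is close to 1, and after normalization essentially all belief lands on F2, the fault that both sensors considered very unlikely. This is the kind of result that motivates modified combination rules.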