April 1, 2020

Paper Group ANR 521

Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case. Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions. Workload Prediction of Business Processes – An Approach Based on Process Mining and Recurrent Neural Networks. Dynamic Inference: A New Approach Toward Efficient Video Action Recognition …

Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case


Title	Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case
Authors	Neo Wu, Bradley Green, Xue Ben, Shawn O’Banion
Abstract	In this paper, we present a new approach to time series forecasting. Time series data are prevalent in many scientific and engineering disciplines. Time series forecasting is a crucial task in modeling time series data, and is an important area of machine learning. In this work we developed a novel method that employs Transformer-based machine learning models to forecast time series data. This approach works by leveraging self-attention mechanisms to learn complex patterns and dynamics from time series data. Moreover, it is a generic framework and can be applied to univariate and multivariate time series data, as well as time series embeddings. Using influenza-like illness (ILI) forecasting as a case study, we show that the forecasting results produced by our approach are favorably comparable to the state-of-the-art.
Tasks	Time Series, Time Series Forecasting
Published	2020-01-23
URL	https://arxiv.org/abs/2001.08317v1
PDF	https://arxiv.org/pdf/2001.08317v1.pdf
PWC	https://paperswithcode.com/paper/deep-transformer-models-for-time-series
Repo
Framework
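
The self-attention mechanism the abstract refers to can be illustrated with a minimal sketch of scaled dot-product attention over a short window of time steps. This is not the authors' model: the function names, the toy window, and the use of identity query/key/value projections (a real Transformer learns these) are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(window):
    """Scaled dot-product self-attention with identity projections (assumption).

    window: list of time steps, each a list of d features.
    Returns (outputs, weights); weights[i][j] is how much step i attends to step j.
    """
    d = len(window[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for query in window:
        # Similarity of this time step to every time step in the window.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in window]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all time steps.
        outputs.append(
            [sum(w[j] * window[j][c] for j in range(len(window))) for c in range(d)]
        )
    return outputs, weights


# Hypothetical window: [weekly ILI rate, constant bias feature].
window = [[0.1, 1.0], [0.4, 1.0], [0.9, 1.0], [0.3, 1.0]]
outputs, weights = self_attention(window)
```

Each row of `weights` sums to one, so every output step is a convex combination of the input steps; this is what lets the model relate any two positions in the window directly.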

Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions


Title	Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions
Authors	Karsten Schweikert
Abstract	In this paper, we propose an adaptive group lasso procedure to efficiently estimate structural breaks in cointegrating regressions. It is well-known that the group lasso estimator is not simultaneously estimation consistent and model selection consistent in structural break settings. Hence, we use a first step group lasso estimation of a diverging number of breakpoint candidates to produce weights for a second adaptive group lasso estimation. We prove that parameter changes are estimated consistently by group lasso and show that the number of estimated breaks is greater than the true number but still sufficiently close to it. Then, we use these results and prove that the adaptive group lasso has oracle properties if weights are obtained from our first step estimation. Simulation results show that the proposed estimator delivers the expected results. An economic application to the long-run US money demand function demonstrates the practical importance of this methodology.
Tasks	Model Selection
Published	2020-01-22
URL	https://arxiv.org/abs/2001.07949v2
PDF	https://arxiv.org/pdf/2001.07949v2.pdf
PWC	https://paperswithcode.com/paper/oracle-efficient-estimation-of-structural
Repo
Framework
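
The two-step idea in the abstract follows the general pattern of an adaptive group lasso, which can be written generically as below. The notation (response $y$, regressor blocks $Z_i$, coefficient groups $\theta_i$, tuning parameters $\lambda_T$ and $\lambda_T^{*}$, exponent $\gamma$) is assumed here for illustration and is not taken from the paper.

```latex
\begin{align}
\hat{\theta}^{(1)} &= \arg\min_{\theta}\;
  \frac{1}{T}\Big\| y - \sum_{i=1}^{m} Z_i \theta_i \Big\|_2^2
  + \lambda_T \sum_{i=1}^{m} \|\theta_i\|_2
  && \text{(first step: group lasso)} \\
w_i &= \big\|\hat{\theta}^{(1)}_i\big\|_2^{-\gamma}, \qquad \gamma > 0
  && \text{(weights from the first step)} \\
\hat{\theta}^{(2)} &= \arg\min_{\theta}\;
  \frac{1}{T}\Big\| y - \sum_{i=1}^{m} Z_i \theta_i \Big\|_2^2
  + \lambda_T^{*} \sum_{i=1}^{m} w_i \|\theta_i\|_2
  && \text{(second step: adaptive group lasso)}
\end{align}
```

Groups with a small first-step estimate receive a large weight and are penalized more heavily in the second step, which is how the surplus breakpoint candidates can be removed.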

Workload Prediction of Business Processes – An Approach Based on Process Mining and Recurrent Neural Networks


Title	Workload Prediction of Business Processes – An Approach Based on Process Mining and Recurrent Neural Networks
Authors	Fabrizio Albertetti, Hatem Ghorbel
Abstract	Recent advances in the interconnectedness and digitization of industrial machines, known as Industry 4.0, pave the way for new analytical techniques. Indeed, the availability and the richness of production-related data enable new data-driven methods. In this paper, we propose a process mining approach augmented with artificial intelligence that (1) reconstructs the historical workload of a company and (2) predicts the workload using neural networks. Our method relies on logs, representing the history of business processes related to manufacturing. These logs are used to quantify the supply and demand and are fed into a recurrent neural network model to predict customer orders. The corresponding activities to fulfill these orders are then sampled from history with a replay mechanism, based on criteria such as trace frequency and activity similarity. An evaluation and illustration of the method is performed on the administrative processes of Heraeus Materials SA. The workload prediction on a one-year test set achieves a MAPE score of 19% for a one-week forecast. The case study suggests a reasonable accuracy and confirms that a good understanding of the historical workload combined with articulated predictions is of great help for supporting management decisions and can decrease costs with better resource planning on a medium-term level.
Tasks
Published	2020-02-14
URL	https://arxiv.org/abs/2002.11675v1
PDF	https://arxiv.org/pdf/2002.11675v1.pdf
PWC	https://paperswithcode.com/paper/workload-prediction-of-business-processes-an
Repo
Framework
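
The abstract reports its accuracy as a MAPE (mean absolute percentage error) score. A minimal sketch of how that metric is computed is shown below; the function name and the sample numbers are made up for illustration and are not data from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case is rejected.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical weekly workloads (e.g. number of orders) and their forecasts.
actual = [100, 200, 400]
forecast = [90, 220, 400]
score = mape(actual, forecast)  # relative errors 10%, 10%, 0% average to about 6.67%
```

Because each error is divided by the actual value, weeks with a low workload weigh as much as busy weeks, which is worth keeping in mind when reading a single aggregate score.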

Dynamic Inference: A New Approach Toward Efficient Video Action Recognition


Title	Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Authors	Wenhao Wu, Dongliang He, Xiao Tan, Shifeng Chen, Yi Yang, Shilei Wen
Abstract	Though action recognition in videos has achieved great success recently, it remains a challenging task due to the massive computational cost. Designing lightweight networks is a possible solution, but it may degrade the recognition performance. In this paper, we innovatively propose a general dynamic inference idea to improve inference efficiency by leveraging the variation in the distinguishability of different videos. The dynamic inference approach can be achieved from aspects of the network depth and the number of input video frames, or even in a joint input-wise and network depth-wise manner. In a nutshell, we treat input frames and network depth of the computational graph as a 2-dimensional grid, and several checkpoints are placed on this grid in advance with a prediction module. The inference is carried out progressively on the grid by following some predefined route; whenever the inference process comes across a checkpoint, an early prediction can be made depending on whether the early-stop criterion is met. For the proof-of-concept purpose, we instantiate three dynamic inference frameworks using two well-known backbone CNNs. In these instances, we overcome the drawback of limited temporal coverage resulting from an early prediction by a novel frame permutation scheme, and alleviate the conflict between progressive computation and video temporal relation modeling by introducing an online temporal shift module. Extensive experiments are conducted to thoroughly analyze the effectiveness of our ideas and to inspire future research efforts. Results on various datasets also demonstrate the superiority of our approach.
Tasks	Action Recognition In Videos, Temporal Action Localization
Published	2020-02-09
URL	https://arxiv.org/abs/2002.03342v1
PDF	https://arxiv.org/pdf/2002.03342v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-inference-a-new-approach-toward
Repo
Framework
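
The checkpoint-and-route idea described in the abstract can be sketched as a simple early-exit loop. Everything below is a hypothetical illustration: the abstract does not specify the stop criterion, so a maximum-probability threshold is assumed, and `toy_predict` stands in for a real prediction module attached to a checkpoint.

```python
def dynamic_inference(route, predict, threshold=0.85):
    """Walk the checkpoints in order and stop at the first confident prediction.

    route: list of (num_frames, depth) checkpoints on the 2-D grid.
    predict: callable returning class probabilities for a checkpoint (assumed).
    threshold: assumed early-stop criterion on the top class probability.
    Returns (label, num_frames, depth) of the checkpoint that produced the answer.
    """
    if not route:
        raise ValueError("route must contain at least one checkpoint")
    for frames, depth in route:
        probs = predict(frames, depth)
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining, costlier checkpoints
    # If no checkpoint was confident, the last one on the route is used.
    return probs.index(max(probs)), frames, depth


def toy_predict(frames, depth):
    """Stand-in classifier whose confidence grows with the compute spent."""
    confidence = min(0.99, 0.5 + 0.05 * frames * depth)
    return [confidence, 1.0 - confidence]


route = [(2, 1), (4, 1), (4, 2), (8, 2)]
label, frames_used, depth_used = dynamic_inference(route, toy_predict)
```

With this toy classifier the loop exits at the third checkpoint, so the most expensive configuration on the route is never evaluated; easy videos would exit even earlier.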

Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis


Title	Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis
Authors	Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King
Abstract	We aim to characterize how different speakers contribute to the perceived output quality of multi-speaker Text-to-Speech (TTS) synthesis. We automatically rate the quality of TTS using a neural network (NN) trained on human mean opinion score (MOS) ratings. First, we train and evaluate our NN model on 13 different TTS and voice conversion (VC) systems from the ASVSpoof 2019 Logical Access (LA) Dataset. Since it is not known how best to represent speech for this task, we compare 8 different representations alongside MOSNet frame-based features. Our representations include image-based spectrogram features and x-vector embeddings that explicitly model different types of noise such as T60 reverberation time. Our NN predicts MOS with a high correlation to human judgments. We report prediction correlation and error. A key finding is the quality achieved for certain speakers seems consistent, regardless of the TTS or VC system. It is widely accepted that some speakers give higher quality than others for building a TTS system: our method provides an automatic way to identify such speakers. Finally, to see if our quality prediction models generalize, we predict quality scores for synthetic speech using a separate multi-speaker TTS system that was trained on LibriTTS data, and conduct our own MOS listening test to compare human ratings with our NN predictions.
Tasks	Speech Synthesis, Text-To-Speech Synthesis, Voice Conversion
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12645v1
PDF	https://arxiv.org/pdf/2002.12645v1.pdf
PWC	https://paperswithcode.com/paper/comparison-of-speech-representations-for
Repo
Framework

A new approach for trading based on Long Short Term Memory technique


Title	A new approach for trading based on Long Short Term Memory technique
Authors	Zineb Lanbouri, Saaid Achchab
Abstract	Stock market prediction has always been crucial for stakeholders, traders and investors. We developed an ensemble Long Short Term Memory (LSTM) model that includes two time frequencies (annual and daily parameters) in order to predict the next-day Closing price (one step ahead). Based on a four-step approach, this methodology is a serial combination of two LSTM algorithms. The empirical experiment is applied to 417 NY stock exchange companies. Based on Open High Low Close metrics and other financial ratios, this approach proves that the stock market prediction can be improved.
Tasks	Stock Market Prediction
Published	2020-01-10
URL	https://arxiv.org/abs/2001.03333v1
PDF	https://arxiv.org/pdf/2001.03333v1.pdf
PWC	https://paperswithcode.com/paper/a-new-approach-for-trading-based-on-long
Repo
Framework

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition


Title	Modality Compensation Network: Cross-Modal Adaptation for Action Recognition
Authors	Sijie Song, Jiaying Liu, Yanghao Li, Zongming Guo
Abstract	With the prevalence of RGB-D cameras, multi-modal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities and skeletons as the auxiliary modality. Our goal is to extract more discriminative features from source modalities, with the help of the auxiliary modality. Built on deep Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks, our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning, so that the network learns to compensate for the loss of skeletons at test time and even at training time. We explore multiple adaptation schemes to narrow the distance between source and auxiliary modal distributions from different levels, according to the alignment of source and auxiliary data in training. In addition, skeletons are only required in the training phase. Our model is able to improve the recognition performance with source data when testing. Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
Tasks	Optical Flow Estimation, Representation Learning, Temporal Action Localization
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11657v1
PDF	https://arxiv.org/pdf/2001.11657v1.pdf
PWC	https://paperswithcode.com/paper/modality-compensation-network-cross-modal
Repo
Framework

Efficient Structure-preserving Support Tensor Train Machine


Title	Efficient Structure-preserving Support Tensor Train Machine
Authors	Kirandeep Kour, Sergey Dolgov, Martin Stoll, Peter Benner
Abstract	Deploying the multi-relational tensor structure of a high-dimensional feature space more efficiently improves the performance of machine learning algorithms. One encounters the curse of dimensionality, and working with vectorized data fails to preserve the data structure. To mitigate the nonlinear relationship of tensor data more economically, we propose the Tensor Train Multi-way Multi-level Kernel (TT-MMK). This technique combines kernel filtering of the initial input data (Kernelized Tensor Train (KTT)), stable reparametrization of the KTT in the Canonical Polyadic (CP) format, and the Dual Structure-preserving Support Vector Machine (SVM) Kernel for revealing nonlinear relationships. We demonstrate numerically that the TT-MMK method is more reliable computationally, is less sensitive to tuning parameters, and gives higher prediction accuracy in the SVM classification compared to similar tensorised SVM methods.
Tasks
Published	2020-02-12
URL	https://arxiv.org/abs/2002.05079v1
PDF	https://arxiv.org/pdf/2002.05079v1.pdf
PWC	https://paperswithcode.com/paper/efficient-structure-preserving-support-tensor
Repo
Framework

Human Action Recognition and Assessment via Deep Neural Network Self-Organization


Title	Human Action Recognition and Assessment via Deep Neural Network Self-Organization
Authors	German I. Parisi
Abstract	The robust recognition and assessment of human actions are crucial in human-robot interaction (HRI) domains. While state-of-the-art models of action perception show remarkable results in large-scale action datasets, they mostly lack the flexibility, robustness, and scalability needed to operate in natural HRI scenarios which require the continuous acquisition of sensory information as well as the classification or assessment of human body patterns in real time. In this chapter, I introduce a set of hierarchical models for the learning and recognition of actions from depth maps and RGB images through the use of neural network self-organization. A particularity of these models is the use of growing self-organizing networks that quickly adapt to non-stationary distributions and implement dedicated mechanisms for continual learning from temporally correlated input.
Tasks	Continual Learning, Temporal Action Localization
Published	2020-01-04
URL	https://arxiv.org/abs/2001.05837v2
PDF	https://arxiv.org/pdf/2001.05837v2.pdf
PWC	https://paperswithcode.com/paper/human-action-recognition-and-assessment-via
Repo
Framework

Fairness by Learning Orthogonal Disentangled Representations


Title	Fairness by Learning Orthogonal Disentangled Representations
Authors	Mhd Hasan Sarhan, Nassir Navab, Abouzar Eslami, Shadi Albarqouni
Abstract	Learning discriminative powerful representations is a crucial step for machine learning systems. Introducing invariance against arbitrary nuisance or sensitive attributes while performing well on specific tasks is an important problem in representation learning. This is mostly approached by purging the sensitive information from learned representations. In this paper, we propose a novel disentanglement approach to the invariant representation problem. We disentangle the meaningful and sensitive representations by enforcing orthogonality constraints as a proxy for independence. We explicitly enforce the meaningful representation to be agnostic to sensitive information by entropy maximization. The proposed approach is evaluated on five publicly available datasets and compared with state-of-the-art methods for learning fairness and invariance, achieving state-of-the-art performance on three datasets and comparable performance on the rest. Further, we perform an ablative study to evaluate the effect of each component.
Tasks	Representation Learning
Published	2020-03-12
URL	https://arxiv.org/abs/2003.05707v2
PDF	https://arxiv.org/pdf/2003.05707v2.pdf
PWC	https://paperswithcode.com/paper/fairness-by-learning-orthogonal-disentangled
Repo
Framework
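
The two ingredients named in the abstract, an orthogonality constraint and entropy maximization, can be illustrated with two small helper functions. This is a sketch under assumptions, not the paper's loss: the squared cosine similarity is one possible orthogonality penalty, and the vectors and probabilities are invented for the example.

```python
import math


def orthogonality_penalty(u, v):
    """Squared cosine similarity: 0 when u and v are orthogonal, 1 when parallel."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return (dot / (norm_u * norm_v)) ** 2


def entropy(probs):
    """Shannon entropy in nats; largest when the distribution is uniform."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical 2-D representations.
meaningful = [1.0, 0.0]
sensitive_aligned = [2.0, 0.0]      # shares a direction with `meaningful`
sensitive_orthogonal = [0.0, 3.0]   # shares none

penalty_aligned = orthogonality_penalty(meaningful, sensitive_aligned)
penalty_orthogonal = orthogonality_penalty(meaningful, sensitive_orthogonal)

# Predicting the sensitive attribute from the meaningful representation:
# a uniform prediction (maximum entropy) means nothing can be inferred.
entropy_uniform = entropy([0.5, 0.5])
entropy_leaky = entropy([0.95, 0.05])
```

Minimizing the penalty pushes the two representations apart, while maximizing the entropy of the sensitive-attribute prediction pushes that prediction toward uniform.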

Security of Deep Learning based Lane Keeping System under Physical-World Adversarial Attack


Title	Security of Deep Learning based Lane Keeping System under Physical-World Adversarial Attack
Authors	Takami Sato, Junjie Shen, Ningfei Wang, Yunhan Jack Jia, Xue Lin, Qi Alfred Chen
Abstract	Lane-Keeping Assistance System (LKAS) is convenient and widely available today, but also extremely security and safety critical. In this work, we design and implement the first systematic approach to attack real-world DNN-based LKASes. We identify dirty road patches as a novel and domain-specific threat model for practicality and stealthiness. We formulate the attack as an optimization problem, and address the challenge from the inter-dependencies among attacks on consecutive camera frames. We evaluate our approach on a state-of-the-art LKAS and our preliminary results show that our attack can successfully cause it to drive off lane boundaries in as little as 1.3 seconds.
Tasks	Adversarial Attack
Published	2020-03-03
URL	https://arxiv.org/abs/2003.01782v1
PDF	https://arxiv.org/pdf/2003.01782v1.pdf
PWC	https://paperswithcode.com/paper/security-of-deep-learning-based-lane-keeping
Repo
Framework

Applying Tensor Decomposition to image for Robustness against Adversarial Attack


Title	Applying Tensor Decomposition to image for Robustness against Adversarial Attack
Authors	Seungju Cho, Tae Joon Jun, Mingu Kang, Daeyoung Kim
Abstract	Nowadays deep learning technology is growing fast and shows dramatic performance in computer vision areas. However, it turns out that a deep learning based model is highly vulnerable to small perturbations known as adversarial attacks, which can easily fool the model. On the other hand, tensor decomposition methods are widely used for compressing tensor data, including data matrices, images, etc. In this paper, we suggest using tensor decomposition to defend the model against adversarial examples. We verify that this idea is simple and effective at resisting adversarial attacks. In addition, this method rarely degrades the original performance on clean data. We experiment on MNIST, CIFAR10 and ImageNet data and show that our method is robust against state-of-the-art attack methods.
Tasks	Adversarial Attack
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12913v2
PDF	https://arxiv.org/pdf/2002.12913v2.pdf
PWC	https://paperswithcode.com/paper/applying-tensor-decomposition-to-image-for
Repo
Framework

Adversarial Ranking Attack and Defense


Title	Adversarial Ranking Attack and Defense
Authors	Mo Zhou, Zhenxing Niu, Le Wang, Qilin Zhang, Gang Hua
Abstract	Deep Neural Network (DNN) classifiers are vulnerable to adversarial attack, where an imperceptible perturbation could result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities, and then a triplet-like objective function is designed to obtain the optimal perturbation. Conversely, a defense method is also proposed to improve the ranking system robustness, which can mitigate all the proposed attacks simultaneously. Our adversarial ranking attacks and defense are evaluated on datasets including MNIST, Fashion-MNIST, and Stanford-Online-Products. Experimental results demonstrate that a typical deep ranking system can be effectively compromised by our attacks. Meanwhile, the system robustness can be moderately improved with our defense. Furthermore, the transferable and universal properties of our adversary illustrate the possibility of realistic black-box attack.
Tasks	Adversarial Attack
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11293v1
PDF	https://arxiv.org/pdf/2002.11293v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-ranking-attack-and-defense
Repo
Framework
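
The step from "a set of inequalities" to "a triplet-like objective" in the abstract can be illustrated with a hinge relaxation. The sketch below is an assumption about the general shape of such an objective, not the paper's formulation: to raise the rank of a candidate `c` for a query `q`, the desired inequalities are `d(q, c) < d(q, x)` for every other item `x`, and each violated inequality contributes its violation to the loss.

```python
import math


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranking_violation(query, candidate, others, margin=0.0):
    """Sum of hinge terms, one per desired inequality d(q, c) + margin < d(q, x)."""
    d_candidate = distance(query, candidate)
    return sum(
        max(0.0, margin + d_candidate - distance(query, other)) for other in others
    )


# Hypothetical 2-D embeddings.
query = (0.0, 0.0)
others = [(1.0, 0.0), (0.0, 2.0)]

loss_before = ranking_violation(query, (3.0, 0.0), others)  # candidate ranked last
loss_after = ranking_violation(query, (0.5, 0.0), others)   # candidate ranked first
```

An attack would search for a small input perturbation that drives this loss toward zero; a loss of zero means every inequality holds and the candidate is ranked first.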

Weakly Supervised Segmentation of Cracks on Solar Cells using Normalized Lp Norm


Title	Weakly Supervised Segmentation of Cracks on Solar Cells using Normalized Lp Norm
Authors	Martin Mayr, Mathis Hoffmann, Andreas Maier, Vincent Christlein
Abstract	Photovoltaics is one of the most important renewable energy sources for dealing with the steadily increasing worldwide energy consumption. This raises the demand for fast and scalable automatic quality management during production and operation. However, the detection and segmentation of cracks on electroluminescence (EL) images of mono- or polycrystalline solar modules is a challenging task. In this work, we propose a weakly supervised learning strategy that only uses image-level annotations to obtain a method that is capable of segmenting cracks on EL images of solar cells. We use a modified ResNet-50 to derive a segmentation from network activation maps. We use defect classification as a surrogate task to train the network. To this end, we apply normalized Lp normalization to aggregate the activation maps into single scores for classification. In addition, we provide a study of how different parameterizations of the normalized Lp layer affect the segmentation performance. This approach shows promising results for the given task. However, we think that the method has the potential to solve other weakly supervised segmentation problems as well.
Tasks
Published	2020-01-30
URL	https://arxiv.org/abs/2001.11248v1
PDF	https://arxiv.org/pdf/2001.11248v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-segmentation-of-cracks-on
Repo
Framework
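
Aggregating an activation map into a single score with a normalized Lp norm can be sketched as follows. The definition used here, the p-th root of the mean of the p-th powers, is the common generalized-mean form and is assumed rather than taken from the paper; the function name and the toy activation map are also illustrative.

```python
def normalized_lp(activations, p):
    """Normalized Lp norm: ((1/N) * sum(|a_i|^p)) ** (1/p).

    p = 1 gives the mean absolute activation; large p approaches the maximum.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if not activations:
        raise ValueError("activations must be non-empty")
    n = len(activations)
    return (sum(abs(a) ** p for a in activations) / n) ** (1.0 / p)


# Hypothetical flattened activation map: one strongly activated pixel (a "crack").
activation_map = [0.0, 0.0, 0.0, 4.0]

score_mean_like = normalized_lp(activation_map, p=1)   # behaves like average pooling
score_mid = normalized_lp(activation_map, p=2)
score_max_like = normalized_lp(activation_map, p=50)   # behaves almost like max pooling
```

The parameter `p` therefore interpolates between average pooling and max pooling, which is one way to see why its setting would influence how sharply the resulting activation maps localize small defects.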

iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis


Title	iDCR: Improved Dempster Combination Rule for Multisensor Fault Diagnosis
Authors	Nimisha Ghosh, Sayantan Saha, Rourab Paul
Abstract	Data gathered from multiple sensors can be effectively fused for accurate monitoring of many engineering applications. In the last few years, one of the most sought-after applications for multi-sensor fusion has been fault diagnosis. Dempster-Shafer Theory of Evidence along with Dempster's Combination Rule is a very popular method for multi-sensor fusion which can be successfully applied to fault diagnosis. But if the information obtained from the different sensors shows high conflict, the classical Dempster's Combination Rule may produce counter-intuitive results. To overcome this shortcoming, this paper proposes an improved combination rule for multi-sensor data fusion. Numerical examples have been put forward to show the effectiveness of the proposed method. Comparative analysis has also been carried out with existing methods to show the superiority of the proposed method in multi-sensor fault diagnosis.
Tasks	Sensor Fusion
Published	2020-02-10
URL	https://arxiv.org/abs/2002.03639v1
PDF	https://arxiv.org/pdf/2002.03639v1.pdf
PWC	https://paperswithcode.com/paper/idcr-improved-dempster-combination-rule-for
Repo
Framework
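
The counter-intuitive behaviour of the classical rule under high conflict, which motivates the paper, can be reproduced with a short sketch of Dempster's Combination Rule. This shows only the classical rule, not the improved rule proposed in the paper; the fault labels and mass values are a standard textbook-style example chosen for illustration.

```python
def combine(m1, m2):
    """Classical Dempster's rule for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass.
    Returns (combined mass function, conflict K).
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    # Normalize by 1 - K so the combined masses sum to one again.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}, conflict


# Hypothetical faults A, B, C reported by two sensors that strongly disagree.
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
sensor_1 = {A: 0.99, B: 0.01}
sensor_2 = {C: 0.99, B: 0.01}

fused, conflict = combine(sensor_1, sensor_2)
```

Although both sensors consider fault B almost impossible, the fused result assigns B essentially all of the mass, because B is the only hypothesis the two sensors do not contradict each other on and the normalization step discards the 0.9999 of conflicting mass.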