A machine learning approach to circumventing the curse of dimensionality in discontinuous time series machine data

2020 
Abstract: The growing interest in artificial intelligence has led current data-driven predictive maintenance (PdM) to rely on machine learning (ML) algorithms. Although ML algorithms are useful for data-intensive analysis, research shows that their performance and reliability are reduced when high-dimensional data is used for training and testing. Raw machine data can be high-dimensional due to multi-sensor measurements and discontinuous due to the wide ranges of parameter variations during continuous sensor measurements. While standard dimension reduction methods such as principal component analysis are often applied to circumvent high dimensionality, they are often unreliable when the data is discontinuous. This paper presents an ML-based dimension reduction framework to circumvent the challenges of high-dimensional discontinuous machine data. The framework minimizes discontinuity by clustering observations based on the dataset’s modality. The modality is identified using a kernel density estimate whose bandwidth is set by a heat diffusion solution to the approximate mean integrated squared error. Then, low-dimensional representations of each cluster are learned using Laplacian eigenmaps embedding. Finally, the original time sequence of observations across the low-dimensional clusters is used to re-index the observations into a continuous low-dimensional feature set. We demonstrate the framework’s utility on common ML-based PdM analyses using the Commercial Modular Aero-Propulsion System Simulation dataset.
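The three steps named in the abstract (modality-based clustering, per-cluster Laplacian eigenmaps, and re-indexing by time) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and it rests on several assumptions: the modality is read from a single one-dimensional operating setting, Silverman's rule of thumb stands in for the heat-diffusion bandwidth selector, the neighbourhood graph is an unweighted k-nearest-neighbour graph, and every function and parameter name below is hypothetical.

```python
import numpy as np


def kde_1d(x, grid, bandwidth):
    """Gaussian kernel density estimate of samples x, evaluated on grid."""
    z = (grid[:, None] - x[None, :]) / bandwidth
    norm = len(x) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm


def cluster_by_modality(setting, n_grid=512):
    """Assign each observation to the density mode its setting value falls in."""
    # Assumption: Silverman's rule replaces the heat-diffusion bandwidth choice.
    bw = 1.06 * setting.std() * len(setting) ** (-1.0 / 5.0)
    grid = np.linspace(setting.min() - 3 * bw, setting.max() + 3 * bw, n_grid)
    dens = kde_1d(setting, grid, bw)
    # Interior local minima of the density are the boundaries between modes.
    is_min = (dens[1:-1] < dens[:-2]) & (dens[1:-1] <= dens[2:])
    cuts = grid[1:-1][is_min]
    return np.searchsorted(cuts, setting)


def laplacian_eigenmaps(X, n_components=2, n_neighbors=10):
    """Embed the rows of X with eigenvectors of a kNN graph Laplacian."""
    n = len(X)
    k = min(n_neighbors, n - 1)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Column 0 of the argsort is the point itself (assumes no duplicate rows).
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    # Drop the trivial first eigenvector and rescale by D^{-1/2} so the
    # columns solve the generalized problem (D - W) y = lambda * D y.
    return vecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]


def reduce_discontinuous(X, setting, n_components=2):
    """Cluster by modality, embed each cluster, keep the original time order."""
    labels = cluster_by_modality(setting)
    Z = np.empty((len(X), n_components))
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Writing back to the original row positions re-indexes by time.
        Z[idx] = laplacian_eigenmaps(X[idx], n_components)
    return Z, labels
```

Here `X` is assumed to be a time-ordered array of sensor readings with one row per observation, and `setting` the matching operating-condition value for each row. Each cluster is assumed to contain more than `n_components + 1` observations and to form a connected neighbourhood graph. A production version would need to handle small or disconnected clusters and multivariate operating settings.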