Fuzzy Logic models
Fuzzy logic is a generalization of classical logic in which the truth value of a statement can lie anywhere between 0.0 and 1.0. It builds on the fuzzy set theory proposed by Lotfi Zadeh (Zadeh, 1965). Such logics had been studied since the 1920s as infinite-valued logic, and in the 1960s Dr. Lotfi Zadeh of the University of California, Berkeley, introduced the concept of fuzzy logic, which is now widely developed in many fields (Pelletier, 2000). It is a popular model structure because of its simplicity and flexibility: it can handle problems with imprecise and incomplete data, and it uses simple mathematics to describe nonlinear, integrated and complex systems.
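As a minimal sketch of graded truth values, the example below assumes a triangular membership function and two hypothetical fuzzy sets for a temperature reading. The thresholds and the names `triangular`, `mu_normal` and `mu_hot` are illustrative assumptions, not taken from the cited works. The combination rules are Zadeh's standard operators (minimum for AND, maximum for OR, complement for NOT).

```python
def triangular(x, a, b, c):
    """Membership degree in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets for a temperature in degrees Celsius (assumed values).
def mu_normal(temp):
    return triangular(temp, 20.0, 50.0, 80.0)


def mu_hot(temp):
    return triangular(temp, 60.0, 90.0, 120.0)


temp = 65.0
normal = mu_normal(temp)      # partial truth of "temperature is normal"
hot = mu_hot(temp)            # partial truth of "temperature is hot"

both = min(normal, hot)       # fuzzy AND
either = max(normal, hot)     # fuzzy OR
not_hot = 1.0 - hot           # fuzzy NOT
```

A reading of 65 degrees is therefore partly "normal" and partly "hot" at the same time, which is the behaviour that distinguishes fuzzy logic from two-valued logic.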
Recently, several researchers have applied fuzzy logic to diagnostic and prognostic systems. Cosme et al (2018) proposed a prognostic approach based on interacting multiple model filters and fuzzy systems. Jiang (2019) described a novel ensemble fuzzy model for degradation prognostics of rolling element bearings. Kang (2020) studied remaining useful life prognostics based on a fuzzy evaluation-Gaussian process regression method. Škrjanc et al (2019) gave a detailed survey of evolving fuzzy and neuro-fuzzy approaches in clustering, regression, identification and classification.
Fuzzy logic is sometimes combined with neural networks, as it mimics how a person would make decisions, only much faster. A brief overview of neural network models is given below.
Neural Networking models
Neural networks are another popular detection technique that can be used for purposes similar to those of fuzzy logic. They work by simulating a large number of interconnected processing units that resemble abstract versions of neurons.
A neural network is a model that uses specialized algorithms to identify the underlying relationships in a set of data by mimicking the processes of the human brain. In the modern sense, an artificial neural network (ANN) consists of artificial neurons, or nodes, and is used to solve artificial intelligence problems. A brief overview of recent forecasting efforts using neural networks (NN) and ANNs is provided below.
The preliminary theoretical basis for modern neural networks was proposed by Alexander Bain (1873) and William James (1890). Since then, neural networks have been used in many applications in different fields. Recently, several PHM applications based on this model structure have appeared in the literature. Li et al (2018) offered a prognostic technique using deep convolutional neural networks. The authors applied a neural-network-based deep learning method to the popular C-MAPSS dataset (Saxena, 2008) to predict the RUL of aero-engine units. Palau et al (2018) proposed a recurrent neural network model for real-time distributed collaborative prognostics. The authors demonstrate a basic implementation of real-time distributed collaborative learning, in which collaboration is limited to sharing trajectories to failure in real time among clusters of similar assets. Khera (2018) uses an ANN for prognostics of aluminum electrolytic capacitors. The training is done off-line with experimental data using the back-propagation learning algorithm. Further, the weighted ANN is used to estimate the equivalent series resistance of the system. Guo et al (2017) developed a recurrent neural network-based health indicator for remaining useful life prediction of bearings. They used a feature extraction method to map classical time-domain and frequency-domain features with diverse ranges to target features ranging from 0 to 1.
Two survey papers (Yi, 2018; Marugán, 2018) present detailed overviews of neural network applications in the PHM domain. Marugán et al (2018) present an exhaustive review of artificial neural networks used in wind energy systems. They identify the methods most employed for different applications and demonstrate that artificial neural networks can be an alternative to conventional methods in many cases. Yi (2018) provides a brief review of PHM for special vehicles, highlighting the neural network technologies behind the prognostic applications and their benefits. More recently, a bidirectional Long Short-Term Memory (BiLSTM) approach for Remaining Useful Life (RUL) estimation was proposed in (Wang, 2018); it benefits from processing sequence data in both directions.
The neural network model is flexible and applies to both regression and classification problems. A well-trained neural network is fast at prediction. The mathematical basis of the model allows non-linear data to be processed with any number of inputs and layers. However, since this model structure relies on a large amount of training data, it can suffer from overfitting and poor generalization when such data are not available. Another important limitation is that it is a black-box process: it is difficult to know how much each independent variable affects the dependent variable, or how the computation proceeds through the hidden layers.
Markov Chain notations
A Markov chain (MC) gives the probability of sequences of random states, each of which can take values from a given set. It predicts future states based only on the current state: the states before the current one have no influence on the future except through the present state (Keselj, 2009). Assume a system that is in one of the states {𝑠1,𝑠2,…,𝑠𝑁}, where 𝑁 is the number of states. We denote the states occupied at successive time instants as (𝑋1,𝑋2,…,𝑋𝐾), where 𝑋1 holds the state at the first time instant and 𝑋𝐾 holds the state at the last time instant. If the current time instant is 𝑘, where 2≤𝑘≤𝐾, then the current transition probability is 𝑃(𝑋𝑘=𝑠𝑗|𝑋𝑘−1=𝑠𝑖), 1≤𝑖,𝑗≤𝑁.
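The notation above can be illustrated with a small sketch. The three state labels, the transition matrix `P` and the initial distribution are made-up values for illustration only. The probability of a whole state sequence is the initial probability multiplied by the transition probabilities along the path.

```python
import random

# Hypothetical 3-state chain; P[i][j] = P(X_k = s_j | X_{k-1} = s_i).
STATES = ["good", "moderate", "bad"]
P = [
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.00, 0.10, 0.90],
]
INITIAL = [1.0, 0.0, 0.0]  # assumed: the system always starts in "good"


def sequence_probability(seq, initial, trans):
    """Probability of a state-index sequence (X_1, ..., X_K)."""
    prob = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= trans[prev][cur]
    return prob


def simulate(length, initial, trans, rng):
    """Draw a state-index sequence of the given length."""
    state = rng.choices(range(len(initial)), weights=initial)[0]
    path = [state]
    for _ in range(length - 1):
        state = rng.choices(range(len(trans)), weights=trans[state])[0]
        path.append(state)
    return path


prob = sequence_probability([0, 0, 1, 2], INITIAL, P)  # good, good, moderate, bad
path = simulate(10, INITIAL, P, random.Random(0))
```

Each step of `simulate` looks only at the current state, which is exactly the Markov property stated in the text.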
Hidden Markov Model
Systems generally produce observable emissions that can be characterized by signals (temperature, vibration, sound, etc.). In recent decades, research in artificial intelligence has focused on how to characterize such signals. Among the many methods for modelling such real phenomena, Hidden Markov Models (HMMs) have proven to be particularly effective. An HMM is a Markov chain in which the states are no longer directly observable. That is why they are called hidden states: they can only be inferred from the observations. The hidden states and the observations are linked to each other in a probabilistic way, since the probability distribution of the observed symbol depends on the underlying state.
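A minimal sketch of this generative view follows, assuming a discrete HMM with two hidden states and two observation symbols. The matrices `A`, `B` and `pi` are hypothetical values chosen for illustration. The function returns both the hidden path and the emitted symbols, although in practice only the symbols would be available.

```python
import random

# Hypothetical parameters (assumed for illustration).
A = [[0.9, 0.1],   # A[i][j]: probability of moving from hidden state i to j
     [0.2, 0.8]]
B = [[0.8, 0.2],   # B[i][m]: probability that hidden state i emits symbol m
     [0.3, 0.7]]
pi = [1.0, 0.0]    # initial state distribution


def sample_hmm(length, A, B, pi, rng):
    """Draw a hidden state path and the observation sequence it emits."""
    hidden, observed = [], []
    state = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(length):
        hidden.append(state)
        # The observation depends only on the current hidden state.
        observed.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        state = rng.choices(range(len(A)), weights=A[state])[0]
    return hidden, observed


hidden, observed = sample_hmm(20, A, B, pi, random.Random(1))
```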
Left-right model
The left-right model is a specific type of HMM in which there are no transitions from a higher-indexed state to a lower-indexed state; that is, there are no backward transitions. It is also called the Bakis model (Yuan, 2018). It suits degradation modelling because the degradation process of a system always evolves towards worse states. If a system goes from a state 𝑠𝑖 to another state 𝑠𝑗, where 𝑖<𝑗, it cannot return to the previous state 𝑠𝑖. Graphically, transitions only happen from left to right.
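In matrix terms, the left-right constraint means that the transition matrix is upper triangular. The check below is a small sketch; the two example matrices are hypothetical.

```python
def is_left_right(A, tol=1e-12):
    """True if no transition goes from a higher-indexed to a lower-indexed state."""
    return all(A[i][j] <= tol for i in range(len(A)) for j in range(i))


# Hypothetical Bakis (left-right) matrix: entries below the diagonal are zero.
A_BAKIS = [
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],   # the last state is absorbing
]

# Hypothetical matrix that allows recovery, so it is not a left-right model.
A_WITH_REPAIR = [
    [0.90, 0.08, 0.02],
    [0.10, 0.75, 0.15],
    [0.00, 0.00, 1.00],
]

bakis_ok = is_left_right(A_BAKIS)
repair_ok = is_left_right(A_WITH_REPAIR)
```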
The Baum Welch algorithm
The Baum-Welch algorithm was first described in the late 1960s by Leonard E. Baum and Lloyd R. Welch [Baum 1960]. One of its first major applications was speech recognition. One of the basic problems of an HMM is to determine the parameters Λ=(𝐴,𝐵,𝜋) given the sequence of observations Y, that is, to search for the parameters that maximize P(Y| Λ). The Baum-Welch algorithm is used for this purpose. It is an expectation-maximization (EM) algorithm that relies on dynamic programming. The expectation step computes the expected state occupancy counts and the expected state transition counts based on the current probabilities A and B. The maximization step uses the expected counts from the E-step to update the probabilities A and B. The iterations eventually converge to a local maximum of the likelihood.
The EM algorithm uses the forward-backward (FB) algorithm to solve this problem iteratively. It starts with initial values of the parameters and adjusts them at each iteration.
The maximization problem is algorithmically complex. Using the FB algorithm, Eq. 4 gives the probability that the HMM 𝛬 generates the Y sequences:
$P(Y_k \mid \Lambda) = \sum_{i=1}^{N} \alpha_i(X_k)\,\beta_i(X_k), \quad \forall k$ (4)
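The following is a minimal sketch of one way to code the forward-backward recursion, the likelihood of Eq. 4 and a single Baum-Welch update for a discrete HMM with one observation sequence. It is the textbook form, not the adapted algorithm of this work. The function names, the initial parameters and the observation sequence `Y` are assumptions made for illustration, and no scaling is applied, so it is only suitable for short sequences.

```python
def forward(Y, A, B, pi):
    """alpha[k][i]: joint probability of the first k+1 observations and state i."""
    N = len(pi)
    alpha = [[pi[i] * B[i][Y[0]] for i in range(N)]]
    for y in Y[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][y]
                      for j in range(N)])
    return alpha


def backward(Y, A, B):
    """beta[k][i]: probability of the observations after index k, given state i."""
    N = len(A)
    beta = [[1.0] * N]
    for y in reversed(Y[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][y] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta


def likelihood_at(k, alpha, beta):
    """Eq. 4: sum over states of alpha_i * beta_i at time index k."""
    return sum(a * b for a, b in zip(alpha[k], beta[k]))


def baum_welch_step(Y, A, B, pi):
    """One EM iteration; returns updated (A, B, pi) and the current likelihood."""
    N, M, K = len(pi), len(B[0]), len(Y)
    alpha, beta = forward(Y, A, B, pi), backward(Y, A, B)
    lik = likelihood_at(0, alpha, beta)
    # E-step: expected state occupancy (gamma) and expected transitions (xi).
    gamma = [[alpha[k][i] * beta[k][i] / lik for i in range(N)] for k in range(K)]
    xi = [[[alpha[k][i] * A[i][j] * B[j][Y[k + 1]] * beta[k + 1][j] / lik
            for j in range(N)] for i in range(N)] for k in range(K - 1)]
    # M-step: re-estimate the parameters from the expected counts.
    new_pi = list(gamma[0])
    new_A = [[sum(xi[k][i][j] for k in range(K - 1)) /
              sum(gamma[k][i] for k in range(K - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[k][i] for k in range(K) if Y[k] == m) /
              sum(gamma[k][i] for k in range(K))
              for m in range(M)] for i in range(N)]
    return new_A, new_B, new_pi, lik


# Hypothetical observation sequence and initial guess (assumed values).
Y = [0, 0, 1, 0, 1, 1, 2, 1, 2, 2]
A = [[0.6, 0.4], [0.3, 0.7]]
B = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
pi = [0.6, 0.4]

alpha0, beta0 = forward(Y, A, B, pi), backward(Y, A, B)
eq4_values = [likelihood_at(k, alpha0, beta0) for k in range(len(Y))]

history = []
for _ in range(10):
    A, B, pi, lik = baum_welch_step(Y, A, B, pi)
    history.append(lik)
```

The list `eq4_values` holds the same number at every time index, which is what the "for all k" in Eq. 4 expresses. The list `history` records the likelihood before each update and should never decrease, as expected from an EM procedure.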
Multiple outputs cases
Alternatively, a better option is to observe multiple outputs (e.g. vibration, temperature, speed, etc.) simultaneously for better system modelling. In this case, each output produces its own observation sequences. Assume that the system is now monitored by observing both its vibration and its temperature (Fig. 16). Two sets of observation sequences can then be used to model the same system, instead of the single output shown in Fig. 15. In this work, the outputs are considered independent. However, the dependence between the outputs, and between the outputs and the inputs, can be considered in future work.
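One common way to use the independence assumption, sketched below, is to take the joint emission probability of a pair of symbols as the product of the per-output emission probabilities for the same hidden state. This reading (independence given the hidden state) and the two emission matrices are assumptions made for illustration.

```python
# Hypothetical emission matrices, one per output; rows are hidden states.
B_VIBRATION = [
    [0.70, 0.20, 0.10],
    [0.20, 0.60, 0.20],
    [0.10, 0.20, 0.70],
]
B_TEMPERATURE = [
    [0.80, 0.15, 0.05],
    [0.30, 0.50, 0.20],
    [0.05, 0.25, 0.70],
]


def joint_emission(state, symbols, emission_matrices):
    """Probability of observing one symbol per output, given the hidden state.

    Assumes the outputs are independent once the hidden state is known.
    """
    prob = 1.0
    for B, symbol in zip(emission_matrices, symbols):
        prob *= B[state][symbol]
    return prob


emissions = [B_VIBRATION, B_TEMPERATURE]
p_low_low = joint_emission(0, (0, 0), emissions)   # state 0, both symbols 0
totals = [
    sum(joint_emission(s, (v, t), emissions) for v in range(3) for t in range(3))
    for s in range(3)
]
```

Because each matrix row sums to one, the joint probabilities over all symbol pairs also sum to one for every state, so the product can replace the single-output emission term in the forward and backward recursions.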
Numerical Illustration (IOHMM learning)
To demonstrate the proposed methodology, a numerical application is simulated. The application is designed to be complex enough to cover several challenges and to show the importance of the proposed methods. Different sources of uncertainty are handled in the model training (e.g. data uncertainty, small datasets, missing data, model size, operating conditions, etc.). Numerical problems are handled by scaling the small values and by working with logarithms. The training also uses the bootstrap method, which provides confidence in the parameter estimates and gives reasonable results for small datasets.
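The numerical problem mentioned here is underflow: the forward variables shrink geometrically with the sequence length. The sketch below shows one standard remedy, assuming a plain discrete HMM: normalize the forward variables at every step and accumulate the logarithm of the scaling factors. The parameters are hypothetical, and this is not necessarily the exact scaling scheme used in this work.

```python
import math


def forward_unscaled(Y, A, B, pi):
    """Likelihood P(Y | Lambda) from the plain forward recursion."""
    N = len(pi)
    alpha = [pi[i] * B[i][Y[0]] for i in range(N)]
    for y in Y[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][y]
                 for j in range(N)]
    return sum(alpha)


def forward_scaled(Y, A, B, pi):
    """Log-likelihood from the forward recursion with per-step normalization."""
    N = len(pi)
    log_lik = 0.0
    alpha = None
    for y in Y:
        if alpha is None:
            alpha = [pi[i] * B[i][y] for i in range(N)]
        else:
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][y]
                     for j in range(N)]
        c = sum(alpha)                    # scaling factor for this step
        log_lik += math.log(c)            # the product of factors is the likelihood
        alpha = [a / c for a in alpha]    # rescale so the values sum to one
    return log_lik


# Hypothetical parameters (assumed values).
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]
pi = [0.5, 0.5]

Y_short = [0, 1, 2, 1, 0]
Y_long = [k % 3 for k in range(3000)]

short_plain = forward_unscaled(Y_short, A, B, pi)
short_log = forward_scaled(Y_short, A, B, pi)
long_plain = forward_unscaled(Y_long, A, B, pi)   # underflows to 0.0
long_log = forward_scaled(Y_long, A, B, pi)       # remains finite
```

On the short sequence both versions agree; on the long one the unscaled likelihood collapses to zero in double precision while the log-likelihood stays usable.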
This application is assumed to have two observation outputs and one operating condition with two operating modes. For example, if the speed of a system is considered as the operating condition, then the two operating modes can be high speed and low speed. The two operating modes give two stochastic matrices that describe two different sets of transition probabilities for the system's degradation. The degradation of the system is assumed to have three hidden states (good, moderate, bad) in the simulations, to keep the computation simple. Each state emits two outputs, whose probabilities are represented by two emission matrices. Three discrete values are considered as the emitted symbols.
The goal is to use a simulated dataset to train the model and estimate its parameters while accounting for different sources of uncertainty and constraints. The training is done in three phases, each addressing a different issue.
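The structure described above can be sketched as a small simulator. All numerical values, the mode labels `"low"` and `"high"`, the left-right shape of the transition matrices, the choice of three symbols per output, and the convention that the input at one instant drives the transition to the next instant are assumptions for illustration, not the parameters used in the thesis.

```python
import random

# One transition matrix per operating mode (hypothetical values, 3 hidden states).
A_BY_MODE = {
    "low":  [[0.95, 0.04, 0.01], [0.00, 0.93, 0.07], [0.00, 0.00, 1.00]],
    "high": [[0.85, 0.12, 0.03], [0.00, 0.80, 0.20], [0.00, 0.00, 1.00]],
}

# One emission matrix per output, each over three discrete symbols.
EMISSIONS = [
    [[0.80, 0.15, 0.05], [0.20, 0.60, 0.20], [0.05, 0.15, 0.80]],
    [[0.70, 0.20, 0.10], [0.25, 0.50, 0.25], [0.10, 0.20, 0.70]],
]


def simulate_iohmm(inputs, a_by_mode, emissions, rng, start_state=0):
    """Simulate hidden states and output pairs for a sequence of operating modes."""
    state = start_state
    states, outputs = [], []
    for mode in inputs:
        states.append(state)
        # Each output draws its symbol from its own emission matrix.
        outputs.append(tuple(
            rng.choices(range(len(B[state])), weights=B[state])[0]
            for B in emissions
        ))
        # The current operating mode selects the transition matrix.
        row = a_by_mode[mode][state]
        state = rng.choices(range(len(row)), weights=row)[0]
    return states, outputs


inputs = ["low"] * 30 + ["high"] * 30
states, outputs = simulate_iohmm(inputs, A_BY_MODE, EMISSIONS, random.Random(0))
```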
– Modeling under multiple operating conditions and output observations. This is the classical problem, in which the dataset is assumed to be complete, with no incomplete or missing data sequences. The adapted algorithms (Eq. 1 to Eq. 5) are used in this training phase.
– Modeling under missing data. Missing data is a typical challenge in data-driven approaches. In this phase, a solution is proposed to handle datasets with missing elements. The adapted algorithms are modified again in this phase to manage the missing data.
– Using the bootstrap method to gain confidence in the estimated model. The bootstrap method can provide a measure of confidence for the estimated parameters even from a small amount of data. The amount of data is usually small in diagnostic and prognostic applications. In this phase, the bootstrap method is implemented to train the model from a small amount of data.
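The bootstrap idea in the third phase can be sketched as follows: resample the sequences with replacement, re-estimate the parameter on each resample, and read a percentile interval from the spread of the estimates. To keep the example short, the estimator below is a simple counting estimator on fully observed state sequences. It is a stand-in, assumed for illustration, for the model training used in this work, and the dataset is made up.

```python
import random


def estimate_stay_probability(sequences):
    """Fraction of transitions out of state 0 that remain in state 0."""
    stay = total = 0
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            if prev == 0:
                total += 1
                stay += (cur == 0)
    return stay / total


def bootstrap_interval(sequences, estimator, n_boot, rng, level=0.95):
    """Percentile bootstrap interval, resampling whole sequences."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(sequences) for _ in sequences]
        estimates.append(estimator(resample))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical small dataset: every sequence starts in state 0.
sequences = [
    [0, 0, 0, 1, 1, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 2],
    [0, 1, 1, 1, 2, 2],
    [0, 0, 0, 1, 2, 2],
    [0, 0, 1, 2, 2, 2],
]

point = estimate_stay_probability(sequences)
lower, upper = bootstrap_interval(
    sequences, estimate_stay_probability, 1000, random.Random(0)
)
```

Resampling whole sequences rather than individual observations keeps the temporal order inside each sequence intact, which matters for any Markov-type model.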
Modeling under missing data
Phase two addresses the limitation of the first phase by taking missing data into account in model training. Sensors commonly misread observations for various reasons. Misread observations include both missing measurements and sensor saturation. The main contribution of this phase is a technique, based on the adapted IOHMM algorithms, that handles missing data. Typically, if a dataset contains data sequences with missing elements, those sequences are excluded from the analysis. As a result, the dataset becomes smaller and valuable information may be lost. This strategy is known as list-wise deletion or case-wise deletion (Allison, 2001), and it is poorly suited to small datasets. The method followed in this section keeps the sequences with missing data in the analysis by simulating the missing portion of each sequence to produce a complete dataset. Maximum likelihood is then applied to estimate the IOHMM parameters, which offers substantial improvements over list-wise deletion.
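To make the contrast with list-wise deletion concrete, the sketch below keeps incomplete sequences by marginalizing over the missing symbols in the forward recursion. This is a common alternative for plain HMMs and is named here as a swapped-in technique: it is not the simulation-based completion described above. Missing entries are marked with `None`, and all parameters and data are hypothetical.

```python
def forward_likelihood(Y, A, B, pi):
    """P(Y | Lambda), treating None entries as missing observations."""
    N = len(pi)

    def emit(state, y):
        # A missing symbol contributes a factor of 1 (sum over all symbols).
        return 1.0 if y is None else B[state][y]

    alpha = [pi[i] * emit(i, Y[0]) for i in range(N)]
    for y in Y[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * emit(j, y)
                 for j in range(N)]
    return sum(alpha)


# Hypothetical parameters and dataset (assumed values).
A = [[0.8, 0.2], [0.0, 1.0]]
B = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]
pi = [1.0, 0.0]

dataset = [
    [0, 0, 1, 2],
    [0, None, 1, 2],        # one missing measurement
    [0, 1, 1, None, 2],     # one missing measurement
]

# List-wise deletion keeps only the fully observed sequences.
complete_only = [seq for seq in dataset if None not in seq]

# Marginalization keeps every sequence and still yields a likelihood.
likelihoods = [forward_likelihood(seq, A, B, pi) for seq in dataset]

# Check: a missing symbol equals the sum over all possible symbols at that position.
summed = sum(forward_likelihood([0, m, 1, 2], A, B, pi) for m in range(3))
```

With list-wise deletion only one of the three sequences would remain, whereas marginalization lets all three contribute to the likelihood used in training.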
Table of contents :
1 State of the Art
1.1 Maintenance
1.2 Degradation
1.3 Diagnostic
1.4 Prognostic
1.5 PHM approaches
1.5.1 Model-based prognostic approaches
1.5.2 Data driven prognostic approaches
1.5.3 Hybrid approaches
1.5.4 Conclusion
1.6 Model types
1.6.1 Deterministic models
1.6.2 Stochastic models
1.6.3 Hybrid models
1.6.4 Conclusion
1.7 Stochastic models
1.7.1 Fuzzy Logic models
1.7.2 Neural Networking models
1.7.3 Markov Models
1.7.4 Conclusion
2 Background of the Model from MC to IOHMM
2.1 Markov Chain notations
2.2 Hidden Markov Model
2.2.1 HMM Structure
2.2.2 The Forward-backward (FB) algorithm
2.2.3 The Baum Welch algorithm
2.2.4 The Viterbi algorithm
2.3 Input-Output Hidden Markov Model
2.4 Conclusion
3 The First Contribution: Learning Model Parameters
3.1 The learning algorithms adaptation
3.1.1 Multiple input conditions
3.1.2 Multiple inputs case
3.1.3 Multiple sequences case
3.1.4 Multiple outputs cases
3.1.5 Normalization
3.1.6 The Baum Welch adaptation
3.2 Numerical Illustration (IOHMM learning)
3.2.1 Modeling under multiple operating conditions
3.2.2 Modeling under missing data
3.2.3 Modeling by using the bootstrap method
3.3 Conclusion
4 The Second Contribution: Diagnostic and Prognostic
4.1 Diagnostic
4.2 Prognostic: RUL prediction
4.3 Offline and Online Operation
4.4 Application
4.4.1 The first application: Diagnostic and prognostic under multiple operating conditions
4.4.2 The second application: Managing the RUL
4.5 Conclusion
5 The Third Contribution: Estimating RUL of Aircraft
5.1 C-MAPSS
5.2 Model Structure
5.2.1 The operating conditions
5.2.2 Degradation indicator
5.2.3 Emitted symbols
5.2.4 Defined IOHMM
5.3 Model evaluation
5.4 Cross Validation
5.5 Results
5.5.1 Parameter Learning
5.5.2 Diagnostic: current health state estimation
5.5.3 Prognostic: the meantime RUL estimation
5.5.4 Benchmarking Between Different Models
5.5.5 Cross Validations
5.6 Conclusion
6 The Fourth Contribution: Estimating RUL of Structured Systems
6.1 Model construction for prognosing the system RUL
6.1.1 Series structure of two components with HMM models
6.1.2 Series structure of two components with IOHMM models
6.1.3 Parallel structure of two components with HMM models
6.1.4 Parallel structure of two components with IOHMM models
6.1.5 A drinking water network illustration
6.1.6 Diagnostic
6.2 Application
6.2.1 Data simulation
6.2.2 Model Learning
6.2.3 Diagnostic
6.2.4 Prognostic
6.3 Conclusion
Conclusion
Perspectives
Reference