Online classification for spalling detection and vibratory behavior monitoring

Vibration analysis is the most widely used tool in industrial applications for assessing a machine's health condition. Bearings, however, are very sensitive and require special attention. Vibration analysis employs various signal processing methods, such as spectral analysis, time-scale analysis, and time-frequency analysis; these methods are used to analyze the vibratory behavior of bearings by monitoring the evolution of statistical indicators. However, diagnosing a bearing from traditional features alone is not sufficient to ensure an effective or reliable assessment of the component's health condition. This paper proposes a multi-feature online dynamic classification as a new method for fault detection and health condition monitoring of bearings. The technique uses multiple features, including traditional features extracted from the raw signal, two special features extracted by wavelet analysis, and the spectral kurtosis, coupled with a nonlinear principal component analysis and a dynamic classification that capitalizes on the hidden information in the time evolution of the features. In this article, we introduce the different measures and techniques used to characterize the health state of rolling bearings, then we deploy a methodology that uses dynamic classification to detect early defects. To ensure an almost continuous surveillance, this methodology is based on real-time analysis and uses specific statistical indicators adapted to the experimental bench. The degradation is then monitored through the class that results from the state of degradation. New parameters, such as the speed, the position, and the shape of the class, are discussed as indicators of the state of damage. The suggested methodology is validated by analyzing several fatigue tests carried out on a fatigue bench with thrust ball bearings referenced SNR51207.


Introduction
Bearings are prevalent components in domestic and industrial applications. Bearings, however, are one of the most easily damaged accessories in a rolling machine; in fact, bearing failure has a great impact on the maintenance cost and the machine's downtime. Therefore, a correct and continuous monitoring of their health condition is vital for maintaining a smooth and uninterrupted functioning.
Vibration analysis is the most common and reliable method for maintaining a closer look at machine health condition. Actually every operating faulty bearing exhibits a change in its vibratory behavior by demonstrating non-stationary characteristics. The crux of bearing diagnosis is detecting and monitoring those non-stationary characteristics.
Corresponding author: sanaa.kerroumi@univ-reims.fr
Traditional diagnosis techniques based on vibration analysis extract statistical features from the raw signal in its temporal and spectral forms [1,2]; these features are called "fault indicators". However, because of all the nonlinear factors that affect a rotating machine and add to the complexity of the system (such as speed, loads, stiffness, and friction, among others) [3], effective diagnosis or monitoring techniques cannot depend on traditional fault indicators alone [4,5]. Hence, there is great interest in finding alternative and complementary tools, the majority of which originate from two domains: signal processing techniques and data mining methods.
Among all the time-frequency analysis methods, wavelets have been established as the most widespread tool in many areas of signal processing, due to their flexibility, their efficient computational implementation [6], and their excellent capability of detecting transients. In addition to their introduction for general vibration analysis, specific case studies have been reported for bearing fault detection [7,8], as well as for other machine components.
Article published by EDP Sciences
Aside from the original purpose of the wavelet transform as a non-stationary analysis method, another application of wavelets in machine fault diagnosis has emerged: wavelets have become a fault feature extraction technique [9].
Spectral kurtosis (SK) [10] has been proven to be efficient in detecting incipient faults buried in large noise, and it has demonstrated its efficiency as a fault indicator [11].
The traditional features extracted from time and frequency analysis will be combined with special features extracted by wavelet analysis and spectral kurtosis. Naturally, the combination will exhibit high correlation and redundancy; hence the need for a nonlinear component analysis, namely kernel principal component analysis (KPCA) [12].
Once the right fault indicators are extracted and processed, the fault detection and performance assessment of the machine becomes a pattern recognition problem. For this purpose, various classification methods have been used, namely artificial neural networks (ANN) [5], decision trees [13], support vector machines (SVM) [4,14], and k-means [15], among others. These methods showed more or less satisfactory results, but they all neglected one aspect of the fault indicators extracted from vibratory signals: like any data issued from an evolving system, these features change constantly over time. Therefore, using a static classification method deprives us of the information conveyed by the temporal evolution of the indicators, such as the speed with which the classes are created, and the form and surface of the classes.
This paper is organized as follows. In Section 2, a brief technical review of the employed signal processing techniques is given. In Section 3, the dynamic classification method is presented. In Section 4, the suggested diagnosis process is described in detail, together with the technical description of the experimental fatigue bench and the discussion of the results. Finally, in Section 5, a brief conclusion and future perspectives are given.

Time domain features
The traditional time domain analysis calculates characteristic features from time waveform signals, both descriptive statistics (such as mean, peak, standard deviation, and crest factor) and high order statistics (Tab. 1), such as root mean square (RMS), skewness, and kurtosis, among others [1][2][3]. Some of these features are more sensitive to faults than others, namely RMS, kurtosis, crest factor, and impulse factor.
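As an illustration, the four most fault-sensitive indicators named above can be computed as in the following minimal Python sketch. The function name and the exact definitions (population moments, non-excess kurtosis, impulse factor taken as peak over mean absolute value) are assumptions based on common usage, since Table 1 is not reproduced in this text.

```python
import math

def time_domain_features(x):
    """Return RMS, kurtosis, crest factor and impulse factor of a signal.

    Assumed definitions (not taken from the paper's Table 1):
    kurtosis = fourth central moment / variance^2 (3 for a Gaussian),
    crest factor = peak / RMS, impulse factor = peak / mean(|x|).
    """
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    variance = sum((v - mean) ** 2 for v in x) / n
    fourth_moment = sum((v - mean) ** 4 for v in x) / n
    peak = max(abs(v) for v in x)
    mean_abs = sum(abs(v) for v in x) / n
    return {
        "rms": rms,
        "kurtosis": fourth_moment / variance ** 2,
        "crest_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
    }
```

For a pure sine wave these indicators take the textbook values (kurtosis 1.5, crest factor √2); impulsive content such as a spalling impact would be expected to raise the kurtosis, crest factor, and impulse factor well above them.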

Frequency domain features
Frequency domain analysis is based on the transformed signal in frequency domain. Its main advantage over time domain analysis is its ability to isolate certain frequency components of interest that enable the localization of bearing faults. The same descriptive statistics can be extracted from the transformed signal (Tab. 1), such as RMS frequency, frequency standard deviation, and central frequency, among others. To identify the location of the defect, other indicators such as SPRI or SPRO can be calculated. Since the main objective of this paper is to detect the defect and monitor its evolution, the indicators of fault location will not be used.
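The frequency domain indicators mentioned above can be sketched as follows. The definitions used here (power-weighted central frequency, RMS frequency, and frequency standard deviation) are the usual ones and are an assumption, since Table 1 is not reproduced in this text; the function names and the naive DFT are illustrative only.

```python
import cmath
import math

def power_spectrum(x, fs):
    """One-sided power spectrum via a naive DFT (fine for short records)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def frequency_domain_features(freqs, spectrum):
    """Assumed definitions, weighting each frequency line f_j by s_j."""
    total = sum(spectrum)
    central = sum(f * s for f, s in zip(freqs, spectrum)) / total
    rms_freq = math.sqrt(sum(f * f * s for f, s in zip(freqs, spectrum)) / total)
    std_freq = math.sqrt(
        sum((f - central) ** 2 * s for f, s in zip(freqs, spectrum)) / total
    )
    return {"central_frequency": central, "rms_frequency": rms_freq,
            "std_frequency": std_freq}
```

With these definitions the three indicators are linked by rms_frequency² = central_frequency² + std_frequency², which gives a quick consistency check on any implementation.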

Wavelet analysis
The wavelet transform is a linear transform whose physical principle is to use a series of oscillating functions with different frequencies as window functions $\psi_{a,b}(t)$ in order to scan and translate the signal $x(t)$.
The continuous wavelet transform (CWT) is defined as follows:
$$W(a,b) = a^{-1/2} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \qquad (1)$$
The wavelet coefficient $W(a,b)$ measures the similarity between the signal $x(t)$ and the analyzing wavelet $\psi(t)$ at different scales, as defined by the parameter $a$, and at different time positions, as defined by the parameter $b$. The factor $a^{-1/2}$ is used for energy preservation [9]. Wavelets can also be used as de-noising methods, filters, and feature extraction methods. In this paper, two indicators are extracted from the signal using wavelets: $W_{RMS}$, the RMS of the wavelet coefficient spectrum, and $PCWT$, the sum of all the spectrum lines:
$$W_{RMS} = \sqrt{\frac{\sum_{j=1}^{K} f_j^{2}\, WS(j)}{\sum_{j=1}^{K} WS(j)}} \qquad (2)$$
$$PCWT = \sum_{j=1}^{K} WS(j) \qquad (3)$$
where $WS(j)$ is the spectral density of the maximum coefficients of the continuous wavelet transform for $j = 1, 2, \ldots, K$, $K$ is the number of spectrum lines, and $f_j$ is the frequency value of the $j$th spectrum line. The unit of $W_{RMS}$ is Hz and the unit of $PCWT$ is mV$^2$.
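To make the definition of the wavelet coefficient concrete, the following sketch discretizes the CWT integral directly, with the factor a^(-1/2) in front. The mother wavelet is not specified in the text, so a real Morlet-type wavelet exp(-t²/2)·cos(ω₀t) with ω₀ = 5 is assumed here; the function names and the truncation of the wavelet beyond four standard deviations are also illustrative choices.

```python
import math

def morlet(t, omega0=5.0):
    """Real Morlet-type wavelet (an assumed choice of mother wavelet)."""
    return math.exp(-0.5 * t * t) * math.cos(omega0 * t)

def cwt(x, dt, scales, omega0=5.0):
    """Direct discretization of W(a, b) = a^(-1/2) * integral x(t) psi((t-b)/a) dt.

    Returns one row of coefficients per scale; b runs over the sample times.
    """
    n = len(x)
    coefficients = []
    for a in scales:
        row = []
        for ib in range(n):
            b = ib * dt
            acc = 0.0
            for it in range(n):
                u = (it * dt - b) / a
                if abs(u) <= 4.0:  # the wavelet is negligible further out
                    acc += x[it] * morlet(u, omega0)
            row.append(acc * dt / math.sqrt(a))
        coefficients.append(row)
    return coefficients

def scale_energy(row):
    """Sum of squared coefficients at one scale."""
    return sum(c * c for c in row)
```

With this wavelet, a scale a responds most strongly to components near the frequency ω₀/(2πa), so a 10 Hz tone concentrates its energy around a ≈ 0.08 s.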

Spectral kurtosis
The spectral kurtosis (SK) is an extension of the concept of kurtosis, a global value, to a function of frequency that indicates the distribution of signal impulsiveness in the frequency domain. This fourth-order statistic is a spectral descriptor originally developed to overcome the inefficiency of the power spectral density (PSD) and to detect and characterize transients in a signal [1,10] by computing the kurtosis at each frequency line, so as to discover the presence of hidden non-stationarities and to indicate in which frequency bands they occur.
The spectral kurtosis can be calculated by taking the fourth power of $|H(t,f)|$ at each time and averaging its value along the record, then normalizing it by the square of the mean square value. The constant 2 is then subtracted from this ratio (2 instead of 3, as in the classical kurtosis, because $H(t,f)$ is complex):
$$SK(f) = \frac{\left\langle |H(t,f)|^{4} \right\rangle}{\left\langle |H(t,f)|^{2} \right\rangle^{2}} - 2 \qquad (4)$$
where $\langle\cdot\rangle$ denotes averaging over time. The result is zero for a Gaussian signal [16].
The spectral kurtosis provides a means of determining which frequency band contains a signal of maximum impulsivity.
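The calculation described above can be sketched as follows. The text does not say how H(t, f) is obtained; this sketch assumes a short-time Fourier transform with non-overlapping rectangular frames, and the function names, frame length, and test signals are illustrative only.

```python
import cmath
import math
import random

def dft_frame(frame):
    """Naive one-sided DFT of one short frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def spectral_kurtosis(x, frame_len=32):
    """SK(f) = <|H|^4> / <|H|^2>^2 - 2, averaged over non-overlapping frames."""
    frames = [x[i:i + frame_len]
              for i in range(0, len(x) - frame_len + 1, frame_len)]
    spectra = [dft_frame(frame) for frame in frames]
    m = len(spectra)
    sk = []
    for k in range(frame_len // 2 + 1):
        mean_sq = sum(abs(s[k]) ** 2 for s in spectra) / m
        mean_fourth = sum(abs(s[k]) ** 4 for s in spectra) / m
        sk.append(mean_fourth / mean_sq ** 2 - 2.0)
    return sk

def mean_inner_sk(sk):
    """Average SK over the bins strictly between DC and Nyquist.

    The DC and Nyquist bins of a real signal are real-valued, so the
    constant 2 does not apply to them.
    """
    inner = sk[1:-1]
    return sum(inner) / len(inner)

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(8192)]
# Same noise with a strong periodic impact added (hypothetical defect signature).
impulsive = [v + (40.0 if i % 512 == 0 else 0.0) for i, v in enumerate(noise)]
```

On the Gaussian record the averaged SK stays close to zero, while the record with periodic impacts yields clearly positive values, which is the behavior that makes SK useful for finding incipient faults buried in noise.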

Kernel principal component analysis KPCA
In view of the high correlation and redundancy exhibited by the matrix formed of all the features extracted from the signal, methods need to be applied to correct this problem and increase the accuracy of the diagnosis [4]. Principal component analysis (PCA) is a well-known linear method for feature extraction and dimensionality reduction; it reduces the redundancy by calculating the eigenvectors of the covariance matrix of the input. Traditional PCA only allows linear dimensionality reduction. However, if the data have more complicated structures that cannot be simplified in a linear subspace, traditional PCA becomes invalid. To overcome the linearity of PCA, several variations have been introduced. One such method that is directly related to PCA is called kernel PCA (KPCA) [12]. The basic idea of KPCA is to first map the $n$ input data points $x_i \in \mathbb{R}^d$ into some new feature space $F$, typically via a nonlinear function $\Phi$ (here a polynomial of degree $p$) (Eq. (5)), and then to perform linear PCA in the mapped space, whose dimension is assumed to be larger than the number of training samples:
$$\Phi : \mathbb{R}^d \to F, \qquad x \mapsto \Phi(x) \qquad (5)$$
The PCA can be computed such that the vectors $\Phi(x_i)$ appear only within scalar products. Thus, the explicit mapping (Eq. (5)) can be omitted. Instead, a kernel function $k(x, y)$ replaces the scalar product $(\Phi(x)\cdot\Phi(y))$. In kernel PCA, an eigenvector $V$ of the covariance matrix in $F$ is a linear combination of the points $\tilde{\Phi}(x_i)$ (Eq. (6)):
$$V = \sum_{i=1}^{n} \alpha_i\, \tilde{\Phi}(x_i) \qquad (6)$$
The vectors $\tilde{\Phi}(x_i)$ are chosen such that they are centered on the origin in $F$:
$$\tilde{\Phi}(x_i) = \Phi(x_i) - \frac{1}{n}\sum_{r=1}^{n} \Phi(x_r) \qquad (7)$$
The $\alpha_i$ are the components of a vector $\alpha$, which is an eigenvector of the matrix
$$\tilde{K}_{ij} = \left(\tilde{\Phi}(x_i)\cdot\tilde{\Phi}(x_j)\right)$$
To compute $\tilde{K}$, $\tilde{\Phi}$ is substituted according to equation (7), so that $\tilde{K}_{ij}$ becomes a function of the kernel matrix $K_{ij} = k(x_i, x_j)$:
$$\tilde{K}_{ij} = K_{ij} - \frac{1}{n}\sum_{r=1}^{n} K_{ir} - \frac{1}{n}\sum_{r=1}^{n} K_{rj} + \frac{1}{n^{2}}\sum_{r,s=1}^{n} K_{rs}$$
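The centering step can be illustrated with a short sketch that builds the kernel matrix and centers it in feature space without ever evaluating Φ explicitly. The homogeneous polynomial kernel k(x, y) = (x·y)^p is an assumption (the text only states that a polynomial of degree p is used), and the function names are hypothetical.

```python
def poly_kernel(x, y, p=2):
    """Homogeneous polynomial kernel (x . y)^p (assumed form)."""
    return sum(a * b for a, b in zip(x, y)) ** p

def centered_kernel_matrix(points, p=2):
    """Kernel matrix of the feature-space-centered data.

    Applies K~_ij = K_ij - mean_r K_ir - mean_r K_rj + mean_rs K_rs.
    """
    n = len(points)
    k = [[poly_kernel(xi, xj, p) for xj in points] for xi in points]
    row_mean = [sum(row) / n for row in k]  # equals the column mean: K is symmetric
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]
```

The principal components are then obtained from the eigenvectors of this centered matrix; in practice that step would typically be delegated to a numerical library rather than written by hand.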

Dynamic classification
The main objective of pattern recognition or classification is the study of how machines can observe the environment, learn to distinguish the interesting patterns of their background, ignore the non-informing ones, and make sound and reasonable decisions about the categories of patterns [17][18][19].
Pattern recognition is a two-stage process: preprocessing and classification. The preprocessing is used to find the necessary but minimum set of features to build the representation space. Therefore, a pattern can be seen as a point in the representation space. Groups of patterns can be formed to represent a state or a functioning mode of the system.
The difference between static classification (or static pattern recognition) methods and dynamic ones resides in the representation of classes. In static classification, classes are static; in dynamic classification, classes evolve over time [20]: they may be created if needed or deleted if the need disappears, they may move or rotate, and similar classes can be merged to form new ones. This change in the classes reflects a change in the data, which in turn means a change in the observed system.
Static classification, however, omits the changes that the data, and therefore the classes, exhibit over time, which results in one inert distribution of data into static classes. Indeed, a static classification of changing data cannot be considered a thorough representation of a real-life system that naturally changes over time, switching from normal to abnormal functioning.
Another key point is that the classes of an evolving system are dynamic; their characteristics change over time, either in a slow, progressive way or in an abrupt way. The change in the behavior of the classes is directly linked to the state of the functioning system. In the bearing monitoring case, an abrupt change is always associated with the existence of a fault.
In dynamic classification, the right classifier has to be capable of detecting all changes in the classes' behavior, such as fusion, drift, creation and splitting, among others. The classifier has to be able to adjust its parameters over time.
There are three types of classifiers: supervised, unsupervised, and semi-supervised. The adequate one depends on the prior knowledge of the system and of the behavior of the classes. When labeled patterns are accessible (i.e. patterns with their class assignments), the classifier is supervised; in this case, pattern recognition is a two-stage process: learning and classification [19,20]. On the contrary, when no prior knowledge of the classes is available, an unsupervised classifier is used, and it relies on similarities to build the classes. When only a few labeled patterns are available, the classifier can be semi-supervised; this type of classifier combines the first two types, as it uses the known information to estimate the characteristics of the classes (supervised process) and unsupervised learning to detect new classes and learn their membership functions.
Since the prior knowledge of the system (the vibratory behavior of bearings) is limited, if available at all, the choice of classifiers is narrowed down to semi-supervised and unsupervised ones.
In this paper, an unsupervised static classifier has been chosen and then modified to incorporate dynamic classes and real-time classification.

Dynamic density-based spatial clustering of applications with noise (D-DBSCAN)
In this article, an unsupervised dynamic method based on DBSCAN is developed to monitor the evolution of bearing health condition. It was interesting to develop DBSCAN for the case of real time monitoring since it is a simple but efficient unsupervised classifying method.
The choice of DBSCAN as our unsupervised classifier was made with respect to the characteristics of the bearing vibration signal; a comparison between different unsupervised classifiers showed that the separation of the classes is more accurate when the density is used as the separator rather than a simple distance, hence the choice of DBSCAN.

Density-based spatial clustering of applications with noise (DBSCAN)
Density estimation methods are among the most likely methods to detect data clusters of complex shapes. These methods are borrowed from statistics, and their basic idea is to locate high density regions and separate them from each other with low density regions. Each region is a cluster defined as a densely connected component (Fig. 1).
The DBSCAN algorithm has the advantage of finding the number of clusters by itself. It can also handle any type of data and identify outliers, which are not assigned to any cluster [16][17][18]. Moreover, it has few parameters to adjust and it is insensitive to noise [21,22]. Using DBSCAN requires defining two parameters: Eps and MinPts. Eps defines the neighborhood radius, i.e. the maximum distance between two neighboring points of the same cluster. MinPts defines the density threshold, i.e. the minimum number of objects in the neighborhood of a point. DBSCAN clustering is initialized by the arbitrary choice of a point p; it then searches for all the patterns that lie at a distance less than or equal to Eps from p. If the number of patterns found is greater than or equal to MinPts, p is considered part of a cluster. The algorithm then goes through the neighborhood step by step to find the full set of points in the cluster.
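The procedure described above corresponds to the standard (static) DBSCAN algorithm, which can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in this work; the function names are hypothetical and the neighborhood search is a brute-force scan.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep growing
                queue.extend(j_neighbors)
    return labels
```

For example, two tight groups of four points and one distant point, clustered with `eps=0.2` and `min_pts=3`, give two clusters and one outlier.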
Like any other method that requires the initialization of parameters, DBSCAN produces classification results that depend on the initial values of Eps and MinPts, and especially of Eps. However, a heuristic called the k-distance graph can help to find a suitable value of Eps [23,24].
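The k-distance heuristic can be sketched as follows: for every point, the distance to its k-th nearest neighbor is computed, and the values are sorted in descending order. Eps is then usually read off at the "knee" of the resulting curve. The function name is hypothetical, and taking k equal to MinPts is a common convention rather than something stated in the text.

```python
import math

def k_distance_curve(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted descending."""
    curve = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        curve.append(distances[k - 1])
    return sorted(curve, reverse=True)
```

Points to the left of the knee (large k-distance) are candidates for noise, while the plateau to its right gathers the points that lie inside dense regions.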

Dynamic density-based spatial clustering of applications with noise (D-DBSCAN)
Generally, a pattern recognition method has to be selected according to the system to which it is applied. However, DBSCAN is a general-purpose classification method; for this reason, the dynamic method proposed here can also be applied for other purposes.
D-DBSCAN is developed in order to detect the evolution of the classes and to adapt its membership functions accordingly. The algorithm of the proposed dynamic classification method is composed of three phases, which are described in Figure 2.

Classification phase
As the chosen classifier is unsupervised, there is no learning phase; the method starts right away with the classification. The classification proceeds as in ordinary DBSCAN, except that whenever a pattern is classified, the following parameters of its class are updated:
• Surface $S_C$ of each class C according to all attributes, where n is the number of attributes and $x_{Ck}$ is the kth pattern of the class C.
• Density $D_C$ of each class C, i.e. the number of patterns per surface unit: $D_C = N_P / S_C$, where $N_P$ is the number of patterns classified in the class C.
• Slope $SG_{CA}$ of each class C according to the gravity center of the previous class, where $CG_i$ and $CG_j$ are the gravity centers of class i and class j, in this order. The slope is updated every time a pattern is classified in the class C, since the gravity center must be updated for every newly added pattern in the class.
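The incremental update of the class parameters can be sketched as follows. The exact surface formula is not recoverable from this text, so the sketch assumes, purely for illustration, that the surface is the volume of the axis-aligned bounding box of the class over all attributes; the density is then the number of patterns divided by that surface, as stated above. The slope is omitted because its definition is not available here. The class name and its attributes are hypothetical.

```python
class ClassStats:
    """Running statistics of one class, updated each time a pattern is added."""

    def __init__(self, n_attributes):
        self.count = 0
        self.mins = [float("inf")] * n_attributes
        self.maxs = [float("-inf")] * n_attributes
        self.sums = [0.0] * n_attributes

    def add(self, pattern):
        """Update extent, count and running sums with one new pattern."""
        self.count += 1
        for i, value in enumerate(pattern):
            self.mins[i] = min(self.mins[i], value)
            self.maxs[i] = max(self.maxs[i], value)
            self.sums[i] += value

    @property
    def surface(self):
        """ASSUMPTION: bounding-box volume over all attributes."""
        volume = 1.0
        for lo, hi in zip(self.mins, self.maxs):
            volume *= hi - lo
        return volume

    @property
    def density(self):
        """Number of patterns per surface unit."""
        return self.count / self.surface if self.surface > 0 else float("inf")

    @property
    def gravity_center(self):
        """Mean of the patterns classified so far."""
        return [s / self.count for s in self.sums]
```

Because only running minima, maxima, and sums are stored, each update costs a constant amount of work per attribute, which suits the online setting described here.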

Detection and adaptation phase
The class that receives a new pattern x is the one that may be evolving. Before the pattern x is classified in a class C, two conditions are checked. The first is the neighborhood radius (Eps). If it is not satisfied, MinPts is checked: if it is not satisfied either, the pattern x is temporarily assigned to the nearest class; if MinPts is reached, a new class is created with all the patterns in the neighborhood of x. All the class parameters (surface, density, and slope) are then updated for the old class to which the pattern x was temporarily assigned, and initialized for the newly created class.

Validation phase
In this phase, two types of class evolution are treated: fusion and deletion.
Two classes must be merged if the overlapping surface between them is higher than a user-defined threshold th_f. The value of th_f is specified as a percentage (th_f = 10 means that the overlapping surface is equal to 10% of the surface of one of the two classes).
For deletion, another threshold th_d is specified; it stands for the number of patterns classified since the class C received its last pattern (th_d = 30 means that 30 patterns have been classified since the class received its last pattern).
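The two validation rules can be sketched as simple predicates. The way the overlapping surface is measured is not detailed in this text; the sketch assumes that each class is summarized by an axis-aligned bounding box given as a pair (lower corner, upper corner) and that the overlap is expressed relative to the smaller of the two classes. Both choices, and all function names, are assumptions made for illustration.

```python
def box_volume(box):
    """Volume of an axis-aligned box given as (lower_corner, upper_corner)."""
    lower, upper = box
    volume = 1.0
    for lo, hi in zip(lower, upper):
        volume *= max(hi - lo, 0.0)
    return volume

def overlap_percent(box_a, box_b):
    """Overlap volume as a percentage of the smaller box (assumed convention)."""
    lower = [max(a, b) for a, b in zip(box_a[0], box_b[0])]
    upper = [min(a, b) for a, b in zip(box_a[1], box_b[1])]
    smaller = min(box_volume(box_a), box_volume(box_b))
    if smaller <= 0:
        return 0.0
    return 100.0 * box_volume((lower, upper)) / smaller

def should_merge(box_a, box_b, th_f):
    """Fusion rule: merge when the overlap exceeds th_f percent."""
    return overlap_percent(box_a, box_b) > th_f

def should_delete(patterns_since_last_update, th_d):
    """Deletion rule: the class has received no pattern for th_d patterns."""
    return patterns_since_last_update >= th_d
```

For instance, two unit-offset 2 × 2 squares overlap by 25% of their surface, so they would be merged with th_f = 20 but kept apart with th_f = 30.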

Experimental device
The proposed techniques have been implemented on an experimental bench (Fig. 3) consisting of a motor driving the shaft in rotation at 1800 rpm, a bearing hosting one of the two raceways of the tested thrust ball bearing, a piston on which the other raceway is placed, and a hydraulic jack that exerts the preload by means of the piston. Two piezoelectric accelerometers are placed radially and axially on the frame that holds the race of the thrust ball bearing.
The operation consists of placing one race of the thrust ball bearing on the fixed landing, and the other race with the balls on the loose bearing. The plunger is then actuated to bring the races into contact. The pressure is adjusted to obtain a load of 30 000 N. The system is rotated by the motor until a spalling defect occurs on the track of the thrust ball bearing (signaled by a characteristic noise). At this stage, we carried out a visual inspection of the defect, whose size grows steadily (Fig. 3). After the inspection, the thrust ball bearing is returned to its place and the system is restarted. This is repeated until the fault is considered too large or the thrust ball bearing is ruined. The tested bearing is a single-direction thrust ball bearing, reference SNR51207. It has 12 balls and an ISO dynamic load capacity C of 39 000 N (Fig. 4).
To implement the classification methods, we were particularly interested in a trial that lasted 120 h.

The online diagnosis and monitoring process
The diagnosis and monitoring process is a three-phase process (Fig. 5):
• Phase 1 (signal extraction): a vibration signal is extracted simultaneously from each sensor placed on the bearing.
• Phase 2 (feature extraction):
  • Step 1: once the signals are received, the extraction phase starts; all the features are calculated concurrently, and separately for the signal of each sensor.
  • Step 2: as soon as all the features are calculated for both sensor signals, a matrix of all the features from both signals is formed and the KPCA is computed. The first three principal components are selected since they contain most of the information.
• Phase 3 (classification and decision making):
  • Step 1: the classification by D-DBSCAN starts by assigning every arriving pattern to its class; meanwhile, the class parameters are updated and stored.
  • Step 2 (decision making): if no class evolution is detected, the bearing health state is considered stable. Once a new class is created, the monitoring process starts, since the birth of the defect is always associated with the creation of a new class. If any evolution is detected, all the parameters are monitored to observe the behavior of the defective bearing (an increase in the class's surface together with a decrease in its density indicates the growth of the defect surface). Once the class's surface and density start to stabilize, the bearing is in its final stage and a maintenance session should be scheduled.
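The decision-making step can be sketched as a small rule that maps the class evolution to a health state. The text does not give a numerical criterion for "starting to stabilize"; the sketch assumes that a parameter has stabilized when its last few values vary by less than a relative tolerance. The function name, the state labels, the window length, and the tolerance are all hypothetical.

```python
def health_state(n_classes, surface_history, density_history,
                 tolerance=0.05, window=3):
    """Map the class evolution to a coarse bearing health state.

    ASSUMPTION: a history has "stabilized" when its last `window` values
    differ by at most `tolerance` relative to their largest magnitude.
    """
    if n_classes <= 1:
        return "stable"  # no new class yet: no defect detected

    def stabilized(history):
        recent = history[-window:]
        if len(recent) < window:
            return False
        scale = max(abs(v) for v in recent)
        return scale == 0 or (max(recent) - min(recent)) / scale <= tolerance

    if stabilized(surface_history) and stabilized(density_history):
        return "final stage"  # schedule a maintenance session
    return "degrading"  # defect present and still growing
```

In this sketch the histories are those of the most recently created class, whose surface and density are stored at every update during the classification step.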

Results and discussion
During the 120 h of the trial, 138 vibration signals were extracted (79 for each sensor). Ten traditional fault features were computed for each sensor's signal in both the frequency and time domains, and the spectral kurtosis and the two wavelet features were calculated as well. Once the features were ready, the data processing started with the application of the KPCA (Fig. 6).
A visual comparison of the KPCA results with those of PCA shows that KPCA can indeed offer a better separation. Figure 6 shows that KPCA forms clusters with different densities, while PCA results in one big undefined cluster.
The D-DBSCAN parameters were chosen according to the test bench specifications and the nature of the vibration signals: Eps was set to 0.2, MinPts to 10 (one signal is represented by 10 patterns), th_f to 20, and th_d to 100. Figure 7 shows different stages of the classification. In the final stage, three classes are found. The green one is dense, its surface is rather small compared with the other two, and it represents a healthy bearing. Once the fault is born, a new class is born with it, and the density and surface of the class change along with the fault. In the final stage of the bearing's life, a third class is created, and the class's surface and density drop, announcing the final stage of both the class evolution and the bearing's life.
Another parameter, the speed, can be calculated to quantify the pace at which the class is evolving. This parameter can be compared to the speed at which the bearing deteriorates, which may come in handy for bearing prognostics.
The D-DBSCAN clustering method allows real-time dynamic monitoring of the bearing condition and assesses the severity of the defect. It also enables the automation of the monitoring indicators, which will facilitate its integration into a monitoring system similar to an expert system.

Conclusion
A new diagnosis and monitoring method was proposed in this paper. The method ensures diagnosis accuracy by using both traditional and original features combined with an online dynamic classification by D-DBSCAN, an unsupervised classification method specially developed for monitoring bearings through their vibration signals. This classification method provided us with two new parameters that are directly related to the bearing health condition, allowing a real-time, online, and accurate assessment of the bearing.
The D-DBSCAN classification results show that it can indeed give more information than a static classification; the parameters that were calculated and updated during the classification gave a better visibility of the bearing health condition. However, this method is not yet exploited to its fullest: other parameters can be calculated, and other movements of the classes are not yet interpreted or explained by the physical changes of the bearings.
We are currently relating the evolution of the classes to the changes in the behavior of the bearings. We are also working on the integration of new defect indicators by exploiting the cyclostationary signatures of bearing vibrations. Moreover, we are investigating the possibility of using this classification method in the context of a predictive maintenance strategy.