Complexity based on synchrosqueezing analysis in gear diagnosis

– The Synchrosqueezing is a special case of the reassignment method and concentrates the time-frequency representation (TFR) in a scale dimension. Compared to other TFR enhancement methods, the synchrosqueezing oﬀers better adaptability, less deformation for IF proﬁle and an exact reconstruction formula for constituent components. This paper deals with the investigation of descriptors based on the combination of the synchrosqueezing transform (SST) and Lempel-Ziv complexity methods. This last one transforms the analyzed signal into a data sequence. In the ﬁrst part, the vibration signal components are extracted by using the synchrosqueezing transform and the reconstruction method. Afterward, the Lempel-Ziv complexity values are calculated. Since the complexity values are not dependent on the magnitude of the measured signal, the proposed method is less sensitive to the data sets measured under diﬀerent data acquisition conditions. This approach is applied for monitoring and diagnosing the defects during a fatigue test on a ﬁrst gear reducer and also when varying the load on a second gear reducer by using the recorded vibration signals. It can also provide a new way for feature extraction and recognition of gear system faults.


Introduction
Gear vibration signals are commonly modelled as a sum of AM/FM components. The meshing frequency and its harmonic components are modulated by the shaft rotational frequency and its harmonics for a faulty gearbox. This is often identified via detecting the presence of the sidebands [1]. As the understanding of the theory advanced, amplitude and frequency-modulation (AM-FM) decompositions have been applied in a large range of problems: they are usually analyzed by the short-time Fourier transform (STFT) [2] or the continuous Wavelet transform (CWT) [3], while most quadratic methods can be seen as variations of the Wigner-Ville distribution (WVD), with squared STFTs (spectrograms) and WTs (scalograms) as special cases. It is well-known that both transforms for such signals draw strips in the timefrequency (TF) or time-scale (TS) plane, centered at the ridges corresponding to the instantaneous frequencies [4]. Kodera et al. [5] proposed the Modified Moving Window Method in the STFT. Their approach takes into account the phase information to enhance the dispersion of the STFT magnitude. In addition, the Wigner Distributions can be adapted to have a perfect localization of the signal a Corresponding author: marc.thomas@etsmtl.ca with specific frequency modulations. However, some limitations in using these techniques were sometimes encountered. As an example, resolution problems in both time and frequency depending on the frequency range when trying to locate sharp peaks and low frequency features. To avoid the difficulties encountered by implementing these two methods, Auger and Flandrin [6,7] presented an efficient method for computing the times and frequencies for the reassigned spectrogram without explicitly computing the partial derivatives of phases. Hence, the reassignment is substituted by a combination of STFTs with appropriate windows. The Synchrosqueezing transform, introduced in references [8][9][10], is a special case of the reassignment method that purposes to refine a timescale representation, while remaining invertible. Its purpose is similar to that of reassignment with the further improvement of allowing for reconstruction. Relatively to mechanical fault diagnosing, the feature of two sideband frequencies close to the fundamental frequency of rotor fault signal or the feature of the AM-FM gearbox fault signal cannot be directly extracted from the blurred time-frequency representation (TFR). This is caused by the occurrence of noise and transients from faulty machine components. This can be addressed by considering the extraction of meaningful components from the mixed signals. The other way is to keep the signals as they evolve without decom-posing them into different components and instead make use of the TF analysis provided by the SST method. To analyse the TF features of the stator current signal, Pu et al. [11] performed the SST method in order to detect and diagnose the rotor faults. Hazra et al. [12] applied the SST for monitoring and diagnosing the gearbox faults, where Chuan et al. [13] brought another alternative to provide the TFR by implementing the generalised SST to detect gear faults. All of them studied the instantaneous frequency profile with concentrated TFR offered by the SST method. As announced in Daubechies et al. [8], the SST method can be used as a method to decompose a vibration signal like the EMD method [14,15]; we propose to benefit from this advantage to diagnose gear faults. This paper is composed in two sections. The first part gives a thorough discussion of the TFRs, and their properties, providing all the information that is needed to understand and apply them effectively then the procedure to use the SST method like EMD to decompose a multicomponent signal are exposed. In addition, the complexity approach is more detailed and made in a comprehensive procedure to calculate it. The second part deals with the performance of the proposed method on the experimental gear vibration signals.

Synchrosqueezing based Wavelet transform
A multicomponent signal can be considered as expressed by: x m (t) + r(t), with x m (t) = a m (t)e iφm(t) (1) for some finite M , where a m (t) > 0 is a continuously differentiable function, φ m (t) is a two times continuously differentiable function satisfying φ m (t) > 0 and φ m+1 (t) > φ m (t) for all t. r(t) is a residual signal, which represents noise or measurement error.
Synchrosqueezing as presented by Daubechies et al. [8], is a nonlinear operator that sharpens the timefrequency plot of a signal's continuous wavelet transform and enhances the TFR in a manner similar to reassignment method [9], but it still enables mode retrieval as in the empirical mode decomposition (EMD) [14,15]. So the CWT of the analytic signal x(t) is given by: where a is the scale, b the time offset, and ψ a properly chosen analytical mother wavelet that is concentrated on the positive frequency axis. Then, the instantaneous frequency ω x (a, b) assuming W x (a, b) = 0 can be derived as: During the frequency mapping, (a, b) → (ω x (a, b), b) [8,16], the synchrosqueezing is applied to reassign the time-scale representation to the TF plane as: Finally, the instantaneous angular frequency is normalized by 2π as IF f = ω 2π . To reconstruct the signal, two ways are principally used, direct and ridges reconstruction [16,17]. It is important to know that ridge reconstruction appears to be more robust to interference and noise, while direct reconstruction performs better in the case of considerable amplitude/frequency modulation [18]. This is quite understandable, since, as we saw, the AM/FM components can be represented as a sum of tones with particular amplitude, phase and frequency relationships, so the amplitude and frequency modulation can be viewed as arising due to interference between these AM/FM induced tones.

Simulated signal (principle of gearbox diagnosis)
When a defect occurs in a gearbox, the meshing frequency (and its harmonics) will be modulated by the fault shaft's rotation frequency or its harmonics. The vibration signal of the faulty gearbox can thus be expressed as [19]: where M is the number of the vibration harmonics, A m is the amplitude of the mth meshing harmonic, B m is the modulus of the mth amplitude modulation, f s is the rotational frequency, Z is the number of teeth, and φ(t) is the frequency modulation function. r(t) is a residual signal that presents the noise or measurement error. A simulated gearbox signal is analyzed as follows to test the performance of the proposed method. According to Equation (5), the simulated signal of the fault gear has been chosen similar with three harmonics: where A 1 = 1. The raw signal x(t) is sampled at 4096 Hz for 1 s. The signal composed by the three components and the noise are plotted in Figure 1 with their respective spectrums. The three original components are compared to those retrieved by the SST method (see Fig. 2). The zoom is applied on features to see the effectiveness of the method to reconstruct exactly the same components.

Real gear vibration signal
To improve the efficiency of the SST method to extract the exact components from real signals, a gear vibration signal obtained during a fatigue test performed by CETIM (Senlis, Fr) [15,[20][21][22] is decomposed by the method. Figure 3 shows the real signal x(t), its first four retrieved modes (RM s) and their respective spectrums. As we can see, the frequency spectrum of each component RM i is also found on the signal spectrum. For instance, the first component RM 1 reports the first harmonic of the meshing frequency f e = 340 Hz with the sideband frequencies.

Complexity analysis
The complexity analysis is based on the Lempel-Ziv (LZ) definition [14,23,24]. Before calculating the LZ complexity measure C(n), the signal must be transformed into a finite symbol sequence. This approach transforms the analysed signal into a data sequence. In the context of a gear vibration signal with a known mean value, typically the discrete-time signal x(n) is converted into a new binary sequence S(n). This new sequence is rebuilt by comparing the value of each sample of the previous sequence x(n) within the mean value. If the value of the sample is larger, it is set to one, otherwise to zero. In order to compute LZ complexity, the sequence S(n) is scanned from left to right and the complexity counter is increased by one unit every time a new subsequence of consecutive characters is encountered. Thus, the Lempel-Ziv complexity reflects the number of all different subsequences contained in the original sequence. sake, normalized complexity C(n) is often used to obtain a measure independent of the sequence length. As an example, we consider a real gearbox vibration signal constituted by 10 000 samples and the rate sampling is 20 kHz. The sampled signal in Figure 5a represents the sequence x(n). To obtain a new binary sequence S(n), each sample of x(n) is compared to the mean of x(n). If the value of the sample is larger, it is set to one, otherwise to zero. The new sequence S(n) is shown in Figure 5b. The LZ complexity reflects the number of all different subsequences contained in this new sequence. Following the steps explained above in the diagram (Fig. 4), 296 subsequences were obtained and the normalized complexity value for this example can be calculated by: C(n) = c(n) N/ log 2 (N ) = 296 10 000/ log 2 (10 000) = 0.3933 (7) As mentioned in references [23,24] the c(n) calculation can be viewed as independent of the data sequence length if the data length considered is greater than N points. To confirm this assumption, two real gear vibration signals are studied to examine the effect of the number of data samples. The rate sampling of the first signal is 20 kHz [15,[20][21][22], and 24 kHz for the second [25]. The normalised LZ complexity of both signals was computed following the procedure shown in Figure 4. As revealed in Figure 6, the normalized complexity remains invariable since the data length N > 6000.

Experimental tests
In this part we propose an approach based on the SST method and the Normalized LZ Complexity. Two different bases of gear vibration signals are used to corroborate this focus.

Application 1: fatigue test
SST and complexity methods were performed for monitoring and diagnosing the defects during a fatigue test performed by CETIM (Senlis, Fr) on a gear reducer by using the recorded vibration signals [15,[20][21][22]. The number of teeth is respectively 20 teeth on the first wheel and 21 teeth on the second wheel. The rotation speed of the input shaft of the reducer is 1000 rpm which gives rotation frequencies of f 1 = 16.67 Hz and f 2 = 15.87 Hz on the output shaft. The meshing frequency is f e = 333.33 Hz. The experiment was carried out for a period of 12 days. During experimentation, the testing gear was drived from a well operating state to a deteriorated gear state. Every  Chipping across the full width of the tooth day, the test bench is stopped after recording the vibration signals in order to correlate the gear teeth state with the measured signals (Tab. 1). G ear (1/2) chipping was observed at the 6th day and gear (15/16) spalling was observed since the 8th day. On the last day, the damage was in an advanced stage close to the breakage of two teeth. The data recording was set-up with the following parameters: the length of records was 60 000 samples and the sampling frequency is 20 kHz. In this study, the proposed method is performed on 32 768 samples. The meshing signal is principally caused by shocks between teeth of wheels that compose the reducer. The meshing signal is modulated in amplitude and frequency by the signals emitted from the two wheels whose frequencies are respectively f 1 and f 2 .

Statistical study
Traditionnaly, time statistical descriptors used in industry are RMS, Peak to Peak, Crest Factor and Kurtosis [26]. In order to investigate their performance, RMS and kurtosis of vibration signals were computed in order to follow gear fault development for the 12 days of operation. It may be observed in Figure 7 that Kurtosis becomes sensitive only the 10th day of operation and that RMS is not very sensitive until the 11th day.

Proposed method
To enhance the damage detection, this part proposes to combine two methods: the SST and the normalized LZ complexity. The procedure follows these steps: 1. Perform the SST method to obtain the retrieved modes (RM ) for the 12 days signals. 2. Obtain the energy and Kurtosis in each RM and find the α best modes for each day signal by maximizing energy and kurtosis. 3. Compute normalized complexity of each day signal C(x(n)) and each RM, C(RM i (n)). 4. Calculate the kurtosis for each day signal Ku (x(n)) and each extracted mode Ku(RM i (n)). 5. Finally, Equation (8) gives the complexity-Kurtosis ratio (indicator) for each day signal x(n).
Ku(x(n)) ⎞ ⎟ ⎟ ⎠ (8) To select α best modes (step 2), the energy level is used to obtain the RM with maximum energy [24]. The energy contained in all RMs is calculated by: where N represents the number of all RMs, and E RMi (n) represents the amount of energy contained in the ith RM which is defined as: The RM s that contain relatively the highest amount of energy are selected as the representative RMs since faultrelated RMs contain a higher amount of energy than the rest of the RMs [27]. However, the only use of energy level is not a good criterion to select the interest RM because a higher energy level is also related with higher retrieved modes of the SST. As known mechanical faults like gear faults produce impulses in the measured faulty signals. The energy level does not always reflect this nature while it is calculated based on the amplitude of the RM instead of the impulsiveness of a signal [24]. To overcome this shortcoming in this study, the energy and the kurtosis criteria are jointly used for modes selection since the kurtosis is a good indicator of impulsiveness. Thus, the best RM that reflects the signal nature of the high energy and the impulsiveness should be selected. The best RMs can be found by: The complexity-Kurtosis ratio is applied for the 12 days signals. Figure 8 shows the evolution of this descriptor established for the vibration signals to detect the damage evolution. If we accept a normal variation of the indicator between 0.36 and 0.43 with a mean at 0.39, the results show that the descriptor level starts to decrease with the default evolution since the 8th day and a more sensitive degradation after due to a greater damage at the 11th and the 12th day. When a defect appears and becomes stronger, more impacts would be generated when the defected tooth touches the others, leading to a longer period of defect-related vibration. More frequency components will be contained within the frequency spectrum then the complexity and kurtosis values change. As a result, the Complexity-Kurtosis ratio will decrease.

Application 2: dependence on load
To confirm the assessment, the proposed approach is applied to gear vibration signals of the experimental test of Peter Rig (University of New Wouth Wales, Kensington, Australia and LASPI Senlis, Fr [25]), see Figure 9. the driven wheel. Different sensors were used to record the data signals on the gear reducer; 7 accelerometers and 1 tachymeter. The data length for vibration signals is 100 000 samples and 24 kHz as the sampling frequency. In this study the measurement of three sensors (s1, s2 and s7) is considered with three load values; 25, 50 and 60 Nm and 10 Hz as the rotating frequency. Thus, three measurements are recorded for each sensor to evaluate the effect of load on the gear fault signals and the data length used to compute the normalized complexity is 32 768 samples.

Statistical study
The time indicators: RMS and Kurtosis are computed for each case. Figure 10 illustrates the evolution of the two descriptors. The RMS descriptor rises gradually and falls under different load values at 600 rpm rotational speed. Contrary to the RMS, the Kurtosis drops then rises considerably with the increase of the load values. Since, the evolutions of these descriptors are not enough sensitive under loads changes.

The proposed method
Following the five steps of the proposed method below (Sect. 3.1.2), the same procedure is applied. For each sensor three measures was recorded; 1. Perform the SST method for each record to obtain the RM s. 2. Maximize the energy and the Kurtosis for each RM and find the α best modes. 3. Compute normalized complexity of record signal C(x(n)) and each RM, C(RM i (n)).

Calculate the Kurtosis for each record signal Ku(x(n))
and each retrieved mode Ku(RM i (n)). 5. Finally, compute the Indicator values like Equation (8) do.
Three features are obtained in Figure 11 respectively for each sensor (1, 2 and 7). The proposed indicator reveals an amplitude decrease matching the load values increase. The indicator seems also more sensitive to the load changes. When the load increases, the impact of the defected tooth touching the others teeth gain in amplitude, influencing the frequency components of the signal and more sideband frequencies will be added to the other frequency components. So, the normalized complexity of the recorded signal changes in value. In the other hand, the impulsiveness will be more intensive and the kurtosis value increases. Finally, the indicator ratio swings and reacts with falling gradually.

Conclusion
This study presents an investigation of the Synchrosqueezing method and the Normalized Complexity for monitoring the gear fault evolution. First, numerical signals were studied to perform the effectiveness of the SST method to decompose a gear simulated signal. The robustness of the method towards noise was established by the perfect correlation between the retrieved and the original components. Secondly, this method is applied to vibration data obtained from two experimental tests; fatigue test conducted by CETIM (France) during 12 days and Peter Rig conducted by the UNWK (Australia). The Lempel-Ziv complexity, a non-dimensional index ranging roughly between 0 and 1, is jointly considered with the Kurtosis (indicator of impulsiveness). A new descriptor based on the complexity-Kurtosis ratio was developed and performed on the retrieved modes signals obtained by the SST method and adopted as a fault severity measurement. Regarding the test fatigue, standards methods such as RMS and Kurtosis were sensitive to the defect only from the 10th day, while the proposed descriptor appears to be sensitive since the beginning of the spalling from the 8th day. In the other hand, when applied on vibration signals recorded from the Peter rig, this descriptor is revealed to be sensitive to the variation of load. The results indicate that the fault severity can be assessed reasonably well by the proposed method.