Mechanics & Industry
Volume 20, Number 1, 2019
Article Number: 106
Number of pages: 14
DOI: https://doi.org/10.1051/meca/2018038
Published online: 08 April 2019
Regular Article
A method based on Dempster-Shafer theory and support vector regression-particle filter for remaining useful life prediction of crusher roller sleeve
^{1} School of Mechanical and Electrical Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, PR China
^{2} Faculty of Foreign Studies, Jiangxi University of Science and Technology, Ganzhou 341000, PR China
^{*} e-mail: 1834575793@qq.com
Received: 13 April 2017
Accepted: 30 September 2018
In order to accurately predict the remaining useful life (RUL) of a crusher roller sleeve under a partially observable and nonlinear, nonstationary running state, a new RUL prediction method based on Dempster-Shafer (DS) data fusion and support vector regression-particle filter (SVR-PF) is proposed. First, correlation analysis is adopted to select features of the temperature and vibration signals, and wavelet analysis is subsequently used to denoise the selected features. Lastly, the prediction performance of the proposed method, which fuses the temperature and vibration signal sources to predict the RUL, is compared with that of single-source and other prediction methods. The experimental results indicate that the proposed method is capable of fusing different data sources to predict the RUL, and that the prediction accuracy can be improved when few data are available.
Key words: DS theory / data fusion / RUL prediction / support vector regression / particle filter
© AFM, EDP Sciences 2019
1 Introduction
The roller sleeve is an important component widely used in crushers, and its running performance directly affects the health of the whole machine [1]. However, due to complex working loads, dust and other harsh working conditions, the service life of a crusher roller sleeve is not long; the accurate life of the high-speed spindle in various kinds of crushers is only a few thousand hours. Once the working hours exceed the service-life limit, the operating precision of the roller drops sharply, to the point where the machine can no longer work properly. It is therefore very important to improve the reliability, safety and working efficiency of the crusher roller sleeve by means of prognostics and health management (PHM). Predicting the RUL of equipment and evaluating the performance of devices is an important part of PHM [2].
It is critical to establish an appropriate model for the life prediction process. Condition-based monitoring is becoming increasingly significant, especially for RUL prediction. Online monitoring of the vibration signal is one of the most effective ways to monitor the state of health (SOH) of a crusher roller sleeve [3]. RUL prediction of the crusher roller sleeve based on vibration monitoring data consists of two steps: first, construct an indicator that accurately assesses the performance degradation of the roller sleeve; then, establish an effective model to predict its RUL.
How to establish an appropriate model under a partially observable state is the key to predicting the RUL accurately, and it is also an urgent demand of industrial production. SVR-PF is a machine learning algorithm capable of classification and prediction with small samples [4]. This method, grounded in statistical learning theory, has been successfully applied to prediction in financial, electric-power and other systems [5,6].
In this paper, a method using acceleration and temperature data is first proposed to overcome the low RUL prediction precision obtained from a single data source. However, noise and vibration interference caused by other mechanical systems may severely obscure the roller sleeve signal collected from the sensors and make it very challenging to reliably detect the effective components. For this reason, researchers have proposed a variety of signal analysis methods, such as time-domain, frequency-domain and time-frequency techniques; wavelet analysis is one widely accepted approach. The proposed method then selects the sensitive features of the two signals as input and constructs the SVR-PF model to address prediction with limited state data. The method is evaluated using experimental data. Finally, after assessing the prediction errors, the conclusion is given in Table 5.
2 Theory introduction of DS data fusion and SVR-PF
2.1 Theory introduction of DS data fusion
DS theory fuses data from different sources through the basic probability assignment (BPA) function, and analyses the belief of all the possible propositions in the identification framework, so as to achieve the goal of data fusion [7–10].
Suppose the identification framework contains the evidence B and C, m_{1} and m_{2} are two BPA functions on the identification framework, and m_{1,2} is the fused BPA function. DS data fusion can then be expressed as:$$m_{1,2}(\varnothing)=0$$(1) $$m_{1,2}(A)=\frac{1}{1-K}\sum_{B\cap C=A}m_{1}(B)m_{2}(C)$$(2)where K is the degree of conflict between the evidence B and C:$$K=\sum_{B\cap C=\varnothing}m_{1}(B)m_{2}(C).$$(3)
DS data fusion theory combines the consistent views of different sources on the same problem and discounts the conflicting views, so that a more reliable fused posterior BPA function can be obtained.
The RUL prediction of crusher roll sleeve based on DS data fusion has two data sources: (1) the RUL prediction based on temperature data; (2) the RUL prediction based on acceleration data. In this paper, the proposed prediction method based on DS data fusion and SVRPF fuses the results of two prediction methods to gain the fused RUL prediction result.
The whole identification framework is set as Ω:$$\Omega=\{T,a\}.$$(4)
Because there is no intersection between T and a, prediction by acceleration data and prediction by temperature data are independent events; the power set can then be expressed as 2^{Ω}:$$2^{\Omega}=\{\varnothing,\{T\},\{a\},\{T\cup a\}\}.$$(5)
The meaning of all the propositions in the power set 2^{Ω} is explained as follows.

{T} represents the RUL prediction credibility obtained by temperature data;

{a} represents the RUL prediction credibility obtained by acceleration data;

{T ∪ a } represents the RUL prediction credibility obtained by acceleration or temperature data.
Meanwhile, the BPA functions m_{1} and m_{2} defined in the power set 2^{Ω} mean that:

m_{1} represents the prediction credibility distribution obtained by temperature data in the power set 2^{Ω};

m_{2} represents the prediction credibility distribution obtained by acceleration data in the power set 2^{Ω}.
The combination of the BPA function based on data fusion is shown in Table 1.
From Table 1, the posterior BPA function based on data fusion can be expressed as:$$m(T)=\frac{1}{1-K}\sum_{B\cap C=T}m_{1}(B)m_{2}(C)=\frac{b}{1-K}$$(6) $$m(a)=\frac{1}{1-K}\sum_{B\cap C=a}m_{1}(B)m_{2}(C)=\frac{c}{1-K}$$(7)where$$b=m_{1}(T)m_{2}(T)+m_{1}(T\cup a)m_{2}(T)+m_{1}(T)m_{2}(T\cup a)$$(8) $$c=m_{1}(a)m_{2}(a)+m_{1}(T\cup a)m_{2}(a)+m_{1}(a)m_{2}(T\cup a)$$(9) $$K=\sum_{B\cap C=\varnothing}m_{1}(B)m_{2}(C)=m_{1}(a)m_{2}(T)+m_{1}(T)m_{2}(a).$$(10)
This is the proposed RUL prediction method for the crusher roller sleeve: it uses the posterior fused BPA function to combine the results of the two prediction methods and obtain a more accurate prediction result.
Table 1. BPA function combination based on DS data fusion.
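As a concrete illustration, the combination rule of equations (6)–(10) for the two-hypothesis frame {T, a} can be sketched in a few lines. The BPA values below are illustrative placeholders, not measured credibilities.

```python
# A minimal sketch of equations (6)-(10): Dempster's rule of combination for
# the frame {T, a}. The key 'Ta' stands for the compound proposition T ∪ a.

def ds_fuse(m1, m2):
    """Fuse two BPA dicts with keys 'T', 'a', 'Ta'."""
    # Equation (10): conflict mass K = m1(a)m2(T) + m1(T)m2(a)
    K = m1['a'] * m2['T'] + m1['T'] * m2['a']
    # Equation (8): total mass agreeing on {T}
    b = m1['T'] * m2['T'] + m1['Ta'] * m2['T'] + m1['T'] * m2['Ta']
    # Equation (9): total mass agreeing on {a}
    c = m1['a'] * m2['a'] + m1['Ta'] * m2['a'] + m1['a'] * m2['Ta']
    # Equations (6)-(7): normalise by the non-conflicting mass 1 - K
    return {'T': b / (1 - K), 'a': c / (1 - K)}

m1 = {'T': 0.6, 'a': 0.0, 'Ta': 0.4}   # temperature-based credibility (assumed)
m2 = {'T': 0.0, 'a': 0.7, 'Ta': 0.3}   # acceleration-based credibility (assumed)
fused = ds_fuse(m1, m2)
```

Because each source assigns zero mass to the other source's hypothesis, the conflict K reduces to m1(T)·m2(a), exactly as in equation (49) later in the paper.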
2.2 Basic theory of SVR-PF
2.2.1 Basic theory of particle filter
On the basis of recursive Bayesian estimation [11–13], the particle filter is a universal algorithm that draws samples from posterior distributions and assigns weights to all the particles using the Monte Carlo method [13–16].
The particle filter performs better on nonlinear and non-Gaussian systems than the Kalman filter, which only performs well on linear, Gaussian systems [17].
The particle filter state-space model can be described as:$$\begin{cases}x_{k}=f(x_{k-1},v_{k-1})\\ z_{k}=h(x_{k},n_{k})\end{cases}$$(11)where x_{k} is the system state, z_{k} is either the system output or the measurement, v_{k−1} is the system noise, and n_{k} is the measurement noise.
We assume the prior distribution $p(x_{0:k-1}^{i}|z_{1:k-1})$ of the system is known and that N samples have been drawn from the posterior distribution of system (11). The posterior distribution can be approximated as:$$p(x_{0:k}|z_{1:k})\approx\sum_{i=1}^{N}w_{k}^{i}\delta(x_{0:k}-x_{0:k}^{i})$$(12)where $\{x_{k}^{i}\}$ are the samples and $\{w_{k}^{i}\}$ are the sample weights, which satisfy $\sum_{i=1}^{N}w_{k}^{i}=1$. The higher the weight, the higher the sample probability. δ(⋅) denotes the Dirac delta function.
Because it is very difficult to sample directly from the posterior distribution, a common remedy is the importance sampling technique, which draws samples from an importance distribution instead. The importance distribution can be described as:$$q(x_{0:k}|z_{1:k})\approx\sum_{i=1}^{N}\delta(x_{0:k}-x_{0:k}^{i}).$$(13)Plugging the importance distribution (13) into (12), the weight can be updated:$$w_{k}^{i}=\frac{p(z_{k}|x_{k}^{i})p(x_{k}^{i}|x_{k-1}^{i})p(x_{0:k-1}^{i}|z_{1:k-1})}{q(x_{k}^{i}|x_{0:k-1}^{i},z_{1:k})q(x_{0:k-1}^{i}|z_{1:k-1})}=w_{k-1}^{i}\frac{p(z_{k}|x_{k}^{i})p(x_{k}^{i}|x_{k-1}^{i})}{q(x_{k}^{i}|x_{0:k-1}^{i},z_{1:k})}$$(14)where $p(z_{k}|x_{k}^{i})$ is the likelihood function and $p(x_{k}^{i}|x_{k-1}^{i})$ is the state transfer distribution. If system (11) is Markovian, the weight update equation (14) reduces to:$$w_{k}^{i}=w_{k-1}^{i}\frac{p(z_{k}|x_{k}^{i})p(x_{k}^{i}|x_{k-1}^{i})}{q(x_{k}^{i}|x_{k-1}^{i},z_{k})}.$$(15)
We set the state transfer distribution as the importance distribution:$$q(x_{k}^{i}|x_{k-1}^{i},z_{k})=p(x_{k}^{i}|x_{k-1}^{i}).$$(16)
If the likelihood function $p(z_{k}|x_{k}^{i})$ and the prior weights are used to update the new weights [15], the weight update equation reduces to equation (17):$$w_{k}^{i}=w_{k-1}^{i}p(z_{k}|x_{k}^{i}).$$(17)
A well-known problem of the PF is the degeneracy phenomenon, and resampling is a suitable method to avoid it. If the system iterates without resampling, the weights of most particles tend to zero, and all the effort spent on weight calculation becomes meaningless.
The standard method to avoid the degeneracy phenomenon is to renormalize the distribution by removing small-weight particles and duplicating large-weight particles; the weights of all particles are then reset to 1/N (N is the number of particles). Resampling in the standard PF is triggered by the effective sample size:$$N_{eff}=\frac{N}{1+\mathrm{var}(w_{k}^{i})}\approx\frac{1}{\sum_{i=1}^{N}(w_{k}^{i})^{2}}$$(18)where N_{eff} is compared with the resampling threshold.
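The recursion of equations (11)–(18) can be sketched as a minimal bootstrap particle filter. The state model (a 1-D random walk), the noise levels and the measurements below are illustrative assumptions, not the paper's crusher data.

```python
import math
import random

# A minimal sketch of equations (11)-(18): bootstrap particle filter with
# multinomial resampling for a 1-D random-walk state observed in Gaussian noise.

def particle_filter_step(particles, weights, z, sig_v=0.1, sig_n=0.2):
    # State equation x_k = f(x_{k-1}, v_{k-1}): a random walk, used as the
    # importance distribution as in equation (16).
    particles = [x + random.gauss(0.0, sig_v) for x in particles]
    # Equation (17): weight update by the Gaussian likelihood p(z_k | x_k^i)
    weights = [w * math.exp(-(z - x) ** 2 / (2 * sig_n ** 2))
               for w, x in zip(weights, particles)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Equation (18): effective sample size; resample when it drops below N/2
    n_eff = 1.0 / sum(w * w for w in weights)
    if n_eff < len(particles) / 2:
        particles = random.choices(particles, weights, k=len(particles))
        weights = [1.0 / len(particles)] * len(particles)
    return particles, weights

random.seed(0)
N = 200
particles = [random.gauss(0.0, 1.0) for _ in range(N)]
weights = [1.0 / N] * N
for z in [0.1, 0.2, 0.3]:                 # synthetic measurements
    particles, weights = particle_filter_step(particles, weights, z)
estimate = sum(w * x for w, x in zip(weights, particles))
```

The multinomial resampling above restores uniform weights but duplicates heavy particles, which is precisely the diversity loss motivating the SVR resampling of Section 2.2.2.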
2.2.2 Support vector regression-particle filter
The standard PF algorithm eliminates small-weight particles and duplicates large-weight particles to avoid the degeneracy phenomenon, but this causes a loss of particle diversity: most particles aggregate around the larger-weighted ones, so degeneracy effectively persists. In view of this problem, a new resampling algorithm based on SVR is introduced to rebuild the posterior distribution [18]. SVR has an extremely fast learning speed, advantageous generalization capability, and a commendable performance in both classification and regression with a simple structure. Compared with other methods, SVR can avoid the degeneracy phenomenon and keep the diversity of particles in the case of limited samples; moreover, its training is much faster while achieving better generalization. In view of these advantages, SVR is selected in this paper to establish the RUL prediction model. The application of SVR is detailed in some studies [19,20].
The fundamental principle of SVR is an optimization problem expressed by a regularized functional with constraints [21], of the form:$$\begin{cases}\Omega=(f,f)_{H}\\ s.t.\ \sup_{x}|F(x)-F_{l}(x)|=\sup_{x}\left|F_{l}(x)-\int_{-\infty}^{x}f(t)dt\right|=\sigma_{l}<\epsilon\end{cases}$$(19)where Ω = (f, f)_{H} is the regularized functional defined in the Hilbert space, σ_{l} is the error between the distribution function F(x) and its estimate F_{l}(x), and ϵ is the constraint bound.
Here f(x) is the probability density function (PDF) to be estimated from the empirical distribution F_{l}(x). Only the points x_{i} (i = 1, 2,…, m) in the particle set need to be considered, so equation (19) reduces to:$$\max_{i}\left|F_{l}(x)-\int_{-\infty}^{x}f(t)dt\right|_{x=x_{i}}=\sigma_{l}<\epsilon.$$(20)The PDF f(x) is described by kernel functions:$$f(x)=\sum_{i=1}^{m}\beta_{i}K(x_{i},x)$$(21)
where the kernel function K(x_{i}, x) = ϕ^{T}(x_{i})ϕ(x) satisfies Mercer's condition. The regularized functional can then be described as:$$\Omega(f)=(f,f)_{H}=\sum_{i=1}^{m}\sum_{j=1}^{m}\beta_{i}\beta_{j}K(x_{i},x_{j}).$$(22)
The posterior distribution prediction can be described as an optimization problem with constraints:$$\begin{cases}\min\ w_{p}(\beta)=\sum_{i=1}^{m}\sum_{j=1}^{m}\beta_{i}\beta_{j}K(x_{i},x_{j})\\ s.t.\ \max_{i}\left|F_{l}(x)-\sum_{j=1}^{m}\beta_{j}\int_{-\infty}^{x}K(x_{j},t)dt\right|_{x=x_{i}}=\sigma_{l}.\end{cases}$$(23)
Set $y_{i}=F_{l}(x_{i})$, $w=[\beta_{1},\beta_{2},\dots,\beta_{m}]^{T}$, $z_{j}(x)=\int_{-\infty}^{x}K(x_{j},t)dt$ and $z_{i}=(z_{i}(x_{1}),z_{i}(x_{2}),\dots,z_{i}(x_{m}))$, and let ξ_{i} and ${\xi}_{i}^{*}$ be nonnegative slack variables; equation (23) then reduces to a quadratic programming problem:$$\begin{cases}\min\ J(w,\xi_{i},\xi_{i}^{*})=\frac{1}{2}w^{T}w+C\left(\sum_{i=1}^{m}\xi_{i}+\sum_{i=1}^{m}\xi_{i}^{*}\right)\\ s.t.\ w^{T}z_{i}-y_{i}\le\sigma_{l}+\xi_{i}\\ \quad\ y_{i}-w^{T}z_{i}\le\sigma_{l}+\xi_{i}^{*}\\ \quad\ \xi_{i},\xi_{i}^{*}\ge 0,\ i=1,2,\dots,m\end{cases}$$(24)where C is the penalty coefficient. Introducing the Lagrange coefficients a_{i}, ${a}_{i}^{*}$ into equation (24) yields the dual problem:$$\begin{cases}\max\ w(a_{i},a_{i}^{*})=-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}(a_{i}^{*}-a_{i})(a_{j}^{*}-a_{j})(z_{i}^{T}z_{j})-\sigma_{l}\sum_{i=1}^{m}(a_{i}^{*}+a_{i})+\sum_{i=1}^{m}y_{i}(a_{i}^{*}-a_{i})\\ s.t.\ \sum_{i=1}^{m}(a_{i}^{*}-a_{i})=0,\quad 0\le a_{i},a_{i}^{*}\le C,\ i=1,2,\dots,m.\end{cases}$$(25)
Now the solution of equation (25) can be described as:$$\beta_{j}=\sum_{i=1}^{m}(a_{i}^{*}-a_{i})z_{i}(x_{j}).$$(26)
In equation (26), the x_{i} corresponding to nonzero coefficients ${a}_{i}^{*}$, a_{i} are the support vectors. Substituting equation (26) into (21) transforms the posterior distribution estimation into the solution of an optimization problem.
As discussed above, the PF algorithm can be modified into a new PF algorithm by integrating SVR, which can be described as follows.
Resampling of the posterior distribution starts once the effective sample size N_{eff} falls below the threshold. The training pairs are the particles ${x}_{k}^{i}$ and the corresponding weights ${w}_{k}^{i}={F}_{l}({x}_{k}^{i})$, from which the resampled posterior distribution is rebuilt. The flow chart of the SVR-PF algorithm is shown in Figure 1.
In Figure 1, the rebuilt particles and weights are denoted ${\tilde{x}}_{k}^{1},\dots,{\tilde{x}}_{k}^{m}$ and ${\tilde{w}}_{k}^{1},\dots,{\tilde{w}}_{k}^{m}$.
Fig. 1. Fundamental illustration of SVR-PF.
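The resampling idea can be sketched as follows: rather than duplicating heavy particles, rebuild a smooth posterior f(x) = Σ_i β_i K(x_i, x) as in equation (21) and redraw equally weighted particles from it. As a self-contained stand-in, the coefficients β_i are set to the particle weights (a kernel-density smoothing); a full implementation would obtain β_i from the dual problem (25).

```python
import random

# Sketch of the SVR-style resampling: draw new particles from a kernel-smoothed
# posterior instead of duplicating the large-weight particles.

def rebuild_and_resample(particles, weights, n_out, bandwidth=0.3):
    def sample_one():
        # Pick a kernel centre x_i with probability w_i, then perturb with a
        # Gaussian kernel: diversity is preserved instead of duplicating x_i.
        x = random.choices(particles, weights, k=1)[0]
        return x + random.gauss(0.0, bandwidth)
    new_particles = [sample_one() for _ in range(n_out)]
    new_weights = [1.0 / n_out] * n_out
    return new_particles, new_weights

random.seed(1)
# Degenerate particle set: nearly all mass on a single particle
particles = [0.0, 1.0, 2.0, 3.0]
weights = [0.01, 0.01, 0.97, 0.01]
xs, ws = rebuild_and_resample(particles, weights, n_out=500)
```

Where standard resampling would return mostly copies of the single heavy particle, the rebuilt set contains hundreds of distinct values concentrated around it.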
3 Proposed prediction method
The proposed method mainly consists of three parts: feature construction, feature signal processing and RUL prediction; see Table 2 for details.
3.1 Feature construction
Feature signals are extracted from the original vibration and temperature signals of the crusher roller sleeve. Since the behaviour of the original signal in different stages is relatively vague, it is crucial to select a significantly sensitive feature that fully reflects the degradation of the roller sleeve. The proposed method evaluates the degradation by calculating the tendency degree between each feature and the running time, which is defined as the Karl Pearson coefficient of the feature.
The Karl Pearson coefficient uses ranks to evaluate the tendency degree of a feature; it can evaluate not only the nonlinear relationship but also the monotonicity of the features:$$R=\frac{\sum_{i=1}^{N_{s}}(x_{i}-\overline{x})(y_{i}-\overline{y})}{\sqrt{\sum_{i=1}^{N_{s}}(x_{i}-\overline{x})^{2}\sum_{i=1}^{N_{s}}(y_{i}-\overline{y})^{2}}}$$(27)where x_{i} and y_{i} are the ranks of the time t_{i} and of the ith feature, respectively, N_{s} is the length of the time sequence, and $\overline{x}$ and $\overline{y}$ are the means of x_{i} and y_{i}, respectively.
The sensitive feature is chosen as the feature with the highest tendency value, i.e., the one with the most obvious monotonic trend.
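The tendency degree of equation (27) can be sketched directly on ranked sequences; the feature values below are synthetic.

```python
# A minimal sketch of equation (27): rank-based tendency degree between a
# feature sequence and running time.

def ranks(seq):
    # Rank of each element (1 = smallest); ties are not handled here.
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    r = [0] * len(seq)
    for rank, i in enumerate(order):
        r[i] = rank + 1
    return r

def tendency(times, feature):
    x, y = ranks(times), ranks(feature)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = (sum((xi - mx) ** 2 for xi in x)
           * sum((yi - my) ** 2 for yi in y)) ** 0.5
    return num / den

t = [1, 2, 3, 4, 5]
monotone = [0.1, 0.3, 0.9, 2.7, 8.1]   # strictly increasing feature -> R = 1
noisy = [0.5, 0.2, 0.8, 0.1, 0.6]      # no clear trend -> R near 0
```

Because the coefficient is computed on ranks, the strictly increasing (even exponential) feature scores exactly 1, which is why it captures nonlinear but monotonic degradation trends.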
3.2 Feature signal processing
In the signal processing part, the original vibration and temperature signals are usually a superposition of the characteristic signal and noise, and the random disturbance component of the noise deeply affects the precision of the prediction result. It is therefore important to process the signal to obtain a more accurate prediction. Signal processing comprises the removal of outliers, the elimination of the trend term, and denoising.
In general, the detection of outliers in a signal is based on the preceding normal monitoring data. A least-squares polynomial is established to estimate the value of the observation at the next moment; the absolute difference between the estimated value and the actual datum at the current time is then compared with a given threshold. If the difference exceeds the threshold, the observation is considered an outlier; otherwise it is considered normal data.
In measurement or signal processing, let x(n − 4), x(n − 3), x(n − 2), x(n − 1) be four consecutive data points of the signal x(n) before time point n. The estimated value $\widehat{x}(n)$ of the current time can be obtained by linear extrapolation of the least-squares estimate [22–25]:$$\widehat{x}(n)=x(n-1)+\frac{1}{2}x(n-2)-\frac{1}{2}x(n-4).$$(28)
The absolute difference between $\widehat{x}(n)$ and the measured value is compared with the threshold δ, i.e.$$|\widehat{x}(n)-x(n)|\le\delta$$(29)where x(n) is the current datum and $\widehat{x}(n)$ is its estimate obtained by linear extrapolation through the least-squares estimate; δ is set from the standard deviation of the measured-data residuals. If inequality (29) holds, x(n) is a normal value; otherwise it is an outlier.
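Equations (28)–(29) can be sketched as a sliding check over the signal; the data and the threshold below are synthetic.

```python
# Sketch of equations (28)-(29): a sample is flagged as an outlier when it
# deviates from the least-squares linear extrapolation of the previous four
# samples by more than the threshold delta.

def is_outlier(x, n, delta):
    # Equation (28): linear extrapolation from the last four samples
    x_hat = x[n - 1] + 0.5 * x[n - 2] - 0.5 * x[n - 4]
    # Equation (29): normal if |x_hat(n) - x(n)| <= delta
    return abs(x_hat - x[n]) > delta

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 60.0, 7.0]
flags = [is_outlier(signal, n, delta=2.0) for n in range(4, len(signal))]
# flags -> [False, True, True]: the spike at index 5 is caught, but it also
# corrupts the estimate for index 6; in practice a detected outlier is
# replaced by its estimate before the window advances.
```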
The trend term of a measurement signal is the frequency component whose period is longer than the sampling length of the signal; it generally results from slow changes of the time sequence in the measurement system. Besides the working frequencies of the original signal collected by the sensor, some random interference signals are present. These trends cause large errors in correlation analysis or power spectrum analysis, and may even distort the low-frequency content completely. If the measurement signal is used to predict the RUL of the roller sleeve without removing the trend term, the forecast results are directly affected, leading to inappropriate judgments and conclusions. The extraction and elimination of the trend term is therefore an important part of processing the tested data.
The original signal x(t) is uniformly sampled into a discrete time series x(n). The least-squares method is used to construct a pth-order polynomial [26–28]:$$y(t)=a_{0}+a_{1}t+a_{2}t^{2}+\cdots+a_{p}t^{p}=\sum_{k=0}^{p}a_{k}t^{k}$$(30)where p, a positive integer, is the order of the polynomial; its value is chosen from an estimate of the signal trend (if the trend is linear, choose p = 1). The polynomial trend term y(t) is then subtracted from the original signal x(t), i.e.$$\widehat{y}(t)=x(t)-y(t)$$(31)where $\widehat{y}(t)$ is the detrended signal.
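For the linear case p = 1, equations (30)–(31) can be sketched as a closed-form least-squares fit followed by subtraction; the data below are synthetic, an oscillation superimposed on the drift 0.5·t.

```python
# Sketch of equations (30)-(31) with p = 1: fit a linear trend term by least
# squares and subtract it from the signal.

def detrend_linear(x):
    n = len(x)
    t = list(range(n))
    mt, mx = sum(t) / n, sum(x) / n
    # Least-squares coefficients a1 (slope) and a0 (intercept) of y(t)
    a1 = (sum((ti - mt) * (xi - mx) for ti, xi in zip(t, x))
          / sum((ti - mt) ** 2 for ti in t))
    a0 = mx - a1 * mt
    # Equation (31): subtract the trend term y(t) = a0 + a1*t
    return [xi - (a0 + a1 * ti) for ti, xi in zip(t, x)]

raw = [0.5 * t + (1 if t % 2 == 0 else -1) for t in range(10)]
clean = detrend_linear(raw)   # residuals sum to zero by construction
```

For higher orders p the same structure applies with a polynomial least-squares fit in place of the closed-form line.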
Because of the large amount of noise, the field monitoring signal may be submerged in other vibration signals and random noise, which further degrades the online monitoring. In this paper, wavelets are used to denoise the signals. The basic idea of wavelet denoising is to decompose and reconstruct the signal: because the signal and the noise have different expressions at different wavelet scales, the spectral components generated mainly by noise are removed at each scale. The wavelet spectrum preserved in this way is essentially that of the original signal, which is finally reconstructed with the wavelet reconstruction algorithm [29–32].
The signal x(t) can be expressed as:$$x(t)=\sum_{k=-\infty}^{+\infty}c_{j,k}\varphi(t-k)+\sum_{k=-\infty}^{+\infty}\sum_{j=0}^{+\infty}d_{j,k}\psi(2^{j}t-k)$$(32)where c_{j,k} = ⟨x(t), ϕ_{j,k}(t)⟩ is the scale coefficient and d_{j,k} = ⟨x(t), ψ_{j,k}(t)⟩ is the wavelet coefficient.
In the multiscale decomposition process, x(t) is progressively decomposed from the space V_{j−1} into the two subspaces V_{j} and W_{j}. According to the two-scale equation, the projection coefficients c_{j,k} and d_{j,k} of x(t) in V_{j} and W_{j} follow a fast recursive algorithm from c_{j−1,m}:$$c_{j,k}=\sum_{m\in\mathbb{Z}}h(m-2k)c_{j-1,m}$$(33) $$d_{j,k}=\sum_{m\in\mathbb{Z}}g(m-2k)c_{j-1,m}.$$(34)
Conversely, c_{j−1,k} can be reconstructed from c_{j,k} and d_{j,k}:$$c_{j-1,k}=\sum_{m}c_{j,m}h(k-2m)+\sum_{m}d_{j,m}g(k-2m).$$(35)
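The decomposition–threshold–reconstruction cycle of equations (33)–(35) can be sketched with the Haar filters h = [1/√2, 1/√2] and g = [1/√2, −1/√2]; the signal and the threshold are illustrative.

```python
import math

# Sketch of equations (33)-(35): one level of Haar decomposition, thresholding
# of the detail coefficients d_{j,k}, and reconstruction of c_{j-1,k}.

S = 1.0 / math.sqrt(2.0)

def haar_decompose(c):
    # Equations (33)-(34): c_{j,k} and d_{j,k} from c_{j-1,m}
    approx = [S * (c[2 * k] + c[2 * k + 1]) for k in range(len(c) // 2)]
    detail = [S * (c[2 * k] - c[2 * k + 1]) for k in range(len(c) // 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    # Equation (35): interleaved inverse transform
    c = []
    for a_k, d_k in zip(approx, detail):
        c += [S * (a_k + d_k), S * (a_k - d_k)]
    return c

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.5]
a, d = haar_decompose(signal)
restored = haar_reconstruct(a, d)            # perfect reconstruction
d_denoised = [x if abs(x) > 1.0 else 0.0 for x in d]
denoised = haar_reconstruct(a, d_denoised)   # small details treated as noise
```

In practice a multi-level decomposition with a smoother wavelet and a data-driven threshold would be used; the one-level Haar case keeps the recursion (33)–(35) visible.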
3.3 RUL prediction
3.3.1 Initial state of fusion prediction
Set N as the initial time of prediction; a_{N} is the acceleration at the Nth time point, ${a}_{T,N}^{*}$ is the acceleration prediction obtained from temperature data at the Nth time point, and ${a}_{a,N}^{*}$ is the acceleration prediction obtained from acceleration data at the Nth time point.
The first step is calculating the initial value of the BPA function m_{1,i}(T). From the central limit theorem, the measurement errors of a large number of temperature data obey a normal distribution, i.e.$$T_{T,N}\sim N(\mu_{T_{T,N}},\sigma_{T}^{2})$$(36)where $\mu_{T_{T,N}}$ is the mean value of the temperature and $\sigma_{T}^{2}$ is the variance.
The experimental results show a strong linear relationship between the acceleration and temperature signals, so the estimate a_{T,N} also obeys a normal distribution, i.e.$$a_{T,N}\sim N(\mu_{a_{T},N},\sigma_{a_{T}}^{2})$$(37)where$$\mu_{a_{T},N}=\alpha_{N}\mu_{T,N}+\beta_{N}=a_{N}$$(38) $$\sigma_{a_{T}}^{2}=\alpha_{N}^{2}\sigma_{T}^{2}.$$(39)
Thus the initial value of the BPA function m_{1,i}(T) can be described as m_{1,N+1}(T):$$m_{1,N+1}(T)=\frac{1}{\sqrt{2\pi}\sigma_{T}}\exp\left[-\frac{(a_{T,N}^{*}-a_{N})^{2}}{2\sigma_{T}^{2}}\right].$$(40)
The second step is calculating the initial value of the BPA function m_{2,i}(a). From the central limit theorem, the measurement errors of a large number of acceleration data obey a normal distribution, i.e.$$a_{a,N}\sim N(\mu_{a_{a},N},\sigma_{a}^{2})$$(41)where $\mu_{a_{a},N}$ is the mean value of the acceleration and $\sigma_{a}^{2}$ is the variance, with$$\mu_{a_{a},N}=a_{N}.$$(42)
The initial value of the BPA function m_{2,i}(a) can be described as m_{2,N+1}(a):$$m_{2,N+1}(a)=\frac{1}{\sqrt{2\pi}\sigma_{a}}\exp\left[-\frac{(a_{a,N}^{*}-a_{N})^{2}}{2\sigma_{a}^{2}}\right].$$(43)
After obtaining the values of the BPA functions m_{1,N+1}(T) and m_{2,N+1}(a), we calculate the values of the other BPA functions. Because there is no correlation between the two kinds of prediction methods, it is easy to see that:$$m_{1,N+1}(a)=m_{2,N+1}(T)=0.$$(44)
According to the properties of the BPA function:$$m_{1,N+1}(T\cup a)=1-m_{1,N+1}(T)-m_{1,N+1}(a)=1-m_{1,N+1}(T)$$(45) $$m_{2,N+1}(T\cup a)=1-m_{2,N+1}(T)-m_{2,N+1}(a)=1-m_{2,N+1}(a).$$(46)
After determining all the values of the BPA functions, the third step is calculating the posterior fused BPA function m. From formulas (6) and (7):$$m_{N+1}(T)=\frac{1}{1-K_{N+1}}\sum_{B\cap C=T}m_{1,N+1}(B)m_{2,N+1}(C)=\frac{m_{1,N+1}(T)m_{2,N+1}(T\cup a)}{1-K_{N+1}}$$(47) $$m_{N+1}(a)=\frac{1}{1-K_{N+1}}\sum_{B\cap C=a}m_{1,N+1}(B)m_{2,N+1}(C)=\frac{m_{1,N+1}(T\cup a)m_{2,N+1}(a)}{1-K_{N+1}}$$(48)where$$K_{N+1}=\sum_{B\cap C=\varnothing}m_{1,N+1}(B)m_{2,N+1}(C)=m_{1,N+1}(a)m_{2,N+1}(T)+m_{1,N+1}(T)m_{2,N+1}(a)=m_{1,N+1}(T)m_{2,N+1}(a).$$(49)
With the fused posterior BPA function, the acceleration at the (N+1)th time point can be predicted. Let ${a}_{T,N+1}^{*}$ be the acceleration prediction obtained from temperature data at the (N+1)th time point, ${a}_{a,N+1}^{*}$ the prediction obtained from acceleration data, and ${a}_{N+1}^{*}$ the prediction obtained by the data fusion method. Based on formulas (47)–(49), ${a}_{N+1}^{*}$ satisfies:$$a_{N+1}^{*}=m_{N+1}(T)a_{T,N+1}^{*}+m_{N+1}(a)a_{a,N+1}^{*}.$$(50)
Thus we obtain the initial state of RUL prediction based on DS data fusion and SVR-PF:$$\text{Initial}=[m_{1,N+1}(T),m_{1,N+1}(a),m_{1,N+1}(T\cup a),m_{2,N+1}(T),m_{2,N+1}(a),m_{2,N+1}(T\cup a),m_{N+1}(T),m_{N+1}(a),a_{N+1}^{*}]^{T}.$$(51)
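The initialisation of equations (40)–(50) can be sketched numerically. All numeric values (sigmas and source predictions) are illustrative assumptions; the sigmas are chosen large enough that the Gaussian credibility in (40)/(43) stays below 1.

```python
import math

# Sketch of equations (40)-(50): computing the initial fused acceleration
# prediction from the two single-source predictions.

def gaussian_credibility(pred, actual, sigma):
    # Equations (40)/(43): BPA from the prediction error under a normal model
    return (math.exp(-(pred - actual) ** 2 / (2 * sigma ** 2))
            / (math.sqrt(2 * math.pi) * sigma))

a_N = 1.00          # measured acceleration at time N (assumed)
a_T_pred = 1.05     # prediction from the temperature channel (assumed)
a_a_pred = 0.98     # prediction from the acceleration channel (assumed)
sigma_T, sigma_a = 0.5, 0.45

m1_T = gaussian_credibility(a_T_pred, a_N, sigma_T)    # equation (40)
m2_a = gaussian_credibility(a_a_pred, a_N, sigma_a)    # equation (43)
m1_Ta = 1.0 - m1_T                                     # equation (45)
m2_Ta = 1.0 - m2_a                                     # equation (46)
K = m1_T * m2_a                                        # equation (49)
m_T = m1_T * m2_Ta / (1.0 - K)                         # equation (47)
m_a = m1_Ta * m2_a / (1.0 - K)                         # equation (48)
# Equation (50), using the same source predictions as placeholders for the
# (N+1)th values:
a_fused = m_T * a_T_pred + m_a * a_a_pred
```

The fused weights m_T and m_a sum to slightly less than 1; the residual mass stays on the compound proposition T ∪ a.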
3.3.2 Fusion prediction process
Step 1: calculate the BPA functions. Let N_{EOL} be the end of the roller sleeve working life, ${a}_{k}^{*}$ the acceleration prediction obtained by data fusion at the kth time point, ${a}_{T,k}^{*}$ the prediction obtained from the temperature data, and ${a}_{a,k}^{*}$ the prediction obtained from the acceleration data. For N + 1 ≤ k ≤ N_{EOL}, the BPA functions m_{1,k+1}(T) and m_{2,k+1}(a) are:$$m_{1,k+1}(T)=\frac{1}{\sqrt{2\pi}\sigma_{T}}\exp\left[-\frac{(a_{T,k}^{*}-a_{k}^{*})^{2}}{2\sigma_{T}^{2}}\right]$$(52) $$m_{2,k+1}(a)=\frac{1}{\sqrt{2\pi}\sigma_{a}}\exp\left[-\frac{(a_{a,k}^{*}-a_{k}^{*})^{2}}{2\sigma_{a}^{2}}\right].$$(53)
Because there is no correlation between the two prediction models, it's easy to know that:$${m}_{1,k+1}(a)={m}_{2,k+1}(T)=0.$$(54)
Based on the properties of BPA functions:$$m_{1,k+1}(T\cup a)=1-m_{1,k+1}(T)$$(55) $$m_{2,k+1}(T\cup a)=1-m_{2,k+1}(a).$$(56)
Step 2: calculate the acceleration prediction at the (k+1)th time point. From formulas (6), (7) and (52)–(56), the posterior fused BPA functions m_{k+1}(T) and m_{k+1}(a) can be described as:$$m_{k+1}(T)=\frac{1}{1-K_{k+1}}\sum_{B\cap C=T}m_{1,k+1}(B)m_{2,k+1}(C)=\frac{m_{1,k+1}(T)m_{2,k+1}(T\cup a)}{1-K_{k+1}}$$(57) $$m_{k+1}(a)=\frac{1}{1-K_{k+1}}\sum_{B\cap C=a}m_{1,k+1}(B)m_{2,k+1}(C)=\frac{m_{1,k+1}(T\cup a)m_{2,k+1}(a)}{1-K_{k+1}}.$$(58)
Thus we obtain the acceleration prediction ${a}_{k+1}^{*}$ at the (k+1)th time point through DS data fusion:$$a_{k+1}^{*}=m_{k+1}(T)a_{T,k+1}^{*}+m_{k+1}(a)a_{a,k+1}^{*}.$$(59)
Step 3: determine whether the acceleration has reached the threshold. If not, go back to step 1 and continue the prediction; otherwise, calculate the RUL prediction $\overline{L}^{*}$:$$\overline{L}^{*}=N_{EOL}-N=(k+1)-N.$$(60)
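The step 1–3 loop of equations (52)–(60) can be sketched as follows. The two source predictors are stand-ins (a fixed percentage degradation per step); the real method obtains them from SVR-PF on each channel. The threshold and sigmas are illustrative assumptions.

```python
import math

# Sketch of the fusion prediction loop, equations (52)-(60).

def fuse_step(a_T, a_a, a_prev, sigma_T=0.4, sigma_a=0.4):
    # Equations (52)-(53): BPAs from each source's deviation from the last
    # fused prediction
    m1_T = (math.exp(-(a_T - a_prev) ** 2 / (2 * sigma_T ** 2))
            / (math.sqrt(2 * math.pi) * sigma_T))
    m2_a = (math.exp(-(a_a - a_prev) ** 2 / (2 * sigma_a ** 2))
            / (math.sqrt(2 * math.pi) * sigma_a))
    K = m1_T * m2_a                           # conflict, as in equation (49)
    m_T = m1_T * (1.0 - m2_a) / (1.0 - K)     # equation (57)
    m_a = (1.0 - m1_T) * m2_a / (1.0 - K)     # equation (58)
    return m_T * a_T + m_a * a_a              # equation (59)

threshold = 3.0     # failure threshold on the acceleration (assumed)
N = 0               # initial prediction time
a_star, k = 1.0, N
while a_star < threshold and k - N < 200:
    k += 1
    a_T_pred = 1.03 * a_star   # temperature-channel predictor (assumed)
    a_a_pred = 1.05 * a_star   # acceleration-channel predictor (assumed)
    a_star = fuse_step(a_T_pred, a_a_pred, a_star)
rul = k - N                    # equation (60): RUL once the threshold is hit
```

The loop stops at the first fused prediction that crosses the failure threshold, and the remaining useful life follows as the number of steps from the prediction start.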
In summary, the proposed RUL prediction method based on DS data fusion and SVR-PF can be described by the flow chart in Figure 2.
A prediction model based on DS data fusion and SVR-PF is established as:$$\begin{cases}X_{k+1}=f(X_{k},V_{k})\\ Y_{k}=g(X_{k},N_{k})\end{cases}$$(61)where X_{k} is the prediction state and V_{k}, N_{k} are noise terms.$$X_{k}=[X_{T,k}^{T},X_{a,k}^{T},X_{DS,k}^{T}]^{T}$$(62)where ${X}_{T,k}^{T}$ is the state obtained by the analysis of the temperature data, ${X}_{a,k}^{T}$ the state obtained by the analysis of the acceleration data, and ${X}_{DS,k}^{T}$ the state obtained by the DS data fusion, with:$$X_{T,k}=[\lambda_{T,k}^{*},T_{T,k}^{*},a_{T,k}^{*}]^{T}$$(63) $$X_{a,k}=[\lambda_{T_{T,k}}^{*},T_{T,k}^{*},a_{a,k}^{*}]^{T}$$(64) $$X_{DS,k}=[m_{1,k}(T),m_{1,k}(T\cup a),m_{2,k}(a),m_{2,k}(T\cup a),m_{k}(T),m_{k}(a),a_{k}^{*}]^{T}.$$(65)
In formulas (63)–(65): λ_{T,k} is the degradation parameter of temperature in the kth time point [33]; T_{T,k}, ${\lambda}_{{T}_{T,k}}$ are the temperature degradation parameters in the kth time point [34]; * is the prediction of each corresponding variable.
Thus, in combination with the RUL prediction models of the literature [33,34], we can obtain the state equation of prediction model (61). The partial state equation of prediction by temperature data can be expressed as follows.$$\begin{cases}\lambda_{T,k+1}^{*}=\lambda_{T,k}^{*}+v_{1,k}\\ T_{T,k+1}^{*}=T_{T,k}^{*}\exp(\lambda_{T,k}^{*}\Delta k)+v_{2,k}\\ a_{T,k+1}^{*}=\alpha_{N}T_{T,k}^{*}+\beta_{N}+v_{3,k}\end{cases}$$(66)where α_{N}, β_{N} are the degradation parameter predictions of the acceleration at the Nth time point obtained through the analysis of the temperature data. The partial state equation of prediction by acceleration data can be described as follows.$$\begin{cases}\lambda_{T,k+1}^{*}=\lambda_{T,k}^{*}+v_{1,k}^{*}\\ T_{T,k+1}^{*}=T_{T,k}^{*}\exp(\lambda_{T,k}^{*}\Delta k)+v_{2,k}^{*}\\ a_{a,k+1}^{*}=T_{T,k+1}^{*}\exp(\lambda_{T,k+1}^{*})+v_{3,k}^{*}\end{cases}$$(67)
The partial state equation of prediction by data fusion can be described as follows.$$\begin{cases}m_{1,k+1}(T)=\frac{1}{\sqrt{2\pi}\,\sigma_{T}}\exp\left[-\frac{(a_{T,k}^{*}-a_{k}^{*})^{2}}{2\sigma_{T}^{2}}\right]\\ m_{2,k+1}(a)=\frac{1}{\sqrt{2\pi}\,\sigma_{a}}\exp\left[-\frac{(a_{a,k}^{*}-a_{k}^{*})^{2}}{2\sigma_{a}^{2}}\right]\\ m_{1,k+1}(T\cup a)=1-m_{1,k+1}(T)\\ m_{2,k+1}(T\cup a)=1-m_{2,k+1}(a)\\ K_{k+1}=m_{1,k+1}(T)\,m_{2,k+1}(a)\\ m_{k+1}(T)=\frac{m_{1,k+1}(T)\,m_{2,k+1}(T\cup a)}{1-K_{k+1}}\\ m_{k+1}(a)=\frac{m_{1,k+1}(T\cup a)\,m_{2,k+1}(a)}{1-K_{k+1}}\\ a_{k+1}^{*}=m_{k+1}(T)\,a_{T,k+1}^{*}+m_{k+1}(a)\,a_{a,k+1}^{*}\end{cases}$$(68)
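The first two lines of equation (68) build each source's raw BPA from a Gaussian closeness measure between that source's prediction and the last fused value. A minimal sketch follows; σ is an assumed per-source spread, and note that the Gaussian density only stays within [0, 1] (as a valid mass must) when σ ≥ 1/√(2π).

```python
import math

def gaussian_bpa(a_source, a_fused, sigma):
    """Raw BPA from the Gaussian closeness of one source's acceleration
    prediction to the previous fused acceleration (first lines of Eq. (68))."""
    return (math.exp(-(a_source - a_fused) ** 2 / (2.0 * sigma ** 2))
            / (math.sqrt(2.0 * math.pi) * sigma))
```

A source that agrees exactly with the fused value receives the maximum mass 1/(√(2π)σ); the remainder 1 − m is assigned to the ignorance set T∪a.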
Combining formulas (66)–(68) gives the state equation of prediction model (61); the measurement equation of (61) can be described as follows.$$\begin{cases}\widehat{T}_{T,k}^{*}=T_{T,k}^{*}+n_{1,k}\\ T_{a,k}=T_{T,k}^{*}+n_{1,k}^{*}\\ a_{k}^{*}=m_{k}(T)\,a_{T,k}^{*}+m_{k}(a)\,a_{a,k}^{*}\end{cases}$$(69)where ${\widehat{T}}_{T,k}$ is the temperature prediction obtained from the temperature data, and T_{a,k} is the degradation parameter prediction of the acceleration obtained from the temperature data.
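The temperature-branch transition of equation (66) can be sketched as follows (deterministic skeleton; the step Δk, the fitted parameters α_N, β_N, and the noise level are illustrative assumptions, not values from the paper):

```python
import math
import random

def step_temperature_branch(lam, T, alpha_N, beta_N, dk=1.0, sigma=0.0):
    """One step of Eq. (66): random-walk degradation rate, exponential
    temperature growth, and a linear temperature-to-acceleration map.
    sigma=0 disables the additive state noises v_{1..3,k}."""
    noise = lambda: random.gauss(0.0, sigma) if sigma > 0 else 0.0
    lam_next = lam + noise()
    T_next = T * math.exp(lam * dk) + noise()
    a_next = alpha_N * T + beta_N + noise()
    return lam_next, T_next, a_next
```

In the SVR-PF itself, each particle carries such a state and is propagated through this transition before being reweighted against the measurements of equation (69).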
Fig. 2 Flowchart of the proposed method. 
4 Experimental demonstration
4.1 Introduction to the data acquisition platform
The test platform, named PRONOSTIA [35], is shown in Figure 3. It was designed by the AS2M department of the FEMTO-ST Institute. The full-life test of the roller sleeve is carried out on this rolling-bearing data acquisition platform. The vibration signal is collected by a Dytran 3035B acceleration sensor (maximum acquisition range 50 g) and the temperature signal by a JCJ100TLB temperature sensor (maximum acquisition range 200 °C). Because degradation shows up more severely in the acceleration signal than in the temperature signal, the full-life test is stopped as soon as the acceleration amplitude exceeds 20 g: even if the roller sleeve has not yet failed, it is declared failed and the test is stopped in order to avoid damage to the test platform. The acceleration sampling frequency is 25.6 kHz, with a set of 2560 points stored every 10 s; the temperature sampling frequency is 10 Hz, with a set of 100 points stored every 10 s.
In the test, the tested roller sleeve is a 22324 tapered roller bearing. The roller life test is carried out four times, and one test roller sleeve is run to failure each time. The 1st and 2nd tests are run under a radial load of 4000 kN at a speed of 1800 rpm; the 3rd and 4th tests are run under a radial load of 4200 kN at a speed of 1650 rpm. The test results are shown in Table 3.
Due to the different working states and structures of the four roller sleeves, the experimental results differ, which conforms to actual engineering practice. The measured experimental data are then used to verify the performance of the proposed RUL prediction method. Since the four tests are carried out on the same platform and each roller sleeve is analysed in the same way, the first roller sleeve is taken as the research object below. The measured temperature and vibration data are shown in Figures 4 and 5.
Fig. 3 Overview of the data acquisition platform. 
Information about the 4 failed roller sleeves in the experiments.
Fig. 4 Original temperature data. 
Fig. 5 Original acceleration data. 
4.2 Feature construction
In order to compare the prediction performance of the proposed DS data fusion and SVRPF method with that of methods using only the acceleration data or only the temperature data when limited data are available, the first key step is to select a good feature signal.
In this paper, the features are selected by calculating the Karl Pearson correlation coefficient between the time-domain characteristics of the temperature and acceleration signals and the RUL. The features with the highest correlation coefficients are selected. As a result, the root mean square (RMS) of the vibration signal and the absolute mean value of the temperature signal are chosen. The Karl Pearson correlation coefficients are shown in Table 4. The temperature and acceleration feature signals are shown in Figures 6 and 7.
BPA function combination based on DS data fusion.
Fig. 6 Temperature feature signal. 
Fig. 7 Acceleration feature signal. 
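The feature-selection criterion above can be sketched as follows. The series here are synthetic stand-ins; the real inputs are the time-domain features of each signal and the RUL sequence.

```python
import math

def pearson(x, y):
    """Karl Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_feature(features, rul):
    """Pick the feature whose |correlation with the RUL| is highest."""
    return max(features, key=lambda name: abs(pearson(features[name], rul)))
```

The absolute value matters: a feature that decreases monotonically as RUL decreases is just as informative as one that increases.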
4.3 Feature signal processing
4.3.1 Removal of outliers
The first step of signal processing is the removal of outliers. For a nonlinear, non-stationary signal, outliers can produce spurious harmonic components and thereby degrade the prediction accuracy. According to the statistical properties of the original data, the 3σ criterion is used here to remove the outliers: if a residual in equation (29) exceeds 3σ, the corresponding point is eliminated as an outlier. The temperature and vibration signals after removing the outliers are shown in Figures 8 and 9.
Fig. 8 Temperature signal after removing outliers. 
Fig. 9 Acceleration signal after removing outliers. 
4.3.2 Removal of the trend term
Zero drift of the amplifier caused by temperature variation, unstable low-frequency performance outside the frequency range of the sensor, ambient interference around the sensor, and similar effects often cause the vibration and temperature data collected in the life test to deviate from the baseline, and the degree of deviation may even vary over time. This deviation directly affects the correctness of the signal and should be removed as the trend term. From the perspective of engineering application, this paper adopts a simple and practical method to remove the trend term: the modified function method. The temperature and vibration signals after removing the trend term are shown in Figures 10 and 11.
Fig. 10 Smoothed temperature signal. 
Fig. 11 Smoothed acceleration signal. 
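The modified function method is not detailed in this section; as a common stand-in for trend-term removal, a least-squares straight line can be fitted and subtracted. The sketch below is written under that assumption and may differ from the paper's exact procedure.

```python
def detrend_linear(y):
    """Subtract the least-squares straight line from the series y
    (a simple baseline/trend-term removal)."""
    n = len(y)
    mx = (n - 1) / 2.0                       # mean of the index 0..n-1
    my = sum(y) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (v - my) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [v - (intercept + slope * i) for i, v in enumerate(y)]
```

Higher-order polynomial trends can be handled the same way by fitting and subtracting a low-degree polynomial instead of a line.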
4.3.3 Denoising
Wavelet analysis is known as the microscope of signal processing. The keys to wavelet analysis are the selection of the wavelet basis and of the decomposition level, and the decomposition level has a great influence on the denoising effect. The more decomposition levels are used, the lower the residual noise-to-signal ratio, but the slower the processing becomes. With too few decomposition levels, by contrast, the signal is split into only a few wide frequency bands; only the high-frequency detail coefficients can be processed to remove the corresponding noise, while the low-frequency noise is entirely retained. Therefore, the decomposition level should be neither too large, considering the processing cost, nor too small, considering the suppression of low-frequency noise. The purpose of denoising is to obtain the useful feature signal, so the wavelet coefficients should be able to resolve the minimum frequency component of the useful signal. Since wavelet decomposition splits the signal into independent frequency bands, with the deeper detail coefficients covering the lower-frequency part of the signal, this paper determines the maximum wavelet decomposition level from the minimum frequency of the useful signal. The sym8 wavelet is chosen as the wavelet basis and soft thresholding is used for denoising. The temperature and vibration signal denoising processes are shown in Figures 12–17.
Fig. 12 Low frequency temperature signal. 
Fig. 13 High frequency temperature signal. 
Fig. 14 Low frequency acceleration signal. 
Fig. 15 High frequency acceleration signal. 
Fig. 16 Denoising temperature signal. 
Fig. 17 Denoising acceleration signal. 
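Two pieces of the denoising procedure can be sketched without a wavelet library: choosing the deepest level from the minimum useful frequency (a common heuristic, assumed here: each decomposition level halves the analysed band, so the approximation band at level L is [0, fs/2^(L+1)]), and the soft-thresholding rule applied to the detail coefficients. The sym8 decomposition itself would typically be done with a package such as PyWavelets.

```python
import math

def max_wavelet_level(fs, f_min):
    """Deepest decomposition level whose approximation band
    [0, fs / 2**(L+1)] still contains the minimum useful frequency f_min."""
    return int(math.floor(math.log2(fs / (2.0 * f_min))))

def soft_threshold(coeffs, thr):
    """Soft thresholding: shrink each detail coefficient toward zero by thr,
    setting coefficients below the threshold to zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]
```

For the 25.6 kHz acceleration data and a useful band down to about 10 Hz, this heuristic gives a maximum of 10 levels.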
4.4 RUL prediction
We compare the prediction performance of the proposed method based on DS data fusion and SVRPF with that of methods using a single data source and with other prediction methods. Roller sleeve 1 is used for prediction in three cases: prediction from the acceleration data, prediction from the temperature data, and prediction from the fused data.
From Figures 18–20, it can be seen that the results of the proposed prediction method are more accurate than those of the other prediction methods.
Fig. 18 Predicted acceleration by temperature. 
Fig. 19 Predicted acceleration by acceleration. 
Fig. 20 Fusion prediction. 
5 Conclusion
In view of the engineering problem that the remaining useful life is difficult to predict accurately under a partially observable state, a new method based on DS data fusion and SVRPF is proposed.
From Table 5, it can be seen that, compared with the data-driven predictions obtained from the temperature or acceleration data alone, the prediction accuracy of the proposed method is significantly improved. Meanwhile, it provides a basis for maintenance decisions for equipment working under severe conditions, which further reduces the maintenance cost and improves the utilization rate and reliability of the equipment; the method therefore has good practicability and popularization value.
Comparison of results of residual life prediction.
Nomenclature
PHM: Prognostics and health management
BPA: Basic probability assignment
m_{1}(·) : BPA function 1 under the identification framework
m_{2}(·): BPA function 2 under the identification framework
m_{1,2}(·): BPA function after fusion m_{1} and m_{2} under the identification framework
K : Degree of conflict between the two evidences
a_{N} : Acceleration magnitude at time N
${a}_{T,N}^{*}$ : Prediction acceleration magnitude at time N obtained by temperature data
a_{T,N} : Acceleration magnitude at time N obtained by temperature data
${a}_{a,N}^{*}$ : Prediction acceleration magnitude at time N obtained by acceleration data
R : Value of Karl Pearson coefficient
α_{N}, β_{N} : Acceleration degradation parameters
m_{1,i}(T): BPA function of temperature at time i
${a}_{T,N+1}^{*}$ : Prediction acceleration magnitude at time N+1 obtained by temperature data
K_{N+1} : Degree of conflict between the two evidences at time N+1
N_{EOL} : Threshold of life cycle N
$\overline{L}^{*}$ : Prediction of remaining useful life
X_{k} : State of prediction at time k
V_{k},N_{k} : Measurement noise at time k
f(⋅),g(⋅): Transition function and measurement function
${X}_{T,k}^{T}$ : State of temperature at time k
${X}_{a,k}^{T}$ : State of acceleration at time k
${X}_{DS,k}^{T}$ : State of fusion at time k
λ_{T,k} : Temperature degradation parameters obtained by temperature at time k
${\mathrm{\lambda}}_{\mathit{T},\mathit{k}}^{*}$ : Prediction temperature magnitude obtained by temperature at time k
${T}_{T,k}^{*}$ : Prediction temperature magnitude obtained by temperature at time k
${a}_{T,k}^{*}$ : Prediction acceleration magnitude obtained by temperature at time k
X_{a,k} : State of acceleration at time k
${\lambda}_{{T}_{1},k}^{*}$,${T}_{1,k}^{*}$ : Acceleration degradation parameters obtained by acceleration at time k
${\lambda}_{T,k+1}^{*}$ : Prediction temperature degradation parameters obtained by acceleration at time k+1
v_{1,k} : State noise of the λ degradation parameter obtained by temperature at time k
${\lambda}_{T,{k}^{\Delta k}}^{*}$ : Distribution state of λ at time k
v_{2,k} : State noise of prediction temperature obtained by temperature at time k
v_{3,k} : State noise of prediction acceleration obtained by temperature at time k
${v}_{1,k}^{*}$ : State noise of the λ degradation parameter obtained by acceleration at time k
${v}_{2,k}^{*}$ : State noise of prediction acceleration obtained by temperature at time k
${v}_{3,k}^{*}$ : Measurement noise of prediction acceleration obtained by acceleration at time k
n_{1,k} : Measurement noise of prediction temperature at time k
${n}_{1,k}^{*}$ : Measurement noise of prediction acceleration degradation parameters at time k
References
 X. Zhang, X. Chen, B. Li, Life prediction of machinery major equipment: a review, J. Mech. Eng. 47 (2011) 100–116 [CrossRef] [Google Scholar]
 N.M. Vichare, M.G. Pecht, Prognostics and health management of electronics, IEEE Trans. Compon. Pack. Technol. 29 (2006) 222–229 [CrossRef] [Google Scholar]
 Y. Lei, Z. He, Z. Yanyang, Fault diagnosis based on the new model of hybrid intelligence, Mech. Eng. 44 (2008) 112–117 [CrossRef] [Google Scholar]
 V.N. Vapnik, Statistical learning theory, Wiley, New York, 1998, pp. 760–768 [Google Scholar]
 M. Sunghwan, L. Jumin, H. Ingoo, Hybrid genetic algorithms and support vector machines for bankruptcy prediction, Exp. Syst. Appl. 31 (2006) 652–660 [CrossRef] [Google Scholar]
 M. Nizam, M. Azah, H. Aini, Dynamic voltage collapse prediction in power systems using support vector regression, Exp. Syst. Appl. 37 (2010) 3730–3736 [CrossRef] [Google Scholar]
 H. Dong, X. Jin, Y. Lou, Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter, J. Power Sources 271 (2014) 114–123 [CrossRef] [Google Scholar]
 J. Llinas, D.L. Hall, An introduction to multisensor data fusion, IEEE Inter. Sym. Circ. Syst. 6 (1998) 537–540 [Google Scholar]
 S. Wu, W. Jiang, Research on data fusion fault diagnosis method based on DS evidence theory, IEEE Comp. Soc. 1 (2009) 689–692 [Google Scholar]
 J. Tian, W. Zhao, R. Du, DS evidence theory and its data fusion application in intrusion detection, Springer, Berlin, Heidelberg, 2000, pp. 244–251 [Google Scholar]
 H. Sorenson, D. Alspach, Recursive Bayesian estimation using Gaussian sums, J. Auto. 7 (1971) 465–479 [CrossRef] [Google Scholar]
 B. Ristic, S. Arulampalam, N. Gordon, Beyond the Kalman filterparticle filters for tracking applications, IEEE. Trans. Aero. Electr. Syst. 19 (2004) 37–38 [Google Scholar]
 J. Carpenter, P. Clifford, P. Fearnhead, Improved particle filter for nonlinear problems, IEEE Proc. Radar. Sonar. Navig. 146 (1999) 2–7 [CrossRef] [Google Scholar]
 A.F. Seila, Simulation and the Monte Carlo method, Tech. 24 (2012) 167–168 [Google Scholar]
 N. Metropolis, S. Ulam, The Monte Carlo method, J. Am. Stat. Assoc. 44 (1949) 335–341 [Google Scholar]
 J. Carpenter, P. Clifford, P. Fearnhead, Improved particle filter for nonlinear problems, IEEE Proc. Radar. Sonar Navig. 146 (1999) 1–7 [Google Scholar]
 R. Kalman, A new approach to linear filtering and prediction problems, Trans. ASME J. Basic Eng. 82 (1960) 35–45 [Google Scholar]
 C.M. Bishop, Pattern recognition and machine learning, Springer, New York, 2006, pp. 339–344 [Google Scholar]
 Z. Yinliang, Z. Changpeng, H. Bo, Z. Qinghua, Runtime support for typesafe and contextbased behavior adaptation, in: Presented at Computer and Information Technology (CIT), 2012 IEEE 12th International Conference, 2012 [Google Scholar]
 N. Kabaoglu, Target tracking using particle filters with support vector regression, IEEE Trans. Veh. Technol. 58 (2009) 2569–2573 [CrossRef] [Google Scholar]
 V. Vapnik, The nature of statistical learning theory, Springer, New York, 1995, pp. 225–259 [Google Scholar]
 G. Yao, K. Qingci, A. Yuhua, Method for eliminating data outliers based on wavelet transform, J. Air. Spac. TT&C. Technol. 25 (2006) 64–67 [Google Scholar]
 W. Lin, W. Liu, Establishment and application of spring maize yield to evapotranspiration boundary function in the Loess Plateau of China, Agric. Water Manag. 178 (2016) 345–349 [CrossRef] [Google Scholar]
 C. Anagnostopoulos, Qualityoptimized predictive analytics, Appl. Intel. 45 (2016) 1–13 [CrossRef] [Google Scholar]
 F. Zheng, L.Y. Liu, X.X. Liu, Y. Li, X.G. Shi, G.Y. Zhang, K.W. Huan, Study on outliers influence in NIR quantitative analysis model, Guang Pu Xue Yu Guang Pu Fen Xi 36 (2016) 3523–3529 [PubMed] [Google Scholar]
 P. Zhang, J. Chang, B. Qu, Q. Zhao, Denoising and trend terms elimination algorithm of accelerometer signals, Math. Prob. Eng. 2016 (2016) 1–9 [Google Scholar]
 E.L. Andreas, G. Treviño, Using wavelets to detect trends, J. Atmos. Ocean Technol. 12 (1997) 554–564 [CrossRef] [Google Scholar]
 S. Chen, S.A. Billings, W. Luo, Orthogonal least squares methods and their application to nonlinear system identification, Int. J. Control 50 (1989) 1873–1896 [CrossRef] [Google Scholar]
 C. Torrence, G.P. Compo, A practical guide to wavelet analysis, Bull. Am. Meteo. Soc. 79 (1998) 61–78 [Google Scholar]
 D.E. Newland, Wavelet analysis of vibration: Part 1—Theory, J. Vib. Acoust. 116 (1994) 21–37 [Google Scholar]
 I. Daubechies, The wavelet transform, timefrequency localization and signal analysis, IEEE Trans. Inf. Theory 36 (1990) 961–1005 [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
 Z.K. Peng, F.L. Chu, Application of the wavelet transform in machine condition monitoring and fault diagnostics: a review with bibliography, Mech. Syst. Sig. Process 18 (2004) 199–221 [CrossRef] [Google Scholar]
 H. Dong, X. Jin, Y. Lou, Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter, J. Power Sources 271 (2014) 114–123 [Google Scholar]
 W. Changhong, D. Hancheng, Remaining effective working time prediction method for vehicle lithium ion battery, J. Auto. Eng. 37 (2015) 476–479 [Google Scholar]
 Y. Lei, A modelbased method for remaining useful life prediction of machinery, IEEE Trans. Relia. 65 (2016) 1–13 [CrossRef] [Google Scholar]
 Y. Jie, Z. Xiaodong, Analysis and application of rolling bearing life calculation method, Petro. Mach. 32 (2004) 27–29 [Google Scholar]
Cite this article as: H. Liu, J. Wu, X. Ye, T. Liao, M. Chen, A method based on DempsterShafer theory and support vector regressionparticle filter for remaining useful life prediction of crusher roller sleeve, Mechanics & Industry 20, 106 (2019)