Open Access
Mechanics & Industry
Volume 25, 2024
Article Number 31
Number of page(s) 13
DOI https://doi.org/10.1051/meca/2024023
Published online 25 November 2024

© S. Torregrosa et al., Published by EDP Sciences, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

The analytical solution of engineering models is today compromised as these models become more and more complex. Since the mid-20th century [1], increasingly powerful computers have been developed and democratized, leading to the emergence of the so-called third paradigm of science: the “virtual twins”, or physics-based numerical simulations. A “virtual twin” emulates the complex behavior of a physical system using mathematical models [2]. Hence, numerical simulation has become an essential tool for the scientific investigation and analysis of complex systems in engineering, facilitating access to high-fidelity (HF) data and thus drastically reducing the number of experimental tests required [25].

However, the size and complexity of the problems studied still limit the capabilities of numerical simulation. Indeed, the more intricate the engineering system, the greater the computational resources needed and their associated cost. Achieving HF simulations thus remains expensive and time-consuming [6]. HF numerical simulations are therefore not adapted to real-time constraints, nor to industrial contexts where HF data is expected quickly and with little effort [5].

Among the best known “virtual twins” is computational fluid dynamics (CFD), which is studied in this paper. Since the Navier–Stokes equations cannot be solved analytically for realistic turbulent flows, CFD offers a numerical approach to compute such flow phenomena. Over the years, CFD has been widely studied and developed, leading to a high-fidelity tool whose most accurate approach is Direct Numerical Simulation (DNS) [7]. However, as highlighted above, such methods are extremely computationally expensive and still infeasible for numerous applications [8]. Simpler CFD approaches have been developed, such as large eddy simulation (LES) [9] or the Reynolds-Averaged Navier–Stokes (RANS) method [10], which is nowadays the industrial standard. Despite all the work done to improve their accuracy and cost [11,12], CFD simulations today remain either not fully trustworthy or too computationally expensive to run [8].

Moreover, the present data-based revolution leads to the fourth paradigm: “data-intensive science” [1]. A new framework is emerging where data, simulation and theory interact and reinforce each other. Indeed, thanks to widely developed machine learning and artificial intelligence techniques, data has proliferated massively in most scientific fields, and data-based models (also called “digital twins”) are receiving much attention in numerous and diverse applications. Hence, recent interest has focused on these new tools to either replace or enhance CFD [8]. Artificial intelligence is nowadays used in numerous forms, both for computational efficiency and to increase accuracy. Among these applications one can find:

  • Near wall turbulent flows modeling [13].

  • Turbulence model coefficients tuning [14,15].

  • Turbulence model enhancement by building a representation of closure terms [16,17].

  • Surrogate model of the numerical simulation trained with CFD data [18,19].

Likewise, borrowed from computer vision and image recognition, super-resolution aims at obtaining a high-resolution output image from a low-resolution input. Applied to fluid mechanics simulation, it leads to a machine learning model taking as input a fast, computationally inexpensive simulation and returning a high-fidelity one. As with any data-driven model, such an approach needs an offline training stage and database. In the literature, one can find super-resolution models trained with DNS high-resolution data and its purposely down-scaled counterpart. However, this can be judged unrealistic, since turbulent flow structures computed on coarse resolutions often differ significantly from their fine counterparts [5,20,21]. Physics-informed machine learning models can also be found. Such models usually do not need a HF training database and guarantee that the super-resolved fields are faithful to physical laws and principles; however, they need an optimization step in the online stage [22,23]. Finally, models correcting the error gap between coarse and high-fidelity simulations have also been developed, but only for scalar variables [24,25].

In this paper we focus on this last family of approaches: data-driven models combining both the “digital” and “virtual” twins to correct coarse CFD simulation fields [2]. On the one hand, we want the most correct solutions, but high-fidelity data is too expensive and time-consuming to generate.

On the other hand, we have access to computational facilities to run “as many as needed” coarse numerical simulations, but these present a non-negligible deviation from the actual solution (i.e. the high-fidelity data) of the system. It thus seems coherent to combine both technologies, exploiting the advantages of each. The “hybrid twin” (HT) is thus born, composed of an imperfect coarse simulation of the system (the “virtual twin”) and of a data-based model emulating the ignorance gap between coarse and high-fidelity simulation fields (the “digital twin”).

Indeed, since high-fidelity simulations are the most accurate but also too expensive, it becomes cheaper to learn the difference between some HF solutions and their coarse counterparts, allowing further coarse simulations to be corrected instead of computing the corresponding HF ones. Hence, the computational complexity of the HT methodology proposed here lies in the computation of the high-fidelity simulations: this data generation is expensive, with the complexity of usual CFD simulations. However, the construction of the data-based enrichment model (performed in an offline stage) that we propose is orders of magnitude cheaper: for the solutions presented here, it is of the order of one hour. Moreover, the online application of the data-based correction model is almost instantaneous, of the order of one second, which represents the key benefit of the Hybrid Twin rationale.

The question that arises now is: how do we model the ignorance, i.e. the gap between coarse and high-fidelity simulations? One could think of measuring this gap as a simple difference. However, since we are working with fields, this approach can lead to non-physical interpolated results when, for instance, a given choice of the problem parameters localizes the solution in different regions (as is typical in fluid mechanics). A smarter solution is to use Optimal Transport (OT) theory [26], which provides a mathematical framework to calculate distances between general objects and can be considered more physical in many fields. Such a solution has already been explored by the authors in previous works [27,28].

Indeed, Optimal Transport theory was applied by the authors in [28] to build a “digital twin”: the ignorance gap is learnt as a transport of information. In that previous work, the HT approach was applied to the error gap between experimental and numerical (the “virtual twin”) data, leading to a data-based model (the “digital twin”) able to correct a CFD simulation based on experimental knowledge. Indeed, physics-based simulations present significant deviations when compared to measurement data. These deviations are expected to be biased since they represent the ignorance of the modeler about the underlying physics, related to the inaccuracy of the employed models. Hence, artificial intelligence, acting as a black box, models the physical part that is beyond the modeler’s knowledge [2,4,29].

In this new paper, the same Optimal Transport based methodology is applied, but now over the error gap between coarse and high-fidelity numerical data. Hence, the coarse data is OT-based corrected by being optimally transported to the high-fidelity data. Here, the artificial intelligence models the ignorance of the modeler over the spatial discretization of the industrial system: the meshing step. Indeed, a high-fidelity mesh leads to accurate but costly numerical results while a coarse mesh leads to poorer but cheaper ones. It is important to note that, in contrast with the numerical-experimental ignorance gap, now it is a known ignorance since the difference between the HF and coarse mesh is part of the modeler’s knowledge.

The already published HT method is a two-stage approach: first, the OT-based “digital twin” is trained offline; then, this data-driven correction is applied to the “virtual twin” output in an online manner. Indeed, coarse simulations and their high-fidelity counterparts are used to train the OT-based correction model. Once trained, the “digital twin” can be used online to correct further coarse simulations from the “virtual twin”.

The steps of the offline stage are as follows. First, coarse simulations (the “virtual twin”) and their high-fidelity counterparts are computed in the parametric space of our problem. Then, all the training simulations are decomposed into a sum of identical Gaussian functions (also called particles) based on a Smoothed-Particle Hydrodynamics (SPH) decomposition [30]. Next, for each coarse-HF couple of the training set, each particle from the coarse data is matched with a particle from its corresponding high-fidelity data, and the OT-based differences between coarse and HF data are computed. Finally, the OT-based “digital twin” is built by training a Neural Network architecture over the OT-based gaps.

Once the correction data-based model of the ignorance is trained, it can correct, in an online manner, a new coarse simulation for which we do not have access to the high-fidelity data. To this purpose, the “virtual twin” output is decomposed into particles, and these particles are transported according to the OT-based gap interpolated by the “digital twin”. Finally, the expected high-fidelity data is reconstructed by summing all the particles, i.e. the Gaussian functions.

It can be noted that if the CFD solution of the studied problem is not regular with respect to changes in the problem parameters, the Kolmogorov n-width increases, calling for nonlinear dimensionality reduction, which the Optimal Transport approach proposed here performs [31–33]. Conversely, if the relation between the parameters and the corresponding solution is regular, the problem becomes simpler, and the proposed OT-based technique also performs accurately.

In this article, the principal ideas of OT theory and the main steps of the methodology are briefly presented. Indeed, we focus here on the new results more than on the methodology, which has already been detailed in [28]. It is important to note that even if the methodology is the same, the problem solved is conceptually completely different, moving from the simulation-experiment gap to the coarse-HF simulation gap.

2 Revisiting optimal transport

In this section, the OT framework is presented and the tools on which the OT-based correction model proposed hereafter is built are introduced. Note that this section is a non-exhaustive introduction to the principal ideas of OT theory; for further documentation on this topic, [34] and the references therein can be consulted.

The initial Optimal Transport problem was introduced by Monge [35]. It consisted in finding the optimal way to move a given quantity of soil from an initial to a target location, the cost of the transport being the distance traveled by the soil. Note that in this article we are only interested in the discrete formulation of the problem. In order to introduce it, let us consider M factories consuming a certain resource that needs to be transported from N mines. The cost function to minimize is the square of the total Euclidean distance traveled by the resource. This discrete Optimal Transport problem is illustrated in Figure 1.

On the one hand, each factory m ∈ ⟦M⟧ (note that the notation ⟦M⟧ corresponds to {1,…,m,…,M}) is located at ym and consumes an amount bm of the resource. On the other hand, each mine n ∈ ⟦N⟧ is located at xn and produces an amount an of this same resource. Hence, following the notion of measure, two distributions, α and β, corresponding to the resource produced and consumed respectively can be defined:

$$\alpha = \sum_{n=1}^{N} a_n\,\delta_{x_n} \quad\text{and}\quad \beta = \sum_{m=1}^{M} b_m\,\delta_{y_m}, \tag{1}$$

where $\delta_{x_n}$ and $\delta_{y_m}$ denote the Dirac masses at locations $x_n$ and $y_m$ respectively.

Solving the discrete Monge problem consists, thus, in finding the map T connecting each point xn with a single target point ym such that the transport cost is minimized. Here, we consider the square of the L2 distance between the mine n and its corresponding factory m:

$$C(x_n, y_m) = \|x_n - y_m\|_2^2. \tag{2}$$

Hence, the produced resource distribution α is pushed toward the consumed resource distribution β. Since the resource can be neither destroyed nor created during its transport, the transport map T: {x1,…,xN} → {y1,…,yM} must also satisfy mass conservation:

$$\forall m \in ⟦M⟧, \quad b_m = \sum_{n\,:\,T(x_n) = y_m} a_n. \tag{3}$$

Note that here the map T is a surjective function. Finally, we obtain the following minimization problem:

$$\min_{T} \sum_{n=1}^{N} C\big(x_n, T(x_n)\big). \tag{4}$$

In order to develop our data-driven correction model, we are interested in a simplified version of the discrete Monge problem just presented. First, it is supposed that N = M, i.e. that the number of factories and mines is the same. In addition, it is also supposed that every mine produces and every factory consumes the same quantity of resource, i.e. an = bm = 1/N. Therefore, the optimization problem (4) becomes a deterministic matching problem and the transport map becomes a bijective function.

Under these assumptions, the simplified discrete Monge problem can be easily solved by linear programming. Indeed, the problem is now equivalent to an optimal matching problem between two particle clouds, as illustrated in 2D in Figure 2. Note that each cloud is composed of the same number of particles, every particle carries the same mass, and the cost is defined as the squared L2 distance between two particles. In dimensions higher than 2D, the computational cost of the resolution increases but the problem itself does not become more complex.

The OT-based difference between two optimally paired particle clouds can be determined by calculating the Euclidean distances between matched particles. Indeed, the OT-based difference is the total sum of the L2 distances δk over all pairs of matched particles.
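The simplified matching problem described above can be sketched in a few lines of Python. Here the linear assignment is solved with SciPy's `linear_sum_assignment` (a stand-in for the MATLAB `matchpairs` routine used by the authors), and the OT-based difference is the total cost of the matched pairs; the two particle clouds are illustrative toy data.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def optimal_matching(coarse, hf):
    """Match two equal-size 2D particle clouds (N x 2 arrays) by solving
    the linear assignment problem with squared L2 distances as cost."""
    # Cost matrix: C[i, j] = ||coarse_i - hf_j||_2^2
    diff = coarse[:, None, :] - hf[None, :, :]
    cost = np.sum(diff**2, axis=-1)
    rows, cols = linear_sum_assignment(cost)
    # OT-based difference: total cost over the matched pairs
    return cols, cost[rows, cols].sum()

# Two small clouds with N = 3 particles each
coarse = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
hf = np.array([[0.1, 0.0], [0.0, 1.1], [1.0, 0.1]])
match, total = optimal_matching(coarse, hf)
```

`match[n]` gives, for each coarse particle n, the index of its matched high-fidelity particle, which is exactly the permutation φ discussed in the next section.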

thumbnail Fig. 1

Discrete OT formulation for N = 4 mines and M = 3 factories. The resource produced by the mines is: a1 = 3, a2 = 1, a3 = 1 and a4 = 2. The resource consumed by the factories is: b1 = 4, b2 = 1 and b3 = 2. The Euclidean distance traveled by the resource is the cost to minimize.

3 Hybrid twin based on optimal transport

The “hybrid twin” approach based on optimal transport, developed by the authors and published in [28], is briefly reviewed here. We strongly recommend that the interested reader refer to the previous article for all the details of its implementation. First, the offline stage, where the “digital twin” is trained, is introduced. Then, its online application, in combination with the “virtual twin”, is presented.

Let us suppose a parametric space $W(\eta_1,\dots,\eta_q,\dots,\eta_Q)$, where $\eta_q$, $q \in ⟦Q⟧$, are the parameters defining our parametric problem. Next, let us consider P 3D coarse-HF simulation couples in W, corresponding to the 3D simulations of the parametric problem using a coarse and a high-fidelity mesh respectively. The 3D fields are monitored on a defined 2D domain of interest Ω. Hence, each coarse and HF data sample is formally represented by a distribution $\psi: \Omega \subset \mathbb{R}^2 \to \mathbb{R}^+$. Note that the image of ψ is supposed strictly positive.

First, the OT-based ignorance model is trained offline, based on the coarse-HF simulation couples, following the next steps (colored in blue in Fig. 3):

Pre-processing: Normalization of the distributions corresponding to the coarse simulations and their high-fidelity counterparts to obtain unit-integral distributions:

$$\rho = \frac{\psi}{J} \quad\text{where}\quad J = \int_{\Omega} \psi \, d\Omega. \tag{5}$$

Particles decomposition: Every coarse and HF simulation is decomposed into a sum of N identical 2D Gaussian functions of fixed standard deviation σ and mass 1/N. It is important to note that the number of particles N and the standard deviation σ of each particle are hyperparameters of our methodology.

$$\bar{\rho}(x) = \sum_{n=1}^{N} G_{\mu_n,\sigma}(x) \quad\text{where}\quad G_{\mu_n,\sigma}(x) = \frac{1}{N\,2\pi\sigma^2}\,\exp\!\left(-\frac{\|x-\mu_n\|_2^2}{2\sigma^2}\right). \tag{6}$$

Hence, for a given distribution, the only variables are the means μn of each Gaussian function, i.e. N vectors of 2 components, μn,x and μn,y (since we are in 2 dimensions).

Therefore, we need to solve P × 2 minimization problems (i.e. two optimizations for each coarse-HF simulation couple) in order to place the N particles minimizing the error with respect to the original data. To this purpose a Gradient Descent approach is used to solve the optimization problem (here the subscript c denotes the coarse simulation):

$$\min_{\mu_c^p} \frac{1}{2}\,\big\|\rho_c^p - \bar{\rho}_c^p\big\|_2^2 = \min_{\mu_c^p} \frac{1}{2}\left[\sum_{i=1}^{D}\left(\rho_c^p(x_i) - \sum_{n=1}^{N} G_{\mu_{c,n}^p,\sigma}(x_i)\right)^2\right], \tag{7}$$

where D is the number of mesh points at which the distribution $\rho_c^p$ is evaluated.
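As an illustration of this decomposition step, the following sketch fits N = 2 Gaussian particles to a synthetic 2D field by minimizing the misfit of equation (7). For brevity it uses SciPy's L-BFGS optimizer instead of the plain gradient descent employed in the paper; the grid size, σ and the target field are arbitrary choices for the example.

```python
import numpy as np
from scipy.optimize import minimize

# Regular grid over the unit square standing in for the monitored 2D domain
xs = np.linspace(0.0, 1.0, 30)
grid = np.array([[x, y] for y in xs for x in xs])            # (D, 2)

def particle_sum(mu_flat, N=2, sigma=0.1):
    # Sum of N identical 2D Gaussian particles of mass 1/N, eq. (6)
    mu = mu_flat.reshape(N, 2)
    d2 = np.sum((grid[None, :, :] - mu[:, None, :])**2, axis=-1)
    return np.exp(-d2 / (2 * sigma**2)).sum(axis=0) / (N * 2 * np.pi * sigma**2)

# Synthetic "coarse field": two particles at known positions
mu_true = np.array([0.3, 0.3, 0.7, 0.7])
rho = particle_sum(mu_true)

# Fit the particle means by minimizing the squared misfit, eq. (7)
loss = lambda m: 0.5 * np.sum((rho - particle_sum(m))**2)
mu0 = mu_true + 0.05                     # slightly perturbed initial guess
res = minimize(loss, mu0, method="L-BFGS-B")
mu_fit = res.x.reshape(2, 2)
```

In practice the initialization and the optimizer matter, since the misfit is non-convex in the particle means; a perturbed start close to the basin of attraction, as above, converges reliably.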

Once the decomposition is computed, one can introduce the matrix $\mu_c^p \in \mathbb{R}^{N\times 2}$, composed of the x and y coordinates of all the particles of the coarse simulation of the pth couple (and likewise for the high-fidelity simulation, with subscript hf):

$$\mu_c^p = \begin{bmatrix} \mu_{c,1}^p \\ \vdots \\ \mu_{c,n}^p \\ \vdots \\ \mu_{c,N}^p \end{bmatrix} = \begin{bmatrix} [\mu_{c,1,x}^p,\ \mu_{c,1,y}^p] \\ \vdots \\ [\mu_{c,n,x}^p,\ \mu_{c,n,y}^p] \\ \vdots \\ [\mu_{c,N,x}^p,\ \mu_{c,N,y}^p] \end{bmatrix} \in \mathbb{R}^{N\times 2}. \tag{8}$$

It is important to note that the order of the particles in this matrix μcp is not arbitrary but will be used to represent the matching between point clouds of a given couple: the nth particle of one cloud being matched with the nth particle of the other cloud from the same couple.

P 2-dimensional matchings: The Optimal Transport behavior is calculated: for each coarse-HF simulation couple, each particle from the coarse simulation is matched with one particle from the high-fidelity counterpart. This is understood as the optimal matching problem between two N-particle clouds, where each particle is a 2D Gaussian function represented by its μx and μy coordinates. This linear assignment problem, where the cost function for the pth couple, $C_{c,hf}^p$, is the sum of the squared L2 distances between matched particles, can be solved by several algorithms. Here, the matchpairs algorithm from MATLAB is used [36].

$$C_{c,hf}^p(\phi^p) = \sum_{n=1}^{N} \big\|\mu_{c,\phi^p(n)}^p - \mu_{hf,n}^p\big\|_2^2, \tag{9}$$

where $\phi^p$ is a bijective function in the set of permutations of N elements: to each particle n of the distribution $\rho_c^p$, $\phi^p$ associates its new position, in the sense of the ordering in $\mu_c^p$.

OT-based gap: The OT-based gaps δp are calculated for all the P coarse-HF simulation couples by computing the difference between the coordinates of the matched N particles. Hence, the OT-based gap of the nth particle of the pth couple is noted:

$$\delta_n^p = [\delta_{n,x}^p,\ \delta_{n,y}^p] = [\mu_{c,n,x}^p - \mu_{hf,n,x}^p,\ \ \mu_{c,n,y}^p - \mu_{hf,n,y}^p]. \tag{10}$$

The OT-based difference for the pth couple, $\delta^p \in \mathbb{R}^{N\times 2}$, is built as:

$$\delta^p = \begin{bmatrix} \delta_1^p \\ \vdots \\ \delta_N^p \end{bmatrix} \in \mathbb{R}^{N\times 2}. \tag{11}$$

“Digital twin” training: The “digital twin” follows a Neural Network architecture. Indeed, two NNs are trained, one for the x coordinates of the OT-based gap and another for the y coordinates. The parametric input space of the NNs is built as follows: to each set of parameters p of the training set, we add the coordinates of the N particles of the pth “virtual twin” decomposition, $\mu_{c,x}^p \in \mathbb{R}^N$ and $\mu_{c,y}^p \in \mathbb{R}^N$. The new parametric space is noted $W'(\eta_1,\dots,\eta_q,\dots,\eta_Q,\mu_{c,x},\mu_{c,y})$. This yields a (P × N) × (Q + 2) matrix X of explanatory variables. Moreover, the (P × N) × 2 matrix Y of response variables is the concatenation of the P OT-based differences δp, p ∈ ⟦P⟧.
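The assembly of the explanatory and response matrices can be sketched as follows; the sizes P, N, Q and the random data are purely illustrative. Two regressors, one per gap coordinate, would then be fit on (X, Y[:, 0]) and (X, Y[:, 1]).

```python
import numpy as np

# Hypothetical sizes: P training couples, N particles, Q problem parameters
P, N, Q = 4, 5, 3
rng = np.random.default_rng(0)
params = rng.random((P, Q))      # one parameter set per coarse-HF couple
mu_c = rng.random((P, N, 2))     # coarse particle coordinates per couple
delta = rng.random((P, N, 2))    # OT-based gaps per couple

# Each particle is one training sample: repeat the couple's Q parameters
# for its N particles and append the particle coordinates -> (P*N) x (Q+2)
X = np.hstack([np.repeat(params, N, axis=0), mu_c.reshape(P * N, 2)])
# Responses: the concatenated OT-based gaps -> (P*N) x 2
Y = delta.reshape(P * N, 2)
```

This sample layout, one row per particle, is what lets the trained networks be evaluated on the particles of a new, unseen coarse simulation in the online stage.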

Then, the “digital twin” can be used in a partially online manner to correct a “virtual twin” coarse simulation, from which we do not have the high-fidelity counterpart. The online stage follows the next steps (colored in red in Fig. 3):

First, a new coarse simulation, computed in the parametric space W, is normalized and decomposed into N particles.

$$\mu_c = [\mu_{c,x},\ \mu_{c,y}] \in \mathbb{R}^{N\times 2} \quad\text{such that}\quad \bar{\rho}_c = \sum_{n=1}^{N} G_{\mu_{c,n},\sigma}(x). \tag{12}$$

Then, the “digital twin” returns the OT-based gap that must be added to these particles to obtain the particles of the high-fidelity counterpart. Indeed, the two NNs, applied over these N new inputs in W′, return respectively the x and y components of the OT-based correction for all the particles: δ ∈ ℝN×2. Adding this difference to the original particles leads to the corrected positions of the particles (subscript cor):

$$\mu_{cor} = \mu_c + \delta. \tag{13}$$

Next, the expected high-fidelity data $\hat{\rho}_{hf}$ is reconstructed by summing these N corrected Gaussian functions:

$$\hat{\rho}_{hf} = \sum_{n=1}^{N} G_{\mu_{cor,n},\sigma}(x). \tag{14}$$

Finally, in order to recover $\hat{\psi}_{hf}$, the gap between the total masses of the coarse-HF simulation couples, $J_c - J_{hf}$, is also interpolated by a NN architecture, and the normalization step is thus undone.
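The online stage just described can be summarized in a short sketch, with a stub standing in for the trained "digital twin" (the constant shift it predicts, and the grid, σ and predicted mass, are purely illustrative assumptions):

```python
import numpy as np

def gaussian_field(mu, sigma, grid):
    # Reconstruct the normalized field as a sum of identical particles, eq. (14)
    N = len(mu)
    d2 = np.sum((grid[None, :, :] - mu[:, None, :])**2, axis=-1)
    return np.exp(-d2 / (2 * sigma**2)).sum(axis=0) / (N * 2 * np.pi * sigma**2)

def online_correction(mu_c, digital_twin, J_hat, sigma, grid):
    delta = digital_twin(mu_c)       # predicted OT-based gap, N x 2
    mu_cor = mu_c + delta            # corrected particle positions, eq. (13)
    return J_hat * gaussian_field(mu_cor, sigma, grid)  # undo normalization

# Stub "digital twin": predicts a constant shift of +0.2 in x
# (illustrative only, not a trained model)
stub_twin = lambda mu: np.tile([0.2, 0.0], (len(mu), 1))

xs = np.linspace(0.0, 1.0, 51)
grid = np.array([[x, y] for y in xs for x in xs])
mu_c = np.array([[0.3, 0.5]])        # a single coarse particle
psi_hat = online_correction(mu_c, stub_twin, J_hat=2.0, sigma=0.1, grid=grid)
```

The corrected field peaks at the shifted particle position and its amplitude carries the predicted total mass, which is the only step of the pipeline that depends on the mass-gap network.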

thumbnail Fig. 2

2D Monge problem with N=M and an= bm = 1/N. Mines and factories are represented by red and blue circles respectively. Black arrows illustrate the optimal matching.

thumbnail Fig. 3

Summary diagram of the methodology: the offline training of the “digital twin” is colored in blue and the online OT-based “hybrid twin” approach in red.

4 Results

4.1 Error evaluation

First of all, the error evaluation methodology is defined. To this purpose, let us introduce a testing data set in the parametric space W. The $P_{test}$ reference solutions, i.e. the high-fidelity data $\psi_{hf}^p$, $p \in ⟦P_{test}⟧$, of this set are compared with the corrected coarse numerical solutions $\hat{\psi}_{hf}^p$ and with the “virtual twin” solutions $\psi_c^p$. Three error metrics are introduced: a maximum value amplitude error, a maximum value position error and an L2-Wasserstein error. In this section, and without loss of generality, the error metrics are presented for the OT-based corrected data.

First, the maximum value amplitude error is computed as the relative difference between the maximum amplitude of the high-fidelity solution and that of the OT-based corrected data. Hence, the maximum value amplitude error for the pth test point writes:

$$\varepsilon_{max}^p = 100\,\frac{\big|\max(\psi_{hf}^p) - \max(\hat{\psi}_{hf}^p)\big|}{\max(\psi_{hf}^p)}. \tag{15}$$

Next, the maximum value position error is calculated as the L2 norm between the positions in Ω of the maximum value of the high-fidelity simulation and of the maximum value of the “hybrid twin” solution. Note that this error is normalized with respect to lΩ, the length of one side of Ω:

$$\varepsilon_{pos}^p = \frac{\big\|\arg\max_x\big(\psi_{hf}^p(x)\big) - \arg\max_x\big(\hat{\psi}_{hf}^p(x)\big)\big\|_2}{l_\Omega}. \tag{16}$$
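A minimal implementation of these two metrics, assuming both fields are sampled at a common set of grid points, could read:

```python
import numpy as np

def max_amplitude_error(psi_hf, psi_cor):
    # Relative maximum-amplitude error in percent, eq. (15)
    return 100.0 * abs(psi_hf.max() - psi_cor.max()) / psi_hf.max()

def max_position_error(psi_hf, psi_cor, grid, l_omega):
    # Normalized L2 distance between the positions of the maxima, eq. (16);
    # grid is a (D, 2) array of the sampling points in Omega
    x_hf = grid[np.argmax(psi_hf)]
    x_cor = grid[np.argmax(psi_cor)]
    return np.linalg.norm(x_hf - x_cor) / l_omega
```

Both metrics only look at the field maxima, which is why the Wasserstein metric below is needed to compare the full shapes of the distributions.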

Finally, the squared L2-Wasserstein metric $W_2^2(\psi_{hf}^p, \hat{\psi}_{hf}^p)$ is calculated between the reference $\psi_{hf}^p$ and the corrected $\hat{\psi}_{hf}^p$ solutions. In order to calculate it, a Linear Programming methodology is followed.

The Kantorovich OT problem [37] corresponds to an infinite-dimensional Linear Program. Indeed, given the distributions $\psi_{hf}^p$ and $\hat{\psi}_{hf}^p$, defined on X and Y, the problem reads

$$W_2^2(\psi_{hf}^p, \hat{\psi}_{hf}^p) = \min_{\pi \in \Pi(\psi_{hf}^p, \hat{\psi}_{hf}^p)} \int_{X\times Y} c(x,y)\, d\pi(x,y), \tag{17}$$

where c(x, y): X × Y → ℝ is the cost function and Π the set of transfer plans. The discretized measures $\psi_{hf}^p$ and $\hat{\psi}_{hf}^p$ are defined as weighted sums of Dirac functions, the weights representing the value of the continuous measures evaluated at the corresponding nodes $x_i$ and $y_j$ of the mesh. Hence,

$$\psi_{hf}^p = \sum_{i=1}^{D} (\psi_{hf}^p)_i\, \delta_{x_i} \quad\text{and}\quad \hat{\psi}_{hf}^p = \sum_{j=1}^{D} (\hat{\psi}_{hf}^p)_j\, \delta_{y_j}. \tag{18}$$

The discrete cost function is defined as

$$C_{i,j} = c(x_i, y_j) = \|x_i - y_j\|_2^2. \tag{19}$$

Therefore, the discrete formulation of the Kantorovich Optimal Transport problem reads

$$W_2^2(\psi_{hf}^p, \hat{\psi}_{hf}^p) = \min_{\pi \in \Pi(\psi_{hf}^p, \hat{\psi}_{hf}^p)} \sum_{i,j} C_{i,j}\, \pi_{i,j}, \tag{20}$$

where $\pi_{i,j}$ represents the quantity of mass transported from $x_i$ towards $y_j$. The set of transfer plans reads

$$\Pi(\psi_{hf}^p, \hat{\psi}_{hf}^p) = \Big\{\, \pi = (\pi_{i,j}) \ \Big|\ \sum_j \pi_{i,j} = (\psi_{hf}^p)_i,\ \ \sum_i \pi_{i,j} = (\hat{\psi}_{hf}^p)_j \,\Big\}. \tag{21}$$
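For small supports, the discrete Kantorovich problem (20)-(21) is a standard linear program and can be solved with SciPy's `linprog`; the marginal constraints are the row and column sums of the flattened transport plan. This is an illustrative sketch, not necessarily the LP solver used by the authors.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein2_lp(a, xs, b, ys):
    """Squared 2-Wasserstein distance between discrete measures
    (weights a at points xs, weights b at points ys) via the
    Kantorovich linear program, eqs. (20)-(21)."""
    n, m = len(a), len(b)
    # Cost vector: squared Euclidean distances, flattened row-major
    C = np.sum((xs[:, None, :] - ys[None, :, :])**2, axis=-1).ravel()
    # Equality constraints: row sums equal a, column sums equal b
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0      # sum_j pi_ij = a_i
    for j in range(m):
        A_eq[n + j, j::m] = 1.0               # sum_i pi_ij = b_j
    b_eq = np.concatenate([a, b])
    res = linprog(C, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun
```

The dense plan has n·m variables, so this direct formulation is only practical for coarse discretizations; dedicated OT solvers scale far better on fine meshes.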

It should be noted that, in order to analyse the three different errors, the value of each error metric presented corresponds to the mean over the p ∈ ⟦Ptest⟧ points of the test set. Hence, the three mean error metrics are noted:

$$\bar{\varepsilon}_{max}, \quad \bar{\varepsilon}_{pos} \quad\text{and}\quad \bar{W}_2^2. \tag{22}$$

4.2 Fluid dynamics problem

In this section, the methodology developed is applied to a fluid dynamics problem. A 3D steady turbulent flow in a channel facing a backward ramp is studied, as illustrated in Figure 4. As indicated in (23), the fluid, considered incompressible, has a uniform velocity profile at the inlet boundary ΩInlet with an inlet velocity vInlet.

The geometry used in this paper is close to that of Ahmed’s study; its slant angle corresponds nearly to the drag minimum found in that study [38]. A no-slip condition is imposed on the walls ΩWall and a zero-gradient condition on the outlet section ΩOutlet. Therefore, the problem writes:

$$\begin{cases} (\mathbf{v}\cdot\nabla)\mathbf{v} = -\dfrac{1}{\rho}\nabla p + \nu\,\nabla^2 \mathbf{v} & \text{in } \Omega_{Channel},\\ \nabla\cdot\mathbf{v} = 0 & \text{in } \Omega_{Channel},\\ \mathbf{v}(x=0,y,z) = v_{Inlet}\,\mathbf{x} & \text{in } \Omega_{Inlet},\\ \mathbf{v} = 0 & \text{on } \Omega_{Wall},\\ \nabla\mathbf{v}\cdot\mathbf{n} = 0 & \text{on } \Omega_{Outlet}, \end{cases} \tag{23}$$

where ρ is the density, ν the kinematic viscosity, n the outward normal from ΩOutlet and x the unit vector of the x axis. The turbulence model chosen is k-ω SST with a stepwise switch wall function:

$$\begin{cases} \omega = \omega_{vis} = \dfrac{6\,\nu_w}{\beta_1\, y^2} & \text{if } y^+ \le y^+_{lam},\\[2mm] \omega = \omega_{log} = \dfrac{\sqrt{k}}{C_\mu^{0.25}\,\kappa\, y} & \text{if } y^+ > y^+_{lam}, \end{cases} \tag{24}$$

where ω is the specific dissipation rate, k the turbulent kinetic energy, y the wall-normal distance, Cμ and β1 model constants, νw the kinematic viscosity of the fluid near the wall, κ the von Kármán constant, y+ the estimated wall-normal distance of the cell center in wall units and y+lam the estimated intersection of the viscous and inertial sub-layers in wall units.
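The stepwise switch of equation (24) is straightforward to evaluate. In the sketch below, the default constant values are the usual k-ω SST ones and are assumptions for illustration, not values quoted from the paper.

```python
import numpy as np

def omega_wall(k, y, y_plus, nu_w, y_plus_lam=11.0,
               beta1=0.075, C_mu=0.09, kappa=0.41):
    """Stepwise switch wall function for omega, eq. (24).
    Constants are typical k-omega SST defaults (assumed here)."""
    omega_vis = 6.0 * nu_w / (beta1 * y**2)                 # viscous sublayer
    omega_log = np.sqrt(k) / (C_mu**0.25 * kappa * y)       # log-law region
    return np.where(y_plus <= y_plus_lam, omega_vis, omega_log)
```

Because `np.where` is vectorized, the same call evaluates the switch cell by cell over whole near-wall fields of k, y and y+.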

The geometry is parameterized as shown in Figure 5. The numerical values chosen for each parameter are gathered in Table 1. Hence, the parameters defining the parametric space are vInlet, ν and α: $W(v_{Inlet}, \nu, \alpha) \subset \mathbb{R}^3$. Note that, once α is determined, and since h3 is fixed, h1 and h2 are also fixed by:

$$h_2 = l_2\, \tan(90° - \alpha) \quad\text{and}\quad h_1 = H - h_3 - h_2. \tag{25}$$

Likewise, by fixing L, l1 and l2, l3 is thus also fixed by:

$$l_3 = L - l_2 - l_1. \tag{26}$$
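Equations (25) and (26) amount to simple arithmetic on the geometric parameters; a hypothetical helper (the function name and the degree convention for α are assumptions) could read:

```python
import numpy as np

def derived_geometry(H, h3, L, l1, l2, alpha_deg):
    """Derived geometric parameters from eqs. (25)-(26); the tangent is
    taken on the complementary angle, with alpha given in degrees."""
    h2 = l2 * np.tan(np.radians(90.0 - alpha_deg))   # eq. (25)
    h1 = H - h3 - h2                                 # eq. (25)
    l3 = L - l2 - l1                                 # eq. (26)
    return h1, h2, l3
```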

In order to solve the problem, the channel is meshed with a hexahedral mesh. On the one hand, the geometry is meshed with a high-fidelity mesh, shown in Figure 6, for the high-fidelity data (i.e. the reference simulations). On the other hand, it is meshed with a coarse mesh, shown in Figure 7, for the coarse data (i.e. the “virtual twin” simulations). Note that in Figure 5 the number of nodes per edge for the coarse and HF meshes is indicated in red and blue respectively.

The OpenFOAM Computational Fluid Dynamics code is used to solve both finite volume problems. The SIMPLE solver is selected to solve the Navier-Stokes equations. The convergence of the simulations is ensured by controlling the residuals. Moreover, the norm of the velocity field is monitored on a plane of interest PoI perpendicular to the channel at x = lPoI, as represented in Figure 4. In order to compare the high-fidelity simulation with the coarse one, the norms of the two velocity fields are presented. First, a longitudinal cut of the channel is shown in Figure 8 for both the HF mesh (top) and the coarse one (bottom). The flow behavior (recirculation bubble) of the Ahmed body is observed [38]: here, for a low angle α, the flow remains attached over the ramp. Then, a perpendicular cut of the channel at the plane of interest PoI is shown in Figure 9 for both the HF mesh (left) and the coarse one (right). It can be noted that the poor mesh leads to different physical results with respect to the high-fidelity mesh.

Here, the high-fidelity results, and their coherence with the fine mesh and the chosen numerical model, are analyzed from a physical point of view. First, the dimensionless wall distance parameter y+ is analyzed. One can interpret y+ as a local Reynolds number, whose magnitude can be expected to determine the relative importance of viscous and turbulent processes. This parameter writes

$$y^+ = \frac{u_\tau\, z}{\nu} \quad\text{with}\quad u_\tau = \sqrt{\frac{\tau_w}{\rho}} \quad\text{and}\quad \tau_w = \rho\,\nu \left(\frac{dU}{dz}\right)_{wall}, \tag{27}$$

where uτ is the friction velocity, z the absolute distance from the wall, U the velocity magnitude and τw the wall shear stress.
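Equation (27) can be evaluated directly from the wall velocity gradient; the following helper and its numeric values are illustrative only:

```python
import numpy as np

def y_plus(dUdz_wall, z, nu, rho=1.0):
    """Dimensionless wall distance, eq. (27): wall shear stress from the
    wall-normal velocity gradient, friction velocity, then y+ = u_tau*z/nu."""
    tau_w = rho * nu * dUdz_wall         # wall shear stress
    u_tau = np.sqrt(tau_w / rho)         # friction velocity
    return u_tau * z / nu
```

Note that y+ depends on ρ only through τw, so for the normalized form above the density cancels out of the final expression.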

Turbulence models are usually not solved down to the wall. Indeed, at some distance from the wall, boundary conditions are applied through so-called wall functions. Depending on the turbulence model used, different wall functions are applied, and a corresponding value of y+ at the wall mesh cell is required. As indicated before, the k-ω SST model is applied here with a wall function. In such a case, the OpenFOAM documentation recommends a y+ value between 30 and 300. Computing the y+ value across our parametric space, as shown for one solution of the problem in Figure 10, we obtain values from 50 to 150, which validates our high-fidelity mesh.

Then, the boundary layer velocity profiles are presented in Figure 11. The x coordinate of the velocity field is represented along a straight line, perpendicular to the wall, through the boundary layer for different x positions. On the one hand, upstream of the ramp, it can be observed how the boundary layer grows and merges with the opposite wall boundary layer, leading to a fully developed velocity profile. On the other hand, over the ramp, it can be observed how the boundary layer thickness grows progressively until reaching a separation point where a negative x component of the velocity field appears, leading to boundary layer separation and to a flow recirculation zone.

Next, the x coordinate of the velocity field is analyzed with a contour plot, as it can be seen in Figure 12. The darkest blue contour zone represents the negative values of the field. Again, the flow recirculation over the ramp can be observed. Moreover, a big flow recirculation, downstream of the ramp, due to the previous flow detachment, can easily be seen. Further downstream, it can be observed how the flow reattaches to the wall.

Finally, the flow is studied using streamlines colored by the x coordinate of the velocity field, as it is presented in Figure 13 for 2 different solutions of the problem. On the top solution, both recirculation zones can again be observed. Moreover, the plane of interest, where the coarse-CFD solution is corrected, is represented by a violet line. It should be noted that, depending on the solution of the problem, the recirculation zone can be included or not in this plane of interest.

The resolution of the problems with both meshes is now presented in the plane of interest PoI (i.e. x= lPoI = 100 cm) for the test data set, as it can be seen in Figure 14. It can be observed that the flow behavior depends on the mesh. Indeed, a difference in the amplitude, position and shape of the velocity field can clearly be seen.

Then, the trained OT-based “digital twin” is applied to the “virtual twin” simulations of the test set. The database is composed of 100 simulations: 90 for training and 10 for testing. Note that the training set is sampled following a classic Latin Hypercube Sampling strategy [39,40]. The results are presented in Figure 15. It can be observed that the OT-based correction leads to a solution very close to the high-fidelity data. In order to quantify the accuracy improvement, the original coarse simulations and the corrected ones are compared with the high-fidelity data. The three error metrics are presented in Table 2. It can be observed that the original ignorance gap between the “virtual twin” and the high-fidelity simulations is considerably reduced thanks to the OT-based ignorance model.

Fig. 4. Problem geometry schema.

Fig. 5. Parameterized geometry: the numbers of nodes per edge for the coarse and HF meshes are indicated in red and blue respectively.

Table 1. Numerical values for the geometrical parameters.

Fig. 6. High-fidelity mesh.

Fig. 7. Coarse mesh.

Fig. 8. Norm of the velocity field in a longitudinal cut of the channel for both the HF mesh (top) and the coarse one (bottom).

Fig. 9. Norm of the velocity field in a perpendicular cut of the channel at the plane of interest PoI for both the HF mesh (left) and the coarse one (right).

Fig. 10. Contour plot of the y+ coefficient for a given solution of the problem. The high-fidelity mesh is plotted over the contour plot.

Fig. 11. Boundary layer velocity profiles at different x values for a given solution of the problem: (a) upstream of the ramp and (b) over the ramp.

Fig. 12. Contour plot of the x component of the velocity field for a given solution of the problem.

Fig. 13. Streamlines of the flow, colored by the intensity of the x component of the velocity field, for two different solutions of the problem. The plane of interest is represented by a violet straight line at x = lPoI.

Fig. 14. Test set data points: (a) OpenFOAM solutions of problem (23) corresponding to the high-fidelity simulation; (b) OpenFOAM solutions of problem (23) corresponding to the coarse simulation.

Fig. 15. “Hybrid twin” results in the test set: “virtual twin” simulations corrected by the OT-based “digital twin”.

Table 2. Mean error metrics for the coarse data and the OT-based HT corrected data when compared with the reference high-fidelity data in the test set.

5 Conclusion

CFD is today a widely used tool for scientific investigation and analysis of complex systems in engineering, allowing the numerical resolution of realistic turbulent flows which cannot be solved analytically. Despite all the work done to improve their accuracy and cost, CFD simulations remain today either not fully trustworthy or too computationally expensive to run in many industrial contexts. The “hybrid twin” rationale brings a solution to this problem by correcting fast but inaccurate coarse simulations, bringing them close to precise but costly high-fidelity simulations. However, in fields such as fluid dynamics, filling the ignorance gap of the mesher with classical data-driven models leads to non-physical results. By combining the “hybrid twin” rationale with the simplified Optimal Transport Monge problem, our approach yields an OT-based “digital twin” able to correct “virtual twin” coarse simulations from an OT point of view. This OT-based “hybrid twin” methodology was presented in a previous paper by the authors in order to correct the simulation-experiment ignorance gap. In the present paper, the same approach has been applied to a completely different scenario: correcting the coarse-HF simulation ignorance gap. The previously proposed OT-based “hybrid twin” methodology is thus shown to be able to correct coarse numerical simulations, giving solutions very close to their high-fidelity counterparts and leading to faster and cheaper access to data almost as accurate as precise but costly numerical solutions.
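To make the OT rationale behind the correction concrete, the following is a minimal one-dimensional sketch, not the authors' implementation: for two equal-size point clouds with uniform weights, the Monge-optimal map is the monotone (sorting) matching, and a displacement interpolation slides a coarse sample toward its high-fidelity counterpart along that map. All names and the toy data below are illustrative.

```python
import numpy as np

def monge_map_1d(source, target):
    """Optimal transport map between two equal-size 1D point clouds with
    uniform weights: the monotone (sorted) matching is optimal."""
    src_order = np.argsort(source)
    tgt_sorted = np.sort(target)
    mapped = np.empty_like(source, dtype=float)
    mapped[src_order] = tgt_sorted   # i-th smallest source -> i-th smallest target
    return mapped

def displacement_interpolation(source, target, s):
    """Slide each source particle a fraction s of the way along the map
    (s = 0 returns the coarse cloud, s = 1 the high-fidelity one)."""
    return (1.0 - s) * source + s * monge_map_1d(source, target)

# Toy usage: 'coarse' and 'HF' samples of shifted, rescaled distributions.
rng = np.random.default_rng(1)
coarse = rng.normal(0.0, 1.0, size=200)
hf = rng.normal(0.5, 1.2, size=200)
corrected = displacement_interpolation(coarse, hf, s=1.0)
```

The interpolation transports the samples rather than averaging field values pointwise, which is why an OT-based correction can move and reshape flow features (amplitude, position, shape) instead of producing the non-physical blends that pointwise data-driven interpolation yields.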

Acknowledgments

We thank Angelo Pasquale of PIMM Laboratory at Arts et Métiers Institute of Technology for providing helpful assistance and advice on the CFD simulations with OpenFOAM. The research work was carried out at Stellantis as part of a CIFRE (Conventions Industrielles de Formation par la REcherche) thesis.

Funding

This research received no external funding.

Conflicts of interest

The authors have nothing to disclose.

Data availability statement

The data presented in this study are available on request from the corresponding author.

Author contribution statement

Conceptualization, S.T., V.C., A.A., V.H., and F.C.; Methodology, S.T., V.C., A.A. and F.C.; Software, S.T., V.C. and A.A.; Validation, S.T.; Formal Analysis, S.T. and V.C.; Investigation, S.T. and V.C.; Resources, V.H. and F.C.; Data Curation, S.T. and V.C.; Writing—Original draft preparation, S.T.; Writing—Review and editing, S.T., V.C., A.A., V.H., and F.C.; Visualization, S.T.; Supervision, V.H. and F.C.; Project administration, V.H. and F.C.; Funding acquisition, V.H. and F.C. All authors have read and agreed to the published version of the manuscript.

References

  1. T. Hey, S. Tansley, K. Tolle, The fourth paradigm: data-intensive scientific discovery (Microsoft Research, Redmond, 2009) [Google Scholar]
  2. F. Chinesta, E.G. Cueto, E. Abisset-Chavanne, J.L. Duval, F. El Khaldi, Virtual, digital and hybrid twins: a new paradigm in data-based engineering and engineered data, Arch. Comput. Methods Eng. (2019) [Google Scholar]
  3. Q. Chatenet, A. Tahan, M. Gagnon, J. Chamberland-Lauzon, Numerical model validation using experimental data: application of the area metric on a Francis runner, IOP Conf. Ser.: Earth Environ. Sci. 49 (2016) [Google Scholar]
  4. C.J. Freitas, The issue of numerical uncertainty, Appl. Math. Modell. 26, 237–248 (2002) [CrossRef] [Google Scholar]
  5. B. Liu, J. Tang, H. Huang, X.Y. Lu, Deep learning methods for super-resolution reconstruction of turbulent flows, Phys. Fluids 32, 025105 (2020) [CrossRef] [Google Scholar]
  6. S.L. Brunton, B.R. Noack, Closed-loop turbulence control: progress and challenges, Appl. Mech. Rev. 67, 050801 (2015) [CrossRef] [Google Scholar]
  7. P. Moin, K. Mahesh, Direct numerical simulation: a tool in turbulence research, Annu. Rev. Fluid Mech. 30, 539–578 (1998) [CrossRef] [Google Scholar]
  8. G. Calzolari, W. Liu, Deep learning to replace, improve, or aid CFD analysis in built environment applications: a review, Build. Environ. 206, 108315 (2021) [CrossRef] [Google Scholar]
  9. J.W. Deardorff, A numerical study of three-dimensional turbulent channel flow at large Reynolds numbers, J. Fluid Mech. 41, 453–480 (1970) [CrossRef] [Google Scholar]
  10. S.B. Pope, Turbulent Flows (IOP Publishing, 2001) [Google Scholar]
  11. M.J. Berger, M.J. Aftosmis, D. Marshall, S.M. Murman, Performance of a new CFD flow solver using a hybrid programming paradigm, J. Parallel Distrib. Comput. 65, 414–423 (2005) [CrossRef] [Google Scholar]
  12. P.R. Spalart, A.V. Garbaruk, A new “λ 2” term for the Spalart-Allmaras turbulence model, active in axisymmetric flows, Flow Turbul. Combust. 107, 1–12 (2021) [Google Scholar]
  13. M. Milano, P. Koumoutsakos, Neural network modeling for near wall turbulent flow, J. Comput. Phys. 182, 1–26 (2002) [CrossRef] [Google Scholar]
  14. S. Luo, M. Vellakal, S. Koric, V. Kindratenko, J. Cui, Parameter identification of RANS turbulence model using physics-embedded neural network, Lecture Notes in Computer Science (Springer International Publishing, 2020), pp. 137–149 [Google Scholar]
  15. S. Yarlanki, B. Rajendran, H. Hamann, Estimation of turbulence closure coefficients for data centers using machine learning algorithms, in 13th InterSociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (2012) [Google Scholar]
  16. J. Ling, A. Kurzawski, J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, J. Fluid Mech. 807, 155–166 (2016) [CrossRef] [MathSciNet] [Google Scholar]
  17. B.D. Tracey, K. Duraisamy, J.J. Alonso, A machine learning strategy to assist turbulence model development, in 53rd AIAA Aerospace Sciences Meeting (2015) [Google Scholar]
  18. D. Hintea, J. Brusey, E. Gaura, A study on several machine learning methods for estimating cabin occupant equivalent temperature, in Proceedings of the 12th International Conference on Informatics in Control, Automation and Robotics (2015) [Google Scholar]
  19. T. Zhang, X. You, Applying neural networks to solve the inverse problem of indoor environment, Indoor Built Environ. 23, 1187–1195 (2013) [Google Scholar]
  20. Z. Deng, C. He, Y. Liu, K.C. Kim, Super-resolution reconstruction of turbulent velocity fields using a generative adversarial network-based artificial intelligence framework, Phys. Fluids 31 (2019) [Google Scholar]
  21. K. Fukami, K. Fukagata, K. Taira, Machine-learning-based spatio-temporal super resolution reconstruction of turbulent flows, J. Fluid Mech. 909 (2021) [CrossRef] [Google Scholar]
  22. M. Aliakbari, M. Mahmoudi, P. Vadasza, A. Arzani, Predicting high-fidelity multiphysics data from low-fidelity fluid flow and transport solvers using physics-informed neural networks, Int. J. Heat Fluid Flow (2022) [Google Scholar]
  23. H. Gao, L. Sun, J.-X. Wang, Super-resolution and denoising of fluid flow using physics-informed convolutional neural networks without high-resolution labels, Phys. Fluids 33 (2021) [Google Scholar]
  24. H. Bao, J. Feng, N. Dinh, H. Zhang, Deep learning interfacial momentum closures in coarse-mesh CFD two-phase flow simulation using validation data, Int. J. Multiphase Flow 135, 103489 (2021) [CrossRef] [Google Scholar]
  25. B.N. Hanna, N.T. Dinh, R.W. Youngblood, I.A. Bolotnov, Machine-learning based error prediction approach for coarse-grid computational fluid dynamics (CG-CFD), Prog. Nucl. Energy 118, 103140 (2020) [CrossRef] [Google Scholar]
  26. C. Villani, Optimal Transport, Old and New (Springer, 2006) [Google Scholar]
  27. S. Torregrosa, V. Champaney, A. Ammar, V. Herbert, F. Chinesta, Surrogate parametric metamodel based on optimal transport, Math. Comput. Simulat. (2021) [Google Scholar]
  28. S. Torregrosa, V. Champaney, A. Ammar, V. Herbert, F. Chinesta, Hybrid twins based on optimal transport, Comput. Math. Appl. 127, 12–24 (2022) [CrossRef] [MathSciNet] [Google Scholar]
  29. W.L. Oberkampf, S.M. De Land, B.M. Rutherford, K.V. Diegert, K.F. Alvin, Error and uncertainty in modeling and simulation, Reliab. Eng. Syst. Saf. 75, 333–357 (2002) [CrossRef] [Google Scholar]
  30. S. Lind, B. Rogers, P. Stansby, Review of smoothed particle hydrodynamics: towards converged Lagrangian flow modelling, Proc. R. Soc. A 476 (2020) [Google Scholar]
  31. A. Pinkus, N-widths in approximation theory, Springer Science and Business Media, 7 (2012) [Google Scholar]
  32. A. Kolmogoroff, Über die beste Annäherung von Funktionen einer gegebenen Funktionenklasse, Ann. Math. 107–110 (1936) [CrossRef] [MathSciNet] [Google Scholar]
  33. B. Peherstorfer, Breaking the Kolmogorov barrier with nonlinear model reduction, Notic. Am. Math. Soc. 69, 725–733 (2022) [Google Scholar]
  34. G. Peyré, M. Cuturi, Computational optimal transport, Found. Trends Mach. Learn. 11, 355–607 (2019) [CrossRef] [Google Scholar]
  35. G. Monge, Mémoire sur la théorie des déblais et des remblais, Histoire de l'Académie Royale des Sciences de Paris (1781), pp. 666–704 [Google Scholar]
  36. I.S. Duff, J. Koster, On algorithms for permuting large entries to the diagonal of a sparse matrix, SIAM J. Matrix Anal. Appl. 22, 973–996 (2000) [Google Scholar]
  37. L. Kantorovich, On the transfer of masses (in Russian), Doklady Akademii Nauk 37, 227–229 (1942) [Google Scholar]
  38. S.R. Ahmed, G. Ramm, G. Faltin, Some salient features of the time-averaged ground vehicle wake, SAE Technical Paper, 840300 (1984) [Google Scholar]
  39. M.D. McKay, W.J. Conover, R.J. Beckman, A comparison of three methods for selecting values of input variables in the analysis of output from a computer code, Technometrics 21, 239–245 (1979) [MathSciNet] [Google Scholar]
  40. M. Stein, Large sample properties of simulations using latin hypercube sampling, Technometrics 29, 143–151 (1987) [CrossRef] [MathSciNet] [Google Scholar]

Cite this article as: S. Torregrosa, V. Champaney, A. Ammar, V. Herbert, F. Chinesta, Predicting high-fidelity data from coarse-mesh computational fluid dynamics corrected using hybrid twins based on optimal transport, Mechanics & Industry 25, 31 (2024), https://doi.org/10.1051/meca/2024023


All Figures

Fig. 1. Discrete OT formulation for N = 4 mines and M = 3 factories. The resource produced by the mines is a1 = 3, a2 = 1, a3 = 1 and a4 = 2. The resource consumed by the factories is b1 = 4, b2 = 1 and b3 = 2. The Euclidean distance traveled by the resource is the cost to minimize.

Fig. 2. 2D Monge problem with N = M and an = bm = 1/N. Mines and factories are represented by red and blue circles respectively. Black arrows illustrate the optimal matching.

Fig. 3. Summary diagram of the methodology: the offline training of the “digital twin” is colored in blue and the online OT-based “hybrid twin” approach in red.
