CFC2023

Generic high-performance coupling of massively parallel CFD solver to neural network inferences

  • Serhani, Anass (CERFACS)
  • Lapeyre, Corentin (CERFACS)
  • Staffelbach, Gabriel (CERFACS)


Scientific machine learning aims to combine data-driven and physics-based models, and is a promising avenue for improving high-fidelity computational fluid dynamics (CFD). Notably, deep learning (DL) has been shown to exceed the accuracy of state-of-the-art physical closure models, and the emergence of general-purpose computing on Graphics Processing Units (GPUs) enables efficient inference. However, exploiting DL models within massively parallel CFD solvers raises both performance and implementation challenges. Such solvers run reliably on thousands of compute cores and are typically written in compiled languages such as C, C++, and Fortran to achieve good resource usage and parallel scaling. Conversely, deep learning models are commonly designed and trained in interpreted languages such as Python, which hosts the leading machine learning frameworks TensorFlow, PyTorch, and JAX. Moreover, the data representation in CFD solvers (irregular meshes) differs from that favored by mainstream DL technologies (pixels and voxels). Connecting these two worlds is therefore not straightforward.

This work explores a coupling strategy that enables simple and effective coupling between these programming frameworks, showcased in a prototype solver named AVBP-DL, an extension of the Fortran-based CFD solver AVBP, which simulates compressible reacting turbulent flows on unstructured grids. AVBP-DL offers two operating modes: (i) coupling with interpolation, used when the DL model operates on its own mesh, distinct from that of the CFD solver; the CWIPI library performs the communications and interpolations, and this mode has been demonstrated for modeling flame-turbulence sub-grid-scale interactions with a convolutional U-Net architecture; and (ii) direct coupling, used when the Python DL model can manipulate the solver data directly; the communications rely on classical MPI subroutines, and this mode was showcased in a study replacing algebraic wall models with predictions from a mesh-graph-net architecture.
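
As a minimal sketch of what the direct coupling of mode (ii) could look like on the Python side, the loop below receives a partition of the flow state from the solver over MPI (via mpi4py), runs an inference, and returns the prediction. All names, message tags, and array sizes here are illustrative assumptions rather than the actual AVBP-DL interface, and the model is a stand-in for a trained TensorFlow, PyTorch, or JAX network; the Fortran solver would issue the matching MPI_SEND / MPI_RECV calls at each coupling step.

    # Illustrative sketch of the Python side of a direct MPI coupling (mode ii).
    # Tags, sizes, and the placeholder `model` are assumptions for illustration,
    # not the actual AVBP-DL interface.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD                 # communicator shared with the solver ranks
    SOLVER_RANK = 0                       # assumed rank of the paired solver process
    TAG_FIELDS, TAG_PREDICTION = 10, 11   # assumed message tags
    N_NODES, N_VARS = 1024, 5             # assumed local partition size / variable count
    N_STEPS = 100                         # assumed number of coupling iterations

    def model(fields: np.ndarray) -> np.ndarray:
        """Stand-in for a trained DL network (TensorFlow, PyTorch, or JAX in practice)."""
        return fields.mean(axis=1, keepdims=True)

    fields = np.empty((N_NODES, N_VARS), dtype=np.float64)

    for _ in range(N_STEPS):
        # Receive the solver's local flow state for this coupling step.
        comm.Recv(fields, source=SOLVER_RANK, tag=TAG_FIELDS)
        # Run inference (typically on GPU) and return the prediction to the solver.
        prediction = np.ascontiguousarray(model(fields), dtype=np.float64)
        comm.Send(prediction, dest=SOLVER_RANK, tag=TAG_PREDICTION)

In mode (i), the analogous exchange and the mesh-to-mesh interpolation would instead be delegated to the CWIPI library rather than handled with raw point-to-point messages.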