5.2.2. CMFD acceleration for k-eigenvalue problems

This tutorial demonstrates how to add coarse-mesh finite-difference (CMFD) acceleration to a multigroup k-eigenvalue transport solve. The runnable problem is intentionally small so the notebook finishes quickly, but the CMFD setup mirrors the workflow used by larger production inputs.

The focus is the CMFD accelerator object:

when to use CMFD,
how to construct CMFDAcceleration,
what the user-facing options mean,
which options to tune first,
how to interpret the CMFD status lines,
what to do when CMFD skips a correction or converges slowly.

The mesh, cross sections, quadrature, and partitioner are only setup details. They are included so the CMFD example is complete and executable.

5.2.2.1. Using this notebook

Before running this example, make sure that the Python module of OpenSn is installed.

To run this notebook from the terminal, simply type:

jupyter nbconvert –to python –execute .ipynb.

To run this notebook in parallel (for example, using 4 processes), simply type:

mpiexec -n 4 jupyter nbconvert –to python –execute .ipynb.

[ ]:

import math
import sys
from pathlib import Path

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.rank
size = comm.size

if rank == 0:
    print(f"Running with {size} MPI processes.")

5.2.2.2. Import OpenSn

When OpenSn is installed as a Python package, no path adjustment is needed. The small source-tree fallback below only helps when running the notebook directly from this documentation directory before installation. The mesh and cross-section files used by this tutorial are local files in this directory.

[ ]:

tutorial_dir = Path.cwd().resolve()

# Prefer a source-tree build when this notebook is executed from the documentation tree.
# This avoids accidentally importing a stale installed wheel during development.
for candidate in [tutorial_dir, *tutorial_dir.parents]:
    if (candidate / "build/pyopensn").exists():
        sys.path.insert(0, str(candidate / "build"))
        break

from pyopensn.mesh import FromFileMeshGenerator, KBAGraphPartitioner
from pyopensn.xs import MultiGroupXS
from pyopensn.aquad import GLCProductQuadrature2DXY
from pyopensn.solver import DiscreteOrdinatesProblem, PowerIterationKEigenSolver, CMFDAcceleration
from pyopensn.context import UseColor, Finalize

UseColor(False)

5.2.2.3. Fast tutorial assets

The example in this notebook is deliberately small. The tutorial case uses the pincell.obj mesh and mgxs_2B_one_eighth_SHEM-361.h5 cross-section file located next to this notebook. Keeping these assets with the tutorial makes the example self-contained while reducing the problem size enough for an interactive demonstration.

[ ]:

mesh_filepath = tutorial_dir / "pincell.obj"
xs_filepath = tutorial_dir / "mgxs_2B_one_eighth_SHEM-361.h5"

if not mesh_filepath.exists() or not xs_filepath.exists():
    raise RuntimeError(
        "This tutorial expects pincell.obj and mgxs_2B_one_eighth_SHEM-361.h5 "
        "in the same directory as cmfd.ipynb. Run the notebook from "
        "doc/source/tutorials/lbs/keigen or copy both files alongside it."
    )

if rank == 0:
    print(f"Mesh: {mesh_filepath}")
    print(f"Cross sections: {xs_filepath}")

5.2.2.4. Mesh partitioning setup

CMFD does not require a special partitioner. For this small structured lattice, we use a simple KBA-style geometric partitioner because it gives compact rank-local cell sets and predictable sweep behavior. That is enough for this tutorial.

The important CMFD point is that the default coarse mesh used below is coarse_mesh="local_aggregation": each rank aggregates its own fine cells into local CMFD coarse cells. Therefore, partition quality can affect CMFD quality indirectly, because the rank-local fine-cell chunks are the raw material for the coarse mesh. For production runs, use a partitioner that keeps cells spatially compact.

The next section explains what changes when the CMFD coarse cells are aggregated across MPI ranks instead of being kept rank-local.

5.2.2.5. Global cell aggregation across MPI ranks

The notebook uses rank-local spatial aggregation. That means each MPI rank builds CMFD coarse cells only from its own fine cells. This is simple, robust, and usually the right starting point, but it also means the coarse mesh can inherit the partition boundaries of the transport mesh.

Global cell aggregation means the spatial CMFD coarse cells are formed from the full mesh geometry, not from rank-local chunks alone. In other words, a single coarse cell may span fine cells owned by multiple MPI ranks. Conceptually, this makes the CMFD coarse mesh look more like a geometry-driven coarsening and less like a partition-driven one.

Why this matters:

Less partition sensitivity. With local aggregation, moving MPI cut lines can change the CMFD coarse mesh even when the physical problem is unchanged. Global aggregation reduces that effect.
Cleaner coarse cells. If a physically coherent region is split across ranks, global aggregation can preserve it as one coarse region instead of forcing separate rank-local pieces.
Potentially stronger CMFD correction. A coarse mesh that better follows the problem geometry can give a more consistent low-order problem, especially when local partitions are thin or fragmented.

The tradeoff is implementation and communication cost. Once a coarse cell spans multiple ranks, the CMFD data structures, current restrictions, and coarse solves all need more off-rank bookkeeping. That is why local aggregation is the standard production default: it is easier to build, easier to reason about, and often good enough when the transport partition is already spatially compact.

A practical rule of thumb is:

start with local aggregation,
use a partitioner that keeps fine-cell ownership spatially compact,
think about global aggregation only when partition boundaries are clearly degrading the CMFD coarse mesh or making performance/results overly partition-dependent.

So the distinction is not mathematical CMFD versus a different acceleration method. The difference is how the spatial coarse mesh is constructed in parallel: purely from rank-local fine-cell chunks, or from the global geometry across rank boundaries.

[ ]:

def make_spatial_partitioner(filename: Path) -> KBAGraphPartitioner:
    num_partitions = MPI.COMM_WORLD.size
    nx = math.isqrt(num_partitions)
    while num_partitions % nx != 0:
        nx -= 1
    ny = num_partitions // nx
    if nx < ny:
        nx, ny = ny, nx

    xmin = ymin = float("inf")
    xmax = ymax = -float("inf")
    with open(filename, encoding="utf-8") as mesh_file:
        for line in mesh_file:
            if not line.startswith("v "):
                continue
            _, x, y, *_ = line.split()
            x = float(x)
            y = float(y)
            xmin = min(xmin, x)
            xmax = max(xmax, x)
            ymin = min(ymin, y)
            ymax = max(ymax, y)

    xcuts = [xmin + (xmax - xmin) * i / nx for i in range(1, nx)]
    ycuts = [ymin + (ymax - ymin) * i / ny for i in range(1, ny)]
    return KBAGraphPartitioner(nx=nx, ny=ny, xcuts=xcuts, ycuts=ycuts)

meshgen = FromFileMeshGenerator(
    filename=str(mesh_filepath),
    partitioner=make_spatial_partitioner(mesh_filepath),
)

grid = meshgen.Execute()
grid.SetOrthogonalBoundaries()

5.2.2.6. Cross sections and material map

The small pin-lattice mesh has four material blocks. A large production problem will likely have more material regions, but the workflow is identical: load each material from the HDF5 cross-section file and map mesh block IDs to MultiGroupXS objects explicitly.

[ ]:

h5_mat_names = [
    "fuel",
    "fuel clad",
    "fuel gap",
    "moderator",
]

xs_dict = {}
xs_list = []
for name in h5_mat_names:
    xs = MultiGroupXS()
    xs.LoadFromOpenMC(str(xs_filepath), name, 294.0)
    xs_dict[name] = xs
    xs_list.append(xs)

num_groups = xs_list[0].num_groups
scattering_order = 3
xs_mapping = [{"block_ids": [block_id], "xs": xs} for block_id, xs in enumerate(xs_list)]

if rank == 0:
    print(f"Loaded {len(xs_list)} materials with {num_groups} energy groups.")

5.2.2.7. Angular quadrature and transport settings

The notebook uses a light quadrature so that the example runs quickly:

n_polar = 2
n_azimuthal = 4

For production calculations, use a more resolved product quadrature. The lower quadrature here is for workflow demonstration, not production-quality eigenvalues.

The groupset uses PETSc GMRES for the WGS update solve. In the CMFD workflow, the number of WGS updates between CMFD corrections is controlled by CMFDAcceleration(update_wgs_max_its=...).

[ ]:

n_polar = 2
n_azimuthal = 4

pquad = GLCProductQuadrature2DXY(
    n_polar=n_polar,
    n_azimuthal=n_azimuthal,
    scattering_order=scattering_order,
)

group_sets = [
    {
        "groups_from_to": [0, num_groups - 1],
        "angular_quadrature": pquad,
        "inner_linear_method": "petsc_gmres",
        "l_abs_tol": 1.0e-7,
        "l_max_its": 300,
    }
]

bound_conditions = [
    {"name": "xmin", "type": "reflecting"},
    {"name": "xmax", "type": "reflecting"},
    {"name": "ymin", "type": "reflecting"},
    {"name": "ymax", "type": "reflecting"},
]

5.2.2.8. Build the transport problem

The transport problem is the high-order solve that CMFD accelerates. CMFD does not replace this solve; it applies a low-order scalar-flux correction after each transport update.

For a CMFD run, the transport problem should normally be configured the same way you would configure the corresponding PI problem:

use the same mesh, materials, boundary conditions, scattering order, and quadrature,
use a stable WGS method such as PETSc GMRES for difficult multigroup problems,
set save_angular_flux=False unless another workflow needs angular fluxes,
keep outer iteration logging enabled while tuning CMFD.

In this notebook, verbose_outer_iterations=True is useful because it prints the PI status and the CMFD convergence check. verbose_inner_iterations=False keeps the tutorial output short.

[ ]:

phys = DiscreteOrdinatesProblem(
    mesh=grid,
    num_groups=num_groups,
    groupsets=group_sets,
    xs_map=xs_mapping,
    boundary_conditions=bound_conditions,
    options={
        "verbose_inner_iterations": False,
        "verbose_outer_iterations": True,
        "use_precursors": False,
        "power_default_kappa": 1.0,
        "save_angular_flux": False,
    },
)

5.2.2.9. Create the CMFD accelerator

The CMFD accelerator is created separately from the transport problem and then passed to PowerIterationKEigenSolver through the acceleration argument.

Conceptually, each power iteration does this:

perform a high-order transport update,
restrict the updated scalar flux and transport currents to a coarse mesh and coarse energy grid,
assemble and solve a low-order CMFD k-eigenvalue problem,
prolong a bounded scalar-flux correction back to the transport unknowns,
allow outer convergence only when both k change and the CMFD balance check are satisfied.

The next cell is the part of the notebook to copy into production inputs.

5.2.2.9.1. Options most users should tune

current_closure="auto": CMFD needs a face-current closure for the low-order operator. "auto" starts from net current and probes whether net, partial current, or a blend is more appropriate. This is the recommended starting point. Fixed "net" and "partial" are useful for sensitivity studies or if auto is poor for a specific problem.
aggregation_size: Target fine cells per spatial CMFD coarse cell. Larger values make the CMFD solve cheaper but less detailed. Smaller values make the correction more detailed but increase coarse-solve cost. For small tutorial problems, use a small value such as 8. For larger problems, 16 to 64 is a reasonable first scan.
group_aggregation_size: Number of fine energy groups per CMFD coarse energy group. This is not the final number of coarse groups. To target N coarse groups, use (num_groups + N - 1) // N. The notebook targets 16 coarse groups for speed; production CASL runs often use 16 to 32 coarse groups.
relaxation: Requested strength of the scalar-flux correction. 0.5 is conservative. Larger values can reduce iterations but may trigger damping or skipped corrections. Smaller values are safer but weaker.
update_wgs_max_its: Number of high-order WGS update iterations before each CMFD correction. The production CMFD workflow normally starts with 1. Increasing this can stabilize hard cases, but each power iteration becomes more expensive.
balance_residual_tolerance: Additional CMFD convergence check based on restricted transport-current balance. This prevents convergence based only on a small k change when the low-order balance is still inconsistent with transport. The notebook uses a loose value so it runs quickly. Production comparisons should tighten it deliberately, often starting near 10*k_tol for 2D problems and loosening only after verification.

5.2.2.9.2. Options usually left at defaults

coarse_mesh="local_aggregation": Rank-local spatial aggregation. This is the production mode.
coarse_solver_policy="direct": Direct coarse solve. This is robust for small or moderate coarse systems. For large 3D problems, "auto" or iterative PETSc options may be required to avoid excessive memory use.
update_wgs_abs_tol=1.0e-12: With update_wgs_max_its=1, this tolerance usually does not stop the WGS solve; the iteration limit does. It remains useful if you increase update_wgs_max_its.

[ ]:

k_tol = 1.0e-4
target_coarse_groups = 16

cmfd = CMFDAcceleration(
    problem=phys,
    current_closure="auto",
    coarse_mesh="local_aggregation",
    aggregation_size=8,
    group_aggregation_size=(num_groups + target_coarse_groups - 1) // target_coarse_groups,
    relaxation=0.5,
    update_wgs_max_its=1,
    update_wgs_abs_tol=1.0e-12,
    balance_residual_tolerance=1.0e-1,
    coarse_solver_policy="direct",
)

k_solver = PowerIterationKEigenSolver(
    problem=phys,
    acceleration=cmfd,
    k_tol=k_tol,
)

5.2.2.10. Execute the CMFD-accelerated solve

The PI output contains two pieces of convergence information:

PI iteration = ..., k_eff = ..., k_eff_change = ...
CMFD convergence check = ..., transport-current balance residual = ...

The solve is converged only when the outer k tolerance and the CMFD balance check are both satisfied.

With update_wgs_max_its=1, WGS = iteration_limit is expected. It does not mean the calculation failed; it means the high-order update intentionally performed one WGS iteration before the CMFD correction.

[ ]:

k_solver.Initialize()
k_solver.Execute()

[ ]:

keff = k_solver.GetEigenvalue()
if rank == 0:
    print(f"CMFD k-eigenvalue = {keff:.8f}")

5.2.2.11. CMFD diagnostics and contingencies

CMFD is designed to avoid applying corrections it does not trust. If a correction is skipped, the power iteration uses the unaccelerated transport update for that iteration and continues.

5.2.2.11.1. Coarse linear solve did not converge

Message:

correction = skipped (coarse_linear_solve_not_converged)

Meaning: an iterative low-order coarse solve failed to converge, so CMFD refused to use its correction.

What to do:

For modest coarse systems, use coarse_solver_policy="direct".
For large systems, increase max_iters or provide stronger petsc_options.
Reduce the coarse-system size by increasing aggregation_size or group_aggregation_size.
Watch whether skipped corrections persist. Occasional early skips are less concerning than repeated skips after the flux shape has stabilized.

5.2.2.11.2. Negative-flux guard rejected the correction

Message:

correction = skipped (negative_flux_guard)

Meaning: every damping attempt produced a scalar-flux update that violated the negative-flux guard.

What to do:

Reduce relaxation, for example from 0.5 to 0.25.
Try current_closure="net" if auto selected partial current too aggressively.
Increase update_wgs_max_its to make the transport update cleaner before CMFD is applied.
Consider a few inactive_iterations only as a diagnostic. Do not use inactive iterations to hide persistent instability.

5.2.2.11.3. k looks converged but CMFD keeps iterating

Meaning: k_eff_change is below k_tol, but the restricted transport-current balance residual is still above balance_residual_tolerance.

What to do:

First check that the balance residual is decreasing. If it is, CMFD is doing the right thing.
For a tutorial or exploratory run, a looser balance_residual_tolerance is acceptable.
For production k comparisons, tighten it and verify against PI or a trusted reference.
In large 3D cases, 10*k_tol may be too strict; choose a value that is verified for the application rather than blindly forcing the residual to a tiny number.

5.2.2.11.4. CMFD is slower than PI

Check these in order:

Is the coarse system too large? Increase aggregation_size or group_aggregation_size.
Are corrections being skipped? Fix the skip reason first.
Is update_wgs_max_its too large? Start with 1.
Is the closure poor? Compare auto, net, and partial on a short run.
Is the balance check much tighter than needed for the requested k accuracy?

5.2.2.12. Applying the same CMFD setup to production problems

For large production problems, the CMFD construction is the same. The main changes are the mesh, material list, and production quadrature. Treat aggregation_size, target coarse groups, relaxation, and balance_residual_tolerance as the first production tuning knobs. Leave developer/debug controls alone unless the log gives a concrete reason to change them.

5.2.2.13. Finalize Jupyter runs

In script mode PyOpenSn finalizes automatically. In an interactive Jupyter kernel, explicit finalization is needed. The event hook below follows the convention used by the other OpenSn tutorial notebooks.

[ ]:

MPI.COMM_WORLD.Barrier()

from IPython import get_ipython


def finalize_env():
    Finalize()
    MPI.Finalize()


ipython_instance = get_ipython()
if ipython_instance is not None:
    ipython_instance.events.register("post_execute", finalize_env)