Adding a variable resolution grid to CAM via CIME
Introduction
A recent version of the Community Atmosphere Model (CAM) added limited support for the Model for Prediction Across Scales (MPAS) atmospheric dynamical core. MPAS is a dynamical core with built-in support for a non-hydrostatic equation set, based on a C-grid discretization of a Spherical Centroidal Voronoi Tessellation (SCVT) horizontal mesh. As such, it is designed to support variable resolution grids when run in its own codebase. However, the CIME infrastructure that handles creating new cases and component sets necessarily limits the horizontal grid spacings that you can use.
This post is designed to show you how to add a new variable resolution MPAS grid to CIME. I have used this for aquaplanet runs, but I have not tried AMIP runs yet.
In order to run this you will regrettably need access to a working standalone version of the MPAS-A model, available here. I am also working on a containerized version here that is easy to run on a laptop, but it is currently a work in progress.
Step -1: install MPAS
For the time being, I have instructions on how I installed MPAS on the Great Lakes Cluster located at the University of Michigan.
If you have access to NCAR's Cheyenne, then your life is much easier.
In order to compile MPAS on Cheyenne as of 2022-04-11, you can use the following scripts:
setup.mpas.sh
module unload netcdf
module load intel/19.1.1 mpt/2.22
module load netcdf-mpi/4.7.4 pnetcdf/1.12.1 pio/2.5.2
and
build.mpas.sh
make ifort CORE=init_atmosphere PRECISION=single USE_PIO2=true
make clean CORE=atmosphere
make ifort CORE=atmosphere PRECISION=single USE_PIO2=true
- Before you proceed, run source ${SOURCE_TO_SCRIPT}/setup.mpas.sh.
- Download the MPAS code mentioned above.
- Change directories into the root of the repository.
- Run the commands listed in build.mpas.sh, in order (a consolidated sketch follows below).
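Putting the pieces together, here is a minimal sketch of the whole build sequence. The repository URL and the ${SOURCE_TO_SCRIPT} location are assumptions (the MPAS-Dev/MPAS-Model GitHub repository may differ from the exact version linked above), so adjust them to match your setup.

# Minimal build sketch; repository URL and paths are assumptions, adjust to your setup.
source ${SOURCE_TO_SCRIPT}/setup.mpas.sh                         # load compilers and I/O libraries
git clone https://github.com/MPAS-Dev/MPAS-Model.git             # or the MPAS code linked above
cd MPAS-Model
make ifort CORE=init_atmosphere PRECISION=single USE_PIO2=true
make clean CORE=atmosphere
make ifort CORE=atmosphere PRECISION=single USE_PIO2=true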
Step 0: generating the horizontal grid
Follow the instructions in my other post.
At this point I assume that you have a folder referred to by the variable ${GRID_DIR}, and that your grid has a unique prefix referred to by the variable ${GRID_PREFIX}. This prefix needs to uniquely identify your grid files so that they don't overlap with other MPAS files in ${GRID_DIR}.
In these instructions I will be using my particular use case as an example.
On Cheyenne, GRID_DIR="${HOME}/grids/${GRID_PREFIX}/" and GRID_PREFIX="x4.92067", where x4 refers to a 4x grid refinement in a band near the equator and the resulting grid has 92067 horizontal cells. I'm going to call this grid mpasa120-30 because on a full-radius grid it has a maximum grid spacing of 120km and a minimum grid spacing of 30km.
I make the assumption that you have generated the following files:
- ${GRID_DIR}/${GRID_PREFIX}.grid.nc, which contains a 2d representation of your MPAS grid connectivity.
- Several files ${GRID_DIR}/${GRID_PREFIX}.graph.info.part.${NPROC}, where ${NPROC} is each number of MPI processes with which you want to run an MPAS model. It should probably be a multiple of the number of physical cores in a node within your cluster (or threads if your system has [HT](https://en.wikipedia.org/wiki/Hyper-threading), but I've had mixed success with this).
- ${GRID_DIR}/${GRID_PREFIX}.esmf.coupling.nc, which will be needed by the coupler within CAM.
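Before moving on, a quick sanity check (just a suggestion; the values shown are my example from above) is to confirm that all three kinds of files are actually present:

export GRID_DIR="${HOME}/grids/x4.92067/"    # substitute your grid directory
export GRID_PREFIX="x4.92067"                # substitute your grid prefix
ls ${GRID_DIR}/${GRID_PREFIX}.grid.nc
ls ${GRID_DIR}/${GRID_PREFIX}.esmf.coupling.nc
ls ${GRID_DIR}/${GRID_PREFIX}.graph.info.part.*   # one file per processor count you intend to use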
Step 1: Generating a 3D grid with metric terms
Follow one of the MPAS tutorials to find an atmospheric configuration that matches the run you want to do (I'll explain what I mean in a moment).
(Todo: include a concrete example of what to do here.) Make sure to look in the namelist files for referenced grid files and change them to your custom generated mesh files. I would recommend symlinking them to your case directory with
ln -s ${GRID_DIR}/${GRID_PREFIX}.grid.nc ${MPAS_CASE_DIR}
ln -s ${GRID_DIR}/${GRID_PREFIX}.graph.info.part.${NPROC} ${MPAS_CASE_DIR}
Continue until you call mpirun -n ${NPROC} ./init_atmosphere_model, which generates a NetCDF file that contains a dimension called nVertLevels. I'm going to assume that this file is moved to ${GRID_DIR}/${GRID_PREFIX}.init.nc.
Vital considerations:
As of 2022-04-11 the CAM MPAS implementation does not understand how to generate 3d metric terms for MPAS. My understanding is that it knows how to correctly overwrite prognostic fields like edge wind flux components. However, because MPAS uses height as its vertical coordinate, generating terrain-following coordinates is rather involved. As such, any topography that you intend to include in your CAM run must already be in your ${GRID_DIR}/${GRID_PREFIX}.init.nc. Similarly, the number and placement of vertical levels are set when you create this file. Ensure they match the CAM vertical level configuration that you want to use.
Todo: expand section about how to specify different vertical configurations.
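A quick way to inspect the vertical configuration of the init file (a suggested sanity check, not part of the official workflow) is to dump its header and look at nVertLevels, which should be consistent with the number of levels you configure CAM to use:

# The nVertLevels dimension of the init file sets the vertical levels CAM will see.
ncdump -h ${GRID_DIR}/${GRID_PREFIX}.init.nc | grep -i nVertLevels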
Modifying CIME!
Modifications to ${CAM_ROOT}/bld/namelist_files/namelist_defaults_cam.xml
analytic initialization grid files
In this file there are lines reading something like
<ncdata hgrid="mpasa120" nlev="32" analytic_ic="1" phys="cam_dev" >atm/cam/inic/mpas/mpasa120_L32_topo_coords_c201022.nc</ncdata>
Here mpasa120 refers to a quasi-uniform grid with a nominal grid spacing of 120km.
These files are used in the case that you are starting from analytic initial conditions, such as the UMJS Moist Baroclinic Wave, which can be used to spin up aquaplanets. The purpose of these files is to provide vertical levels and associated metric terms for such a run.
I make use of the file that I generated using the standalone version of MPAS, and make the following definition
<ncdata hgrid="mpasa120-30" nlev="32" analytic_ic="1" >/glade/u/home/owhughes/grids/x4.92067/x4.92067.init.nc</ncdata>
problem: bnd_topo grid file
The definition
<bnd_topo hgrid="mpasa120" >atm/cam/topo/mpas/mpas_120_nc3000_Co060_Fi001_MulG_PF_Nsw042_c200921.nc</bnd_topo>
sets the location of a topography file. I don't know how to generate this file yet.
drydep_srf_file
The definition
<drydep_srf_file hgrid="mpasa120">atm/cam/chem/trop_mam/atmsrf_mpasa120_c090720.nc</drydep_srf_file>
defines land use fields that are required when doing an aquaplanet run for some reason.
For the sake of my current runs I create a dummy file which has the correct dimensions but all fields identically 0. Assuming that you have access to the xarray and numpy python packages (e.g. by installing miniconda and running conda install xarray), you can use the following python script:
create_srf.py
import xarray as xr
import numpy as np

ncol = 92067  # modify this to match your grid!
nclass = 11
nmonth = 12
fraction_landuse = np.zeros((nclass, ncol))
soilw = np.zeros((nmonth, ncol))
ds = xr.Dataset(
    {
        "fraction_landuse": (["class", "ncol"], fraction_landuse),
        "soilw": (["month", "ncol"], soilw),
    }
)
ds.to_netcdf(path="atmsrf_mpasa120-30.nc")
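Then generate the stub file by running the script once; the ncdump line is just an optional check that the dimensions came out as expected:

python create_srf.py
ncdump -h atmsrf_mpasa120-30.nc   # should show ncol, class, and month dimensions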
I then add the following definition to the xml file:
<drydep_srf_file hgrid="mpasa120-30">/glade/u/home/owhughes/grids/stub_data/atmsrf_mpasa120-30.nc</drydep_srf_file>
(note that I have put this in a separate directory from ${GRID_DIR}).
Default timestep and diffusion:
You will find a line similar to
<mpas_dt hgrid="mpasa120" > 900.0D0 </mpas_dt>
This sets the dynamics timestep for the MPAS dynamical core.
You will need to set ATM_NCPL using ./xmlchange so that the total physics timestep (i.e. 86400/ATM_NCPL seconds) is an integer multiple of mpas_dt. Otherwise CAM will throw a runtime error right at the beginning of the run.
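As a concrete illustration of the constraint (the numbers here are illustrative, not a recommendation): with mpas_dt = 200.0, setting ATM_NCPL=144 gives a physics timestep of 86400/144 = 600 seconds, which is 3 x mpas_dt and therefore fine, whereas ATM_NCPL=288 gives 300 seconds, which is not an integer multiple of 200 and will trigger the error.

./xmlchange ATM_NCPL=144   # physics timestep = 86400/144 = 600 s = 3 * 200 s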
This time step is chosen to obey numerical stability conditions (namely the CFL condition). For practical purposes, 900.0 seconds is sufficient for a 120km grid. When you halve the horizontal grid spacing, you must also halve the time step. Accordingly, for a grid with a minimum grid spacing of 30km, a timestep of 200 seconds is sufficient. I thus add the line
<mpas_dt hgrid="mpasa120-30" > 200.0D0 </mpas_dt>
MPAS's default diffusion scheme requires only one parameter, which is set via a line like
<mpas_len_disp hgrid="mpasa120">120000.0D0</mpas_len_disp>
This value is set to the minimum grid spacing in meters, and thus we add
<mpas_len_disp hgrid="mpasa120-30">30000.0D0</mpas_len_disp>
Block decomposition files:
You will find a line similar to
<mpas_block_decomp_file_prefix hgrid="mpasa120">atm/cam/inic/mpas/mpasa120.graph.info.part.</mpas_block_decomp_file_prefix>
Here we leverage the processor decompositions we created when we generated our grid (mind you, if you want to change the number of processors you're using for your run, you must ensure that you have a suitable file in your ${GRID_DIR} directory). I thus add the line:
<mpas_block_decomp_file_prefix hgrid="mpasa120-30">glade/u/home/owhughes/grids/x4.92067/x4.92067.graph.info.part.</mpas_block_decomp_file_prefix>
Modifications to ${CAM_ROOT}/cime/config/cesm/config_grids.xml
Note that we are now modifying a different xml file than in the previous section.
Defining a grid alias
In this file you will find a snippet of text similar to
<model_grid alias="mpasa120_mpasa120" not_compset="_POP"><grid name="atm">mpasa120</grid> <grid name="lnd">mpasa120</grid> <grid name="ocnice">mpasa120</grid> <mask>gx1v7</mask> </model_grid>
This specifies the grid that each model component runs on. So far I have only run aquaplanet simulations, which are fairly robust to grid changes. The following definitions work for running aquaplanet configurations but may break if you try to run AMIP (Todo: test whether this is true). I add the following lines for my custom grid:
<model_grid alias="mpasa120-30_mpasa120-30" not_compset="_POP"><grid name="atm">mpasa120-30</grid> <grid name="lnd">mpasa120-30</grid> <grid name="ocnice">mpasa120-30</grid> <mask>gx1v7</mask> </model_grid>
This will allow us to do something like the following when we're creating a new case:
${SRC_DIR}/cime/scripts/create_newcase --compset "${COMPSET}" --res "mpasa120-30_mpasa120-30" --case ${CASE_DIR}/${CASE_NAME} --run-unsupported --project ${PROJECT} --pecount ${NPROCS}
which uses the alias we set in the xml snippet above.
Creating a grid specification
We find a snippet of text like
<domain name="mpasa120"><nx>40962</nx> <ny>1</ny> <mesh driver="nuopc">$DIN_LOC_ROOT/share/meshes/mpasa120z32_ESMFmesh_cdf5_c20210120.nc</mesh> <desc>MPAS-A 120-km quasi-uniform mesh:</desc> </domain>
The mesh specification is what will force us to use the ${GRID_PREFIX}.esmf.coupling.nc file that we created above. Because MPAS is an unstructured grid, the only dimension you need to specify is the nx tag, which contains the number of columns. In my case I add the following text:
<domain name="mpasa120-30"><nx>92067</nx> <ny>1</ny> <mesh driver="nuopc">/glade/u/home/owhughes/grids/x4.92067/x4.92067.esmf.coupling.nc</mesh> <desc>MPAS-A 120km-30km variable resolution grid:</desc> </domain>
Step 4: Creating and running a case
The following process was used to create, build, and run an aquaplanet configuration in CAM on 2022/04/11. I use the defensive bash scripting style to create my case in a repeatable way.
create_case.sh
export CESMDATAROOT=$SCRATCH/CESM_INPUT
readonly CESM_PREFIX="CAM_Jan22_var_res"
readonly COMPSET="2000_CAM40_SLND_SICE_DOCN%AQP1_SROF_SGLC_SWAV"
readonly GRID_NAME="mpasa120-30_mpasa120-30"
readonly GROUP_NAME="589_final_project"
readonly CASE_ID="AQP_spinup"
readonly PROJECT="UMIC0087"
readonly NPROCS="288"
# =================
readonly CASE_DIR="${HOME}/CESM_CASE_DIRS/${CESM_PREFIX}_CASES/${CESM_PREFIX}/${GROUP_NAME}"
readonly SRC_DIR="${HOME}/CESM_SRC_DIRS/${CESM_PREFIX}"
readonly CASE_NAME="${CESM_PREFIX}.${GRID_NAME}.${COMPSET}.${GROUP_NAME}.${CASE_ID}"
readonly BUILD_ROOT="/glade/scratch/${USER}/$CASE_NAME"
# =================
change_cesm() {
  local cesm_prefix=$1
  rm "${HOME}/cesm_src"
  rm "${HOME}/cesm_cases"
  local src_dir="${HOME}/CESM_SRC_DIRS/$cesm_prefix/"
  if is_dir ${src_dir}; then
    ln -s ${src_dir} "${HOME}/cesm_src"
  else
    echo "${src_dir} does not exist!"
    exit
  fi
  local case_dir="${HOME}/CESM_CASE_DIRS/${cesm_prefix}_CASES/"
  if is_dir ${case_dir}; then
    ln -s ${case_dir} "${HOME}/cesm_cases"
  else
    echo "${case_dir} does not exist!"
    exit
  fi
}
handle_case_exists() {
  local abs_case_dir=$1
  cat << EOF
Case directory ${abs_case_dir} already exists!
Cowardly refusing to overwrite it :(
EOF
  exit 1
}
create_case() {
  is_dir "${CASE_DIR}/${CASE_NAME}" \
    && handle_case_exists "${CASE_DIR}/${CASE_NAME}"
  yes r | ${SRC_DIR}/cime/scripts/create_newcase --compset "${COMPSET}" --res ${GRID_NAME} --case ${CASE_DIR}/${CASE_NAME} --run-unsupported --project ${PROJECT} --pecount ${NPROCS}
}
create_xml_config_helper() {
  local config_script="$CASE_DIR/${CASE_NAME}/xml_config_helper.sh"
  cat << EOF > $config_script
# ------------------
readonly DEBUG=FALSE
# ------------------
# set this to true if you want an archive directory
readonly DOUT_S=FALSE
#-------------------
# valid options for this value are:
# nsteps, nseconds, nminutes, nhours, ndays, nmonths, nyears
readonly STOP_OPTION=nmonths
# ------------------
readonly STOP_N=24
readonly ATM_NCPL=288
# ------------------
# Set the frequency at which files for restarting the model (e.g. in the
# case of a blowup) are written. If TRUE, make sure to set REST_N
readonly REST_OPTION=nmonths
readonly REST_N=1
# ====================
# CAM build flag (note: this is quite finicky and causes compiler errors if done wrong)
readonly CAM_CONFIG_OPTS="--phys cam4 --aquaplanet --analytic_ic --nlev=30"
# --------------------
# Maximum runtime:
readonly HOURS=11
readonly MINUTES=50
readonly SECONDS=00
# --------------------
main() {
  ./xmlchange DEBUG=${DEBUG},DOUT_S=${DOUT_S},STOP_OPTION=${STOP_OPTION},STOP_N=${STOP_N},REST_OPTION=${REST_OPTION},REST_N=${REST_N},ATM_NCPL=${ATM_NCPL} \
    && ./xmlchange --file env_build.xml --id CAM_CONFIG_OPTS --val "${CAM_CONFIG_OPTS}" \
    && ./xmlquery CAM_CONFIG_OPTS \
    && ./xmlchange JOB_WALLCLOCK_TIME=${HOURS}:${MINUTES}:${SECONDS} \
    && ./case.setup
}
main
EOF
}
create_user_nl_cam() {
  local config_script="$CASE_DIR/${CASE_NAME}/user_nl_cam"
  cat << EOF > $config_script
ncdata = '/glade/u/home/owhughes/grids/x4.92067/x4.92067.init.deep.nc'
mpas_block_decomp_file_prefix = "/glade/u/home/owhughes/grids/x4.92067/x4.92067.graph.info.part."
drydep_srf_file="/glade/u/home/owhughes/grids/stub_data/atmsrf_mpasa120-30.nc"
mpas_dt = 100.0
omega = 0.00014584
rearth = 3185500.0
mpas_len_disp = 15000
analytic_ic_type = 'moist_baroclinic_wave_dcmip2016'
empty_htapes = .TRUE.
NDENS = 2,2
fincl1 = 'SST:A','PHIS:A','PS:A','T:A','U:A','V:A','OMEGA:A','Q:A','ATMEINT:A','CLDICE:A','CLDLIQ:A','CLOUD:A','CLDTOT:A','PRECL:A','PRECC:A','PRECT:A','TMQ:A','TMCLDICE:A','TMCLDLIQ:A','SHFLX:A','LHFLX:A','QFLX:A','RELHUM:A','FLUT:A','U200:A','V200:A'
fincl2 = 'PS:I','SHFLX:I','LHFLX:I','FLUT:I','TMQ:I','CLDTOT:I','PRECC:I','PRECT:I','U200:I','V200:I'
MFILT = 280, 721
NHTFRQ = -720, -24
inithist = 'ENDOFRUN'
EOF
}
create_preview_namelists_helper() {
  local config_script="$CASE_DIR/${CASE_NAME}/preview_namelists_helper.sh"
  cat << EOF > $config_script
./preview_namelists > preview_namelists.log 2> preview_namelists.err
EOF
}
create_case_build_helper() {
  local config_script="$CASE_DIR/${CASE_NAME}/case_build_helper.sh"
  echo "qcmd -A ${PROJECT} -- ${CASE_DIR}/${CASE_NAME}/case.build" > $config_script
}
main() {
  change_cesm ${CESM_PREFIX}
  create_case \
    && create_xml_config_helper \
    && create_user_nl_cam \
    && create_preview_namelists_helper \
    && create_case_build_helper
}
# ============================================================================
# begin namelist
# ============================================================================
# ===========================================================================
# ============================================================================
# begin boilerplate
# ============================================================================
readonly PROGNAME=$(basename $0)
readonly PROGDIR=$(readlink -m $(dirname $0))
readonly ARGS="$@"
is_empty() {
  local var=$1
  [[ -z $var ]]
}

is_not_empty() {
  local var=$1
  [[ -n $var ]]
}

is_file() {
  local file=$1
  [[ -f $file ]]
}

is_link() {
  local var=$1
  [[ -L $var ]]
}

is_dir() {
  local dir=$1
  [[ -d $dir ]]
}
# =================================================================
# end boilerplate
# =================================================================
main
After running this script, navigate to the case directory and run the following in order:
bash xml_config_helper.sh
bash case_build_helper.sh
./case.submit
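Once the job is submitted, you can keep an eye on it with something like the following. The run-directory path is an assumption based on the scratch layout used above; adjust it to wherever your run directory actually lives:

qstat -u ${USER}                                            # check the job status in the queue
tail -f /glade/scratch/${USER}/${CASE_NAME}/run/atm.log.*   # follow the CAM log as it runs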
Regridding output to a lat-lon grid:
Download the following repository and build the convert_mpas tool.
The tl;dr of how to use this tool is
${ABSOLUTE_PATH_TO_CONVERT_MPAS}/convert_mpas ${GRID_DIR}/${GRID_PREFIX}.init.nc ${ABSOLUTE_PATH_TO_NETCDF}/output_on_mpas_grid.nc
The first file specifies all of the metric terms and topography that were used for your run. The second file is the output data.
After this command is finished it will have created a file called latlon.nc, which you can view with your NetCDF viewer of choice.
If you run this command multiple times in the same directory, make sure to run rm latlon.nc (or some similar command) in between. Even if the files have the same structure, calling the command again will silently overwrite latlon.nc if it exists. If the variables or dimensions of the file you're trying to regrid change, it will do something very strange: I think it only regrids the variables that exist in the existing latlon.nc file, along the dimensions that exist.