Handle OCO-2 data in satellite transformation
Summary
I implemented sampling satellite observations with the wrfchem plugin and used OCO-2 as sample data. I had to adapt the 'satellites' transformation to work with OCO-2. Since this is not wrf-specific, I put the modifications in the new branch "feature-oco2" for merging them into master. This issue is for reviewing, commenting and testing. Specifically, the changes should be tested with another model than WRF and in adjoint mode (I only tested the forward mode). Below I describe the major modifications and an example configuration for testing.
Modifications
New formula in apply_AK.py
The formula for OCO-2 is:
y0 = (pwf*qa0).sum(axis=0) + (pwf*aks*(sim_ak - qa0)).sum(axis=0)
Here, all variables are already used in other formulas except pwf
, which is the pressure weighting function. It is similar to dpavgs
, but is normalized to sum to 1.0. I implemented this by including the pressure weighting function in the observation files and, in forward.py, assigning it to dpavgs
if it is present in the datastore. I think this is better than making a new variable, because their meaning is very similar, and I didn't want to clutter apply_AK.py with an argument that is only used by one formula.
Of course this changes the value range of dpavgs
, from pressures to fractions. As far as I can tell, this doesn't break anything: In apply_AK.py, dpavgs
is only used in formula 1 and normalizing shouldn't have an impact there. Apart from that, dpavgs
is only used in the computation of scale_pthick
in forward.py. I must say I don't quite understand the logic behind scale_pthick, but I did a test and it was close to 1.0 with both the original dpavgs
and pwf
values, and the correlation between the two versions was 0.9. The correlation is not exactly 1.0 because the relative contributions of the surface and toa level to dpavgs
were different in my example. But the way it looks, I think scale_pthick
works with these different values as well.
Nonetheless, if this application of dpavgs
is not favored, I could also just implement pwf
as a new variable and pass it to the functions in apply_AK.py as a new argument.
Handle profiles defined on pressure boundaries, not pressure midpoints
The CIF code so far assumes that retrieval profiles (prior, averaging kernel) are defined on pressure layers (or pressure layer midpoints). Pressures are given on the boundaries, so the pressure axis (pavg0
) has one more level than prior profile (qa0
) and averaging kernel (aks
). However, the OCO-2 retrieval works differently. Here, the kernel and prior profile are defined on the pressure boundaries (see O'Dell et al. 2012). So here, pavg0
, qa0
and aks
all have the same number of levels.
I modified the satellites' forward.py and adjoint.py to distinguish and handle the two different cases of the retrieval grid. The distinction happens by the vertical dimension of the averaging kernel in the observation file, which was so far always level
, but can now also be level_pressure
, the vertical dimension of pavg0
.
Of course the case when the retrieval lives on the pressure boundaries requires that the pressure weighting function is provided in the observation file, because the normal dpavgs
is constructed such that it is defined on layers, not layer boundaries. If the pressure weighting function is missing in this case, a NotImplementedError
is raised.
To keep the code legible, I replaced variables coords0
and coords1
with coords_pb
(Pressure Boundaries) and coords_ret
(RETrieval) in forward.py and adjoint.py. Furthermore, I renamed pavgs
to pavgs_pb
and pavgs_mid
to pavgs_ret
.
Example
Here is a sample OCO-2 monitor file. The corresponding sample section in the yml is:
datavect:
plugin:
name: standard
version: std
components:
satellites:
parameters:
CO2:
dir: ~/run_cif/
file: 'monitor_oco2_testv01.nc'
# Mandatory arguments:
formula: 6
molmass: nan
# Optional arguments:
# chosenlev: 0
# correct_pthick: false
# cropstrato: false
# extend_surf: false
# fill_strato: false
nchunks: 1
pressure: hPa