Handle OCO-2 data in satellite transformation

Summary

I implemented sampling satellite observations with the wrfchem plugin and used OCO-2 as sample data. I had to adapt the 'satellites' transformation to work with OCO-2. Since this is not wrf-specific, I put the modifications in the new branch "feature-oco2" for merging them into master. This issue is for reviewing, commenting and testing. Specifically, the changes should be tested with another model than WRF and in adjoint mode (I only tested the forward mode). Below I describe the major modifications and an example configuration for testing.

Modifications

New formula in apply_AK.py

The formula for OCO-2 is:

y0 = (pwf*qa0).sum(axis=0) + (pwf*aks*(sim_ak - qa0)).sum(axis=0)

Here, all variables are already used in other formulas except pwf, which is the pressure weighting function. It is similar to dpavgs, but is normalized to sum to 1.0. I implemented this by including the pressure weighting function in the observation files and, in forward.py, assigning it to dpavgs if it is present in the datastore. I think this is better than making a new variable, because their meaning is very similar, and I didn't want to clutter apply_AK.py with an argument that is only used by one formula.

Of course this changes the value range of dpavgs, from pressures to fractions. As far as I can tell, this doesn't break anything: In apply_AK.py, dpavgs is only used in formula 1 and normalizing shouldn't have an impact there. Apart from that, dpavgs is only used in the computation of scale_pthick in forward.py. I must say I don't quite understand the logic behind scale_pthick, but I did a test and it was close to 1.0 with both the original dpavgs and pwf values, and the correlation between the two versions was 0.9. The correlation is not exactly 1.0 because the relative contributions of the surface and toa level to dpavgs were different in my example. But the way it looks, I think scale_pthick works with these different values as well.

Nonetheless, if this application of dpavgs is not favored, I could also just implement pwf as a new variable and pass it to the functions in apply_AK.py as a new argument.

Handle profiles defined on pressure boundaries, not pressure midpoints

The CIF code so far assumes that retrieval profiles (prior, averaging kernel) are defined on pressure layers (or pressure layer midpoints). Pressures are given on the boundaries, so the pressure axis (pavg0) has one more level than prior profile (qa0) and averaging kernel (aks). However, the OCO-2 retrieval works differently. Here, the kernel and prior profile are defined on the pressure boundaries (see O'Dell et al. 2012). So here, pavg0, qa0 and aks all have the same number of levels.

I modified the satellites' forward.py and adjoint.py to distinguish and handle the two different cases of the retrieval grid. The distinction happens by the vertical dimension of the averaging kernel in the observation file, which was so far always level, but can now also be level_pressure, the vertical dimension of pavg0.

Of course the case when the retrieval lives on the pressure boundaries requires that the pressure weighting function is provided in the observation file, because the normal dpavgs is constructed such that it is defined on layers, not layer boundaries. If the pressure weighting function is missing in this case, a NotImplementedError is raised. To keep the code legible, I replaced variables coords0 and coords1 with coords_pb (Pressure Boundaries) and coords_ret (RETrieval) in forward.py and adjoint.py. Furthermore, I renamed pavgsto pavgs_pb and pavgs_mid to pavgs_ret.

Example

Here is a sample OCO-2 monitor file. The corresponding sample section in the yml is:

datavect:
  plugin:
    name: standard
    version: std
  components:
    satellites:
      parameters:
        CO2:
          dir: ~/run_cif/
          file: 'monitor_oco2_testv01.nc'

          # Mandatory arguments:
          formula: 6
          molmass: nan

          # Optional arguments:
          # chosenlev: 0
          # correct_pthick: false
          # cropstrato: false
          # extend_surf: false                                                                                                                                                                                                                                          
          # fill_strato: false                                                                                                                                                                                                                                          
          nchunks: 1
          pressure: hPa

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information