# Handle OCO-2 data in satellite transformation

## Summary

I implemented sampling satellite observations with the wrfchem plugin and used OCO-2 as sample data. I had to adapt the 'satellites' transformation to work with OCO-2. Since this is not wrf-specific, I put the modifications in the new branch "feature-oco2" for merging them into master. This issue is for reviewing, commenting and testing. Specifically, the changes should be tested with another model than WRF and in adjoint mode (I only tested the forward mode). Below I describe the major modifications and an example configuration for testing.

## Modifications

### New formula in apply_AK.py

The formula for OCO-2 is:

`y0 = (pwf*qa0).sum(axis=0) + (pwf*aks*(sim_ak - qa0)).sum(axis=0)`

Here, all variables are already used in other formulas except `pwf`

, which is the pressure weighting function. It is similar to `dpavgs`

, but is normalized to sum to 1.0. I implemented this by including the pressure weighting function in the observation files and, in forward.py, assigning it to `dpavgs`

if it is present in the datastore. I think this is better than making a new variable, because their meaning is very similar, and I didn't want to clutter apply_AK.py with an argument that is only used by one formula.

Of course this changes the value range of `dpavgs`

, from pressures to fractions. As far as I can tell, this doesn't break anything: In apply_AK.py, `dpavgs`

is only used in formula 1 and normalizing shouldn't have an impact there. Apart from that, `dpavgs`

is only used in the computation of `scale_pthick`

in forward.py. I must say I don't quite understand the logic behind scale_pthick, but I did a test and it was close to 1.0 with both the original `dpavgs`

and `pwf`

values, and the correlation between the two versions was 0.9. The correlation is not exactly 1.0 because the relative contributions of the surface and toa level to `dpavgs`

were different in my example. But the way it looks, I think `scale_pthick`

works with these different values as well.

Nonetheless, if this application of `dpavgs`

is not favored, I could also just implement `pwf`

as a new variable and pass it to the functions in apply_AK.py as a new argument.

### Handle profiles defined on pressure boundaries, not pressure midpoints

The CIF code so far assumes that retrieval profiles (prior, averaging kernel) are defined on pressure layers (or pressure layer midpoints). Pressures are given on the boundaries, so the pressure axis (`pavg0`

) has one more level than prior profile (`qa0`

) and averaging kernel (`aks`

). However, the OCO-2 retrieval works differently. Here, the kernel and prior profile are defined on the pressure boundaries (see O'Dell et al. 2012). So here, `pavg0`

, `qa0`

and `aks`

all have the same number of levels.

I modified the satellites' forward.py and adjoint.py to distinguish and handle the two different cases of the retrieval grid. The distinction happens by the vertical dimension of the averaging kernel in the observation file, which was so far always `level`

, but can now also be `level_pressure`

, the vertical dimension of `pavg0`

.

Of course the case when the retrieval lives on the pressure boundaries requires that the pressure weighting function is provided in the observation file, because the normal `dpavgs`

is constructed such that it is defined on layers, not layer boundaries. If the pressure weighting function is missing in this case, a `NotImplementedError`

is raised.
To keep the code legible, I replaced variables `coords0`

and `coords1`

with `coords_pb`

(Pressure Boundaries) and `coords_ret`

(RETrieval) in forward.py and adjoint.py. Furthermore, I renamed `pavgs`

to `pavgs_pb`

and `pavgs_mid`

to `pavgs_ret`

.

## Example

Here is a sample OCO-2 monitor file. The corresponding sample section in the yml is:

```
datavect:
plugin:
name: standard
version: std
components:
satellites:
parameters:
CO2:
dir: ~/run_cif/
file: 'monitor_oco2_testv01.nc'
# Mandatory arguments:
formula: 6
molmass: nan
# Optional arguments:
# chosenlev: 0
# correct_pthick: false
# cropstrato: false
# extend_surf: false
# fill_strato: false
nchunks: 1
pressure: hPa
```