a handful of root variable/*.json entries have a long_name that's specific to one PFT tile or one daily-extreme operation, even though the term itself is the generic root. branded variants then fail the wcrp_cmip7 ATTR004 long_name registry check on real files even when the file's long_name is correct for its branding.
example, current state in variable/cveg.json:
{
"id": "cveg",
"long_name": "Carbon Mass in Vegetation on Grass Tiles",
"description": "..."
}
while variable/cveggrass.json / cvegshrub.json / cvegtree.json all have "long_name": null and the per-tile text sitting in description. so a CMIP7 file with branded variable cveg_tavg-u-hxy-shb (Shrub) and long_name = "Carbon Mass in Vegetation on Shrub Tiles" (the correct text for its tile) fails the registry check because the registry says cveg.long_name = "Carbon Mass in Vegetation on Grass Tiles".
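a minimal sketch of the mismatch, assuming the check boils down to a string comparison against a registry keyed by root term. `REGISTRY` and `check_long_name` are hypothetical names used for illustration only, not the plugin's or esgvoc's actual API:

```python
# Hypothetical in-memory stand-in for the registry, keyed by root term.
REGISTRY = {
    "cveg": {"long_name": "Carbon Mass in Vegetation on Grass Tiles"},
}


def check_long_name(variable_id, file_long_name, registry=REGISTRY):
    """Return True when the file's long_name equals the registry entry for the root term."""
    expected = registry[variable_id.lower()]["long_name"]
    return file_long_name == expected


# Shrub-tile file: long_name is correct for its branding, yet the check fails.
shrub_ok = check_long_name("cveg", "Carbon Mass in Vegetation on Shrub Tiles")
# Grass-tile file: passes only because the root entry happens to carry the grass text.
grass_ok = check_long_name("cveg", "Carbon Mass in Vegetation on Grass Tiles")
print(shrub_ok, grass_ok)
```

only the grass-tile file passes here, since the grass text is what the root entry happens to carry.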
same shape affects several other root terms - all carry a long_name that belongs to a specific variant:
| root | current long_name |
|---|---|
| cveg | "Carbon Mass in Vegetation on Grass Tiles" |
| hurs | "Daily Minimum Near-Surface Relative Humidity over Crop Tile" |
| tas | "Daily Minimum Near-Surface Air Temperature" |
| mrsol | "Mean soil water content at a depth of 1 m" |
| gpp | "Carbon Mass Flux out of Atmosphere Due to Gross Primary Production on Land [kgC m-2 s-1]" |
| npp | "Net Primary Production on Grass Tiles as Carbon Mass Flux [kgC m-2 s-1]" |
| ra | "Autotrophic Respiration on Shrub Tiles as Carbon Mass Flux [kgC m-2 s-1]" |
| rh | "Heterotrophic Respiration on Shrub Tiles as Carbon Mass Flux [kgC m-2 s-1]" |
proposed fix on the registry side:
- root term long_name becomes the generic form (e.g. cveg.long_name = "Carbon Mass in Vegetation"), or is set to null like the other generic descriptors.
- tile/operation variants (cveggrass, cvegshrub, cvegtree, hursmin, hursmax, tasmin, tasmax, etc.) get the specific long_name populated - usually the text already sitting in their description field.
- for cases where there's no separate variant term (e.g. gpp is itself the root), generalize the long_name so it doesn't carry a tile token.
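the first two bullets could be sketched roughly as below. this is an illustration on in-memory dicts, not a migration script for the real repo: `split_long_names` and `generic_long_names` are hypothetical names, the shrub description text is a placeholder, and copying description into long_name assumes the description really holds the intended long_name (each entry would still need a manual check).

```python
import copy


def split_long_names(terms, generic_long_names):
    """Return a copy of `terms` with root long_names generalized and variants populated.

    terms: dict of term id -> dict with "long_name" and "description" keys.
    generic_long_names: dict of root id -> generic long_name (or None).
    """
    fixed = copy.deepcopy(terms)
    for term_id, entry in fixed.items():
        if term_id in generic_long_names:
            # Root term: replace the variant-specific text with the generic form.
            entry["long_name"] = generic_long_names[term_id]
        elif entry.get("long_name") is None and entry.get("description"):
            # Variant term: ASSUMPTION - description holds the intended long_name.
            entry["long_name"] = entry["description"]
    return fixed


# Illustrative entries only; the variant description here is placeholder text.
terms = {
    "cveg": {"long_name": "Carbon Mass in Vegetation on Grass Tiles", "description": "..."},
    "cvegshrub": {"long_name": None, "description": "Carbon Mass in Vegetation on Shrub Tiles"},
}
fixed = split_long_names(terms, {"cveg": "Carbon Mass in Vegetation"})
print(fixed["cveg"]["long_name"])
print(fixed["cvegshrub"]["long_name"])
```

the input dict is left untouched, so the before/after state can be diffed before anything is written back to variable/*.json.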
related plugin issue: cc-plugin-wcrp's variable long_name lookup uses variable_id.lower() (the root term) rather than the branded-variant id, so even after the registry is split per variant, the plugin will need to look up the variant first. happy to file a sister PR upstream once this side is settled.
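a rough sketch of what "variant first" could look like on the plugin side, assuming the registry has already been split per variant. `lookup_long_name` is a hypothetical helper, not cc-plugin-wcrp code, and how the variant id gets resolved from the branded variable is deliberately left to the caller:

```python
def lookup_long_name(registry, root_id, variant_id=None):
    """Prefer the variant's long_name; fall back to the root term's.

    registry: dict of term id -> dict with a "long_name" key (hypothetical shape).
    Returns None when neither term carries a long_name.
    """
    for term_id in (variant_id, root_id):
        if term_id is None:
            continue
        entry = registry.get(term_id.lower())
        if entry and entry.get("long_name") is not None:
            return entry["long_name"]
    return None


# Illustrative post-fix registry state (values are examples, not real registry content).
registry = {
    "cveg": {"long_name": "Carbon Mass in Vegetation"},
    "cvegshrub": {"long_name": "Carbon Mass in Vegetation on Shrub Tiles"},
    "cvegtree": {"long_name": None},
}
print(lookup_long_name(registry, "cveg", "cvegshrub"))
print(lookup_long_name(registry, "cveg", "cvegtree"))
```

falling back to the root keeps files passing for variants whose long_name hasn't been populated yet, at the cost of a less specific expected value.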
observed against esgvoc cmip7@1.2.6 / WCRP-universe (registry-source DB).