Commit daa4bfe

Merge pull request #51 from nasa/DAS-2287-earthdata-varinfo-var-dim-shape
Das 2287 earthdata varinfo var dim shape
2 parents aecfcf3 + 07f2e98 commit daa4bfe

12 files changed, with 5864 additions and 79 deletions.

CHANGELOG.md

+15
@@ -1,3 +1,18 @@
+## v3.0.2
+### 2025-02-07
+
+This version of `earthdata-varinfo` enables support for `VariableFromDmr`
+variable dimension shape data in NetCDF-4 files with named dimensions
+and HDF-5 files with anonymous size-only dimensions.
+
+### Added:
+* `VariableFromDmr::_get_shape()` returns dimension shape data for NetCDF-4
+files with named dimensions and HDF-5 files with anonymous size-only dimensions.
+
+### Changed:
+
+*Update DRM `unittest` to validate variable dimension shape data.
+
 ## v3.0.1
 ### 2024-10-18
 
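
The changelog entry above distinguishes two ways a DMR can describe a variable's dimensions: by reference to a named `<Dimension>` element, or by an anonymous `<Dim>` that carries only a size. The following is a minimal sketch of that distinction, not the code added in this commit. It assumes the DAP4 XML namespace and element layout; the sample document, its sizes, and the helper names `collect_dimension_sizes` and `get_shape` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# DAP4 namespace prefix, as used in ElementTree's "{namespace}tag" syntax.
DAP4 = '{http://xml.opendap.org/ns/DAP/4.0#}'

# Hypothetical DMR fragment: one variable with named dimensions and one
# variable with anonymous, size-only dimensions. Sizes are made up.
SAMPLE_DMR = """<Dataset xmlns="http://xml.opendap.org/ns/DAP/4.0#" name="sample">
  <Group name="Grid">
    <Dimension name="time" size="1"/>
    <Dimension name="lon" size="3600"/>
    <Dimension name="lat" size="1800"/>
    <Float32 name="precipitationCal">
      <Dim name="/Grid/time"/>
      <Dim name="/Grid/lon"/>
      <Dim name="/Grid/lat"/>
    </Float32>
    <Float32 name="anonymous_variable">
      <Dim size="4"/>
      <Dim size="2"/>
    </Float32>
  </Group>
</Dataset>"""


def collect_dimension_sizes(element, path=''):
    """Map fully qualified <Dimension> names to sizes, recursing into groups."""
    sizes = {}
    for child in element:
        if child.tag == f'{DAP4}Dimension':
            sizes[f"{path}/{child.get('name')}"] = int(child.get('size'))
        elif child.tag == f'{DAP4}Group':
            group_path = f"{path}/{child.get('name')}"
            sizes.update(collect_dimension_sizes(child, group_path))
    return sizes


def get_shape(variable, dimension_sizes):
    """Return a tuple of sizes, one per <Dim> child of the variable element."""
    shape = []
    for dim in variable.findall(f'{DAP4}Dim'):
        if dim.get('name') is not None:
            # Named dimension: look up the size on the matching <Dimension>.
            shape.append(dimension_sizes[dim.get('name')])
        else:
            # Anonymous dimension: the size is stored on the <Dim> itself.
            shape.append(int(dim.get('size')))
    return tuple(shape)


root = ET.fromstring(SAMPLE_DMR)
dimension_sizes = collect_dimension_sizes(root)
grid = root.find(f'{DAP4}Group')
shapes = {
    variable.get('name'): get_shape(variable, dimension_sizes)
    for variable in grid
    if variable.find(f'{DAP4}Dim') is not None
}
print(shapes)
```

The lookup table is built once per document so that each named `<Dim>` resolves with a single dictionary access; the real `VariableFromDmr::_get_shape()` may organise this differently.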

VERSION

+1-1
@@ -1 +1 @@
-3.0.1
+3.0.2

config/1.0.0/sample_config_1.0.0.json

+1-1
@@ -292,7 +292,7 @@
 {
 "Applicability": {
 "Mission": "ICESat2",
-"ShortNamePath": "ATL0[3-9]|ATL1[023]",
+"ShortNamePath": "ATL03",
 "VariablePattern": "/gt[123][lr]/geolocation/.*"
 },
 "Attributes": [

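The `ShortNamePath` change above narrows the set of collections the sample rule applies to. As a rough illustration only, the sketch below compares the old and new patterns against a few ICESat-2 short names. It assumes the pattern is applied with `re.match`; the library's actual matching call is not shown in this diff, and the `matching` helper and the list of short names are hypothetical.

```python
import re

OLD_PATTERN = 'ATL0[3-9]|ATL1[023]'
NEW_PATTERN = 'ATL03'

# Illustrative short names; not an exhaustive list of ICESat-2 collections.
short_names = ['ATL03', 'ATL08', 'ATL10', 'ATL11', 'ATL13']


def matching(pattern, names):
    """Return the names that the pattern matches from the start of the string."""
    return [name for name in names if re.match(pattern, name)]


old_matches = matching(OLD_PATTERN, short_names)
new_matches = matching(NEW_PATTERN, short_names)

print('Old pattern matches:', old_matches)
print('New pattern matches:', new_matches)
```

Under that assumption, the old pattern covers ATL03 through ATL09 plus ATL10, ATL12 and ATL13, while the new one applies the rule to ATL03 only.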
docs/earthdata-varinfo.ipynb

+139-9
@@ -92,6 +92,7 @@
 "Granules used:\n",
 "\n",
 "* [GPM_3IMERGHH](https://cmr.uat.earthdata.nasa.gov/search/concepts/G1256265181-EEDTEST.umm_json)\n",
+" * [OPeNDAP DMR](https://opendap.uat.earthdata.nasa.gov/collections/C1245618475-EEDTEST/granules/GPM_3IMERGHH.06:3B-HHR.MS.MRG.3IMERG.20200201-S233000-E235959.1410.V06B.HDF5.dmr.xml)\n",
 "* [GEDI L4A](https://cmr.uat.earthdata.nasa.gov/search/concepts/G1245557637-EEDTEST.umm_json)\n",
 "\n",
 "The granules linked to above will not be circulated with this notebook, but can be downloaded via the `GET DATA` URLs in the UMM-G records."
@@ -106,10 +107,11 @@
 "source": [
 "import json\n",
 "\n",
-"from varinfo import VarInfoFromNetCDF4\n",
+"from varinfo import VarInfoFromNetCDF4, VarInfoFromDmr\n",
 "\n",
 "# Update the following paths to where you have downloaded the data using the links above:\n",
 "gpm_granule_path = '/path/to/locally/saved/file/3B-HHR.MS.MRG.3IMERG.20200201-S233000-E235959.1410.V06B.HDF5'\n",
+"gpm_dmr_granule_path = '/path/to/locally/saved/file/GPM_3IMERGHH.06_3B-HHR.MS.MRG.3IMERG.20200201-S233000-E235959.1410.V06B.HDF5.dmr.xml'\n",
 "gedi_l4a_granule_path = '/path/to/locally/saved/file/GEDI04_A_2021216232727_O14984_01_T04304_02_002_01_V002.h5'"
 ]
 },
@@ -118,7 +120,7 @@
 "id": "c1ef3f8a",
 "metadata": {},
 "source": [
-"### Instantiate VarInfoFromNetCDF4 for GPM_3IMERGHH collection:"
+"### Instantiate VarInfoFromNetCDF4 and VarInfoFromDmr for GPM_3IMERGHH collection:"
 ]
 },
 {
@@ -128,7 +130,11 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"gpm_imerg = VarInfoFromNetCDF4(gpm_granule_path, short_name='GPM_3IMERGHH')"
+"# Instantiate VarInfoFromNetCDF4\n",
+"gpm_imerg = VarInfoFromNetCDF4(gpm_granule_path, short_name='GPM_3IMERGHH')\n",
+"\n",
+"# Instantiate VarInfoFromDmr\n",
+"gpm_imerg_dmr = VarInfoFromDmr(gpm_dmr_granule_path, short_name='GPM_3IMERGHH')"
 ]
 },
 {
@@ -146,9 +152,21 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get all variables from VarInfoFromNetCDF4\n",
 "gpm_imerg.get_all_variables()"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "640c4fdf",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get all variables from VarInfoFromDmr\n",
+"gpm_imerg_dmr.get_all_variables()"
+]
+},
 {
 "cell_type": "markdown",
 "id": "0f91829a",
@@ -164,14 +182,33 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get variables from VarInfoFromNetCDF4\n",
 "calibrated_precipitation = gpm_imerg.get_variable('/Grid/precipitationCal')\n",
-"print('Variable attributes:')\n",
+"print('Variable attributes from NetCDF4:')\n",
 "print(calibrated_precipitation.attributes)\n",
 "\n",
-"print('\\n\\nVariable references:')\n",
+"# Get references from VarInfoFromNetCDF4\n",
+"print('\\n\\nVariable references from NetCDF4:')\n",
 "print(calibrated_precipitation.get_references())"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "ee2d7dbc",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get variables from VarInfoFromDmr\n",
+"calibrated_precipitation_dmr = gpm_imerg_dmr.get_variable('/Grid/precipitationCal')\n",
+"print('Variable attributes from DMR:')\n",
+"print(calibrated_precipitation_dmr.attributes)\n",
+"\n",
+"# Get references from VarInfoFromDmr\n",
+"print('\\n\\nVariable references from DMR:')\n",
+"print(calibrated_precipitation_dmr.get_references())"
+]
+},
 {
 "cell_type": "markdown",
 "id": "0a69e2ab",
@@ -189,13 +226,29 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get required variables from VarInfoFromNetCDF4\n",
 "gpm_imerg.get_required_variables(\n",
 " {\n",
 " '/Grid/precipitationCal',\n",
 " }\n",
 ")"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "3c5310cb",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get required variables from VarInfoFromDmr\n",
+"gpm_imerg_dmr.get_required_variables(\n",
+" {\n",
+" '/Grid/precipitationCal',\n",
+" }\n",
+")"
+]
+},
 {
 "cell_type": "markdown",
 "id": "61c8e746",
@@ -231,7 +284,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print('Spatial dimensions for /Grid/precipitationCal')\n",
+"# Spatial dimensions for /Grid/precipitationCal from VarInfoFromNetCDF4\n",
+"print('Spatial dimensions for /Grid/precipitationCal from VarInfoFromNetCDF4')\n",
 "print(\n",
 " gpm_imerg.get_spatial_dimensions(\n",
 " {\n",
@@ -240,7 +294,9 @@
 " )\n",
 ")\n",
 "\n",
-"print('\\nTemporal dimensions for /Grid/precipationCal')\n",
+"# Temporal dimensions for /Grid/precipationCal from VarInfoFromNetCDF4\n",
+"\n",
+"print('\\nTemporal dimensions for /Grid/precipationCal from VarInfoFromNetCDF4')\n",
 "print(\n",
 " gpm_imerg.get_temporal_dimensions(\n",
 " {\n",
@@ -250,6 +306,34 @@
 ")"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "3afe54bb",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Spatial dimensions for /Grid/precipitationCal from VarInfoFromDmr'\n",
+"print('Spatial dimensions for /Grid/precipitationCal from VarInfoFromDmr')\n",
+"print(\n",
+" gpm_imerg_dmr.get_spatial_dimensions(\n",
+" {\n",
+" '/Grid/precipitationCal',\n",
+" }\n",
+" )\n",
+")\n",
+"\n",
+"# Temporal dimensions for /Grid/precipationCal from VarInfoFromDmr\n",
+"print('\\nTemporal dimensions for /Grid/precipationCal from VarInfoFromDmr')\n",
+"print(\n",
+" gpm_imerg_dmr.get_temporal_dimensions(\n",
+" {\n",
+" '/Grid/precipitationCal',\n",
+" }\n",
+" )\n",
+")"
+]
+},
 {
 "cell_type": "markdown",
 "id": "b187e644",
@@ -273,9 +357,21 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get group variables by dimensions from VarInfoFromNetCDF4\n",
 "gpm_imerg.group_variables_by_dimensions()"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "9fe25a71",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get group variables by dimensions from VarInfoFromDmr\n",
+"gpm_imerg_dmr.group_variables_by_dimensions()"
+]
+},
 {
 "cell_type": "markdown",
 "id": "8299fc18",
@@ -308,6 +404,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get UMM-Var from VarInfoFromNetCDF4\n",
 "from varinfo import VarInfoFromNetCDF4\n",
 "from varinfo.umm_var import get_all_umm_var, get_umm_var\n",
 "\n",
@@ -320,6 +417,26 @@
 "print(json.dumps(single_umm_var, indent=2))"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "c4f2e607",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get UMM-Var from VarInfoFromDmr\n",
+"from varinfo import VarInfoFromDmr\n",
+"from varinfo.umm_var import get_umm_var\n",
+"\n",
+"var_info_dmr = VarInfoFromDmr(gpm_dmr_granule_path, short_name='GPM_3IMERGHH')\n",
+"\n",
+"# Get single UMM-Var record by variable name:\n",
+"precipitation_variable_dmr = var_info_dmr.get_variable('/Grid/precipitationCal')\n",
+"single_umm_var_dmr = get_umm_var(var_info_dmr, precipitation_variable_dmr)\n",
+"\n",
+"print(json.dumps(single_umm_var_dmr, indent=2))"
+]
+},
 {
 "cell_type": "markdown",
 "id": "6fc563c3",
@@ -337,11 +454,25 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Get all UMM-Var from VarInfoFromNetCDF4\n",
 "all_umm_var = get_all_umm_var(var_info)\n",
 "\n",
 "print(json.dumps(all_umm_var, indent=2))"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "9ab64897",
+"metadata": {},
+"outputs": [],
+"source": [
+"# Get all UMM-Var from VarInfoFromDmr\n",
+"all_umm_var_dmr = get_all_umm_var(var_info_dmr)\n",
+"\n",
+"print(json.dumps(all_umm_var_dmr, indent=2))"
+]
+},
 {
 "cell_type": "markdown",
 "id": "5cf8246b",
@@ -434,7 +565,6 @@
 "\n",
 "The list below refers to potential improvements that can be made within the core Python package to improve the schema coverage of the generated UMM-Var records:\n",
 "\n",
-"* Ensuring variables parsed from a DMR file also contain shape information, similar to current functionality of NetCDF-4 file parsing. This information is stored within a DMR in separate `<Dimension />` XML elements. This would allow UMM-Var JSON generated from DMR data to have sizes on their dimensions.\n",
 "* Adding suitable heuristics to the `VariableBase` class to identify vertical spatial dimensions. Currently these map to the dimension type of \"OTHER\".\n",
 "* Adding along- and across-track swath dimension identification heuristics to `VariableBase`.\n",
 "* Improving the metadata for projected horizontal spatial dimensions. These are currently mapped to a dimension type of \"OTHER\". While they can be identified within the `Variable` classes, there is not currently an applicable UMM-Var option in the `DimensionType.Type` enumeration:\n",
@@ -470,7 +600,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.11.9"
+"version": "3.11.11"
 }
 },
 "nbformat": 4,
