Add a tutorial to access NASA CMR STAC api using pystac & stackstac #102

Closed

srmsoumya

In this tutorial, we learn to

  1. Select an AOI using leafmap
  2. Pull STAC items for HLS data from the NASA CMR STAC API using pystac_client
  3. Use stackstac to create lazy xarray DataArrays, filter by cloud cover, and compute monthly mosaics
  4. Visualize the change over time
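
As a rough illustration of steps 2 and 3, here is a minimal sketch and not the tutorial's actual code. The endpoint URL, the collection ids, the asset names, the default EPSG code (UTM zone 36N) and the helper names `build_search_params` and `monthly_mosaics` are all assumptions made for this example. The third-party imports are kept inside the function so that the search parameters can be built and inspected without network access.

```python
from datetime import date

# Assumed values, not taken from the tutorial itself.
CMR_STAC_URL = "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
HLS_COLLECTIONS = ["HLSL30.v2.0", "HLSS30.v2.0"]


def build_search_params(bbox, start, end, max_items=100):
    """Build keyword arguments for a STAC search over an AOI and date range."""
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": HLS_COLLECTIONS,
        "bbox": [west, south, east, north],
        "datetime": f"{start.isoformat()}/{end.isoformat()}",
        "max_items": max_items,
    }


def monthly_mosaics(params, assets=("B04", "B03", "B02"), epsg=32636, max_cloud=20):
    """Search, stack lazily, drop cloudy scenes, and reduce to monthly medians."""
    # Imported here so the helper above works without these packages installed.
    import pystac_client
    import stackstac

    catalog = pystac_client.Client.open(CMR_STAC_URL)
    items = catalog.search(**params).item_collection()

    # Lazy, dask-backed DataArray with dimensions (time, band, y, x).
    stack = stackstac.stack(
        items,
        assets=list(assets),
        bounds_latlon=params["bbox"],
        epsg=epsg,
        resolution=30,
    )

    # Client-side filter; assumes the items carry an "eo:cloud_cover" property.
    low_cloud = stack[stack["eo:cloud_cover"] < max_cloud]

    # One median composite per calendar month; nothing is read until .compute().
    return low_cloud.resample(time="MS").median("time", keep_attrs=True)
```

Calling `monthly_mosaics(build_search_params(bbox, date(2021, 1, 1), date(2021, 12, 31)))` would return a lazy array; the data is only fetched when `.compute()` is called on it.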

@fnattino
Contributor

fnattino commented Mar 16, 2022

Hi @srmsoumya, I really like the content of this episode! I tried to run the code blocks because I wanted to check whether the NASA CMR STAC index could be an alternative to the EarthSearch endpoint used in the data-access episode. However, I get lots of errors like the following when calling .compute() at the very end of the episode:

RuntimeError: Error opening 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T36QUK.2019225T081947.v2.0/HLS.L30.T36QUK.2019225T081947.v2.0.B04.tif': RasterioIOError("'/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSL30.020/HLS.L30.T36QUK.2019225T081947.v2.0/HLS.L30.T36QUK.2019225T081947.v2.0.B04.tif' not recognized as a supported file format.")

I narrowed this down to rasterio not being able to open the remote assets:

import rasterio
f = rasterio.open("https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T36QUK.2021190T081611.v2.0/HLS.S30.T36QUK.2021190T081611.v2.0.B8A.tif")
---------------------------------------------------------------------------
CPLE_OpenFailedError                      Traceback (most recent call last)
File rasterio/_base.pyx:261, in rasterio._base.DatasetBase.__init__()

File rasterio/_shim.pyx:78, in rasterio._shim.open_dataset()

File rasterio/_err.pyx:216, in rasterio._err.exc_wrap_pointer()

CPLE_OpenFailedError: '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T36QUK.2021190T081611.v2.0/HLS.S30.T36QUK.2021190T081611.v2.0.B8A.tif' not recognized as a supported file format.

During handling of the above exception, another exception occurred:

RasterioIOError                           Traceback (most recent call last)
Input In [23], in <module>
      1 import rasterio
----> 2 f = rasterio.open("https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T36QUK.2021190T081611.v2.0/HLS.S30.T36QUK.2021190T081611.v2.0.B8A.tif")

File /opt/miniconda3/envs/geospatial/lib/python3.10/site-packages/rasterio/env.py:437, in ensure_env_with_credentials.<locals>.wrapper(*args, **kwds)
    434     session = DummySession()
    436 with env_ctor(session=session):
--> 437     return f(*args, **kwds)

File /opt/miniconda3/envs/geospatial/lib/python3.10/site-packages/rasterio/__init__.py:220, in open(fp, mode, driver, width, height, count, crs, transform, dtype, nodata, sharing, **kwargs)
    216 # Create dataset instances and pass the given env, which will
    217 # be taken over by the dataset's context manager if it is not
    218 # None.
    219 if mode == 'r':
--> 220     s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
    221 elif mode == "r+":
    222     s = get_writer_for_path(path, driver=driver)(
    223         path, mode, driver=driver, sharing=sharing, **kwargs
    224     )

File rasterio/_base.pyx:263, in rasterio._base.DatasetBase.__init__()

RasterioIOError: '/vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T36QUK.2021190T081611.v2.0/HLS.S30.T36QUK.2021190T081611.v2.0.B8A.tif' not recognized as a supported file format.

The error seems to suggest that the extension is not recognised (maybe because of the multiple dots in the file name?). Specifying the driver (driver="COG") does not help. Any clue what is going wrong here? I am working with rasterio version 1.2.10 and GDAL version 3.4.1.

@srmsoumya
Author

@fnattino Did you set up a ~/.netrc file to access NASA CMR STAC data?

You can sign up here and run this script to set things up.
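
For readers without access to the linked script, the sketch below shows what such a setup typically amounts to: a single `machine ... login ... password ...` entry in `~/.netrc`. It is not the script referred to above. It assumes the Earthdata Login host is `urs.earthdata.nasa.gov`; the helper names `write_netrc_entry` and `read_netrc_login` are hypothetical.

```python
import netrc
import os
from pathlib import Path

EARTHDATA_HOST = "urs.earthdata.nasa.gov"  # assumed Earthdata Login host


def write_netrc_entry(username, password, path=None):
    """Append an Earthdata Login entry to a netrc file (hypothetical helper)."""
    path = Path(path) if path is not None else Path.home() / ".netrc"

    # Start on a fresh line if the existing file does not end with a newline.
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    separator = "" if existing == "" or existing.endswith("\n") else "\n"

    entry = f"{separator}machine {EARTHDATA_HOST} login {username} password {password}\n"
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(entry)

    # Restrict the file to its owner, since it holds a plain-text password.
    os.chmod(path, 0o600)
    return path


def read_netrc_login(path=None):
    """Return the stored login name for the Earthdata host, or None."""
    auth = netrc.netrc(str(path) if path is not None else None)
    found = auth.authenticators(EARTHDATA_HOST)
    return found[0] if found else None
```

Calling `write_netrc_entry("<username>", "<password>")` without a `path` writes to `~/.netrc`; `read_netrc_login()` can then confirm that the entry parses.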

@fnattino
Contributor

Wonderful, seems to work indeed - thank you so much!
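
That missing credentials fixed it suggests the "not recognized as a supported file format" message was misleading. A likely explanation (an assumption, not confirmed in this thread) is that without credentials the server answers with a login page instead of the GeoTIFF, so GDAL receives bytes that are not a TIFF. The hypothetical helper below sketches how the first bytes of a response could be checked for the TIFF signature; how those bytes are fetched is left open.

```python
# Byte-order marks plus version number: classic TIFF (42) and BigTIFF (43).
TIFF_MAGIC = (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+")


def classify_payload(first_bytes):
    """Guess whether a payload is a TIFF, HTML/XML markup, or something else."""
    if first_bytes[:4] in TIFF_MAGIC:
        return "tiff"
    # Login and error pages usually start with "<", possibly after whitespace.
    if first_bytes.lstrip()[:1] == b"<":
        return "markup"
    return "unknown"
```

If the first bytes of the asset URL classify as "markup", authentication is a more plausible culprit than the file name or the driver.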

@rbavery
Collaborator

rbavery commented May 24, 2022

@srmsoumya do you recall where you found the instruction to set these configs?

import os
os.environ["GDAL_HTTP_COOKIEFILE"] = "./cookies.txt"
os.environ["GDAL_HTTP_COOKIEJAR"] = "./cookies.txt"

Love that you found it! This was needed for me to use the CMR STAC API.
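
For context, `GDAL_HTTP_COOKIEFILE` tells GDAL where to read cookies from and `GDAL_HTTP_COOKIEJAR` where to write them; pointing both at the same file lets a session cookie obtained during the login redirect be reused on later requests. The sketch below wraps the two settings in a hypothetical helper, `configure_gdal_cookies`, that also resolves the path to an absolute one. That is a design choice of this example, not something from the linked tutorial.

```python
import os
from pathlib import Path


def configure_gdal_cookies(cookie_path="./cookies.txt"):
    """Point GDAL's HTTP cookie file and cookie jar at the same file."""
    # Resolve to an absolute path so a later change of working directory
    # does not make GDAL look for the cookies somewhere else.
    resolved = str(Path(cookie_path).expanduser().resolve())
    os.environ["GDAL_HTTP_COOKIEFILE"] = resolved  # cookies are read from here
    os.environ["GDAL_HTTP_COOKIEJAR"] = resolved  # cookies are written here
    return resolved
```

Call it once before opening any remote asset with rasterio or stackstac, so the options are in place when GDAL makes its first HTTP request.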

@srmsoumya
Author

@rbavery I had my fair share of trouble trying to access NASA CMR STAC and tried multiple things.

I think I found this in one of the tutorials: https://nasa-openscapes.github.io/2021-Cloud-Workshop-AGU/how-tos/Earthdata_Cloud__Single_File__HTTPS_Access_COG_Example.html

@rbavery
Collaborator

rbavery commented Oct 20, 2022

I'll look to finish this in November: #82 (comment)

Given this issue, we may not want to use CMR STAC as an example anymore and instead switch to something else, maybe the Sentinel-2 data. I'm not sure when the cloud cover filtering will be fixed: nasa/cmr-stac#206

@rbavery
Collaborator

rbavery commented Aug 3, 2023

Closing since this is a bit stale and any material needs to be ported to the new lesson template in #158.

@rbavery rbavery closed this Aug 3, 2023