FastAPI application demo

⭐ For background, check out the first ~7 chapters of the FastAPI tutorial.

⭐ Read about pydantic model-based validation here

⭐ Read the # comments in app.py to see how the application is set up and how the pydantic models are used.

Setup 🔧

  • build a conda environment from environment.yml (this is just the snap-geo environment with fastapi added)
conda env create -f environment.yml
  • start the application in dev mode like so:
fastapi dev app.py

OpenAPI JSON schema 📖

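FastAPI automagically generates this from the code itself. By default, the schema is served at /openapi.json.

That schema in turn allows documentation to be automagically generated. By default it comes in two flavors, Swagger UI at /docs and ReDoc at /redoc (and we could also cook up our own).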

Demo of query validation ⚡

⚠️ Right now, queries just return messages to test if pydantic parameter validation worked. No data is fetched yet.
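To make the idea concrete, here is a minimal sketch of pydantic-based parameter validation that returns only a message, in the same spirit as the current endpoints. The model name `PointQuery`, its fields, the allowed values, and the helper `check_query` are all hypothetical and are not taken from app.py.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class PointQuery(BaseModel):
    """Hypothetical query parameters; names and ranges are illustrative only."""

    variable: Literal["tas", "pr"]              # assumed set of variable names
    lat: float = Field(ge=-90, le=90)           # latitude in degrees
    lon: float = Field(ge=-180, le=180)         # longitude in degrees
    start_year: int = Field(ge=1950, le=2100)   # assumed temporal extent


def check_query(**params) -> str:
    """Validate the parameters and return a message; no data is fetched."""
    try:
        query = PointQuery(**params)
    except ValidationError as err:
        # Collect the name of every field that failed validation.
        fields = sorted({str(e["loc"][0]) for e in err.errors()})
        return "Validation failed for: " + ", ".join(fields)
    return f"Validation passed for variable '{query.variable}'"


print(check_query(variable="tas", lat=64.8, lon=-147.7, start_year=2000))
print(check_query(variable="tas", lat=640, lon=-147.7, start_year=2000))
```

The first call passes and the second fails on `lat`, because 640 is outside the declared range.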

Good queries ✅

Bad queries ❌

Metadata Catalog Rant 📂

This app has no 1:1 relationship between endpoints and coverages! Requests focus on the variable(s), so we are dealing with one-to-many relationships where a variable may be represented in multiple coverages.

In order for us to direct these one-to-many requests towards our resources, we need some easily searchable database of our holdings. One way of accomplishing this is to build a metadata catalog.

The goal is to have a single, structured, authoritative record of the data that we want to expose via the API that can answer the question: "Is there any data available to fulfill this request?"

Ideally, the metadata catalog should be:

  • populated programmatically directly from our holdings (via Rasdaman GetCapabilities and DescribeCoverage requests?)
  • populated on-the-fly to immediately reflect changes in our holdings
  • structured to allow search of any validated request

This of course relies on there being rich metadata in the holdings themselves! Coverages may have to be re-ingested to improve metadata uniformity (i.e., use the same metadata schema for every coverage) and possibly data uniformity (e.g., use the same axis IDs and data types for time, variables, etc.).
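As a rough illustration of the one-to-many lookup, the sketch below stores one record per variable per coverage and answers "Is there any data available to fulfill this request?" by filtering those records. The record fields, coverage IDs, and year ranges are invented for the example; the real structure lives in catalog.py and may look quite different.

```python
# Hypothetical catalog: one record per (variable, coverage) pair.
HOLDINGS = [
    {"service_category": "climate", "variable": "tas",
     "coverage_id": "cov_a", "years": (1950, 2009)},
    {"service_category": "climate", "variable": "tas",
     "coverage_id": "cov_b", "years": (2006, 2100)},
    {"service_category": "climate", "variable": "pr",
     "coverage_id": "cov_b", "years": (2006, 2100)},
]


def find_coverages(service_category: str, variable: str, year: int) -> list:
    """Return the IDs of every coverage that can serve the request."""
    return [
        record["coverage_id"]
        for record in HOLDINGS
        if record["service_category"] == service_category
        and record["variable"] == variable
        and record["years"][0] <= year <= record["years"][1]
    ]


# One variable, two matching coverages: the one-to-many case.
print(find_coverages("climate", "tas", 2008))
# No coverage holds precipitation for 1990, so the list is empty.
print(find_coverages("climate", "pr", 1990))
```

An empty list means the request cannot be fulfilled, and a list with several entries is where the application would need a rule for choosing among coverages.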

The holy grail 🏆

Can we add / subtract / update data in our holdings without revising the application or documentation?

This demo uses a metadata catalog mockup (in catalog.py) where the highest levels of organization are the service_category and variable. Using those parameters, requests can be validated against the metadata catalog without hard-coding any constraints in app.py. In other words, the metadata catalog items can be updated and the valid parameter ranges adjusted to the datasets without touching app.py, so long as the catalog structure is static.

  • 🍪 Try it out! Copy/paste a variable record in the metadata catalog, and revise the variable name and data ranges. You should now be able to query for that variable and receive meaningful error messages without touching any of the code in the application.
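The copy/paste experiment can be sketched in a few lines. This is not the code from app.py or catalog.py: the nested dictionary, the `years` field, the variable names, and `validate_request` are assumptions made only to show that the validator reads every constraint from the catalog.

```python
import copy

# Hypothetical catalog keyed by service_category, then variable.
CATALOG = {
    "climate": {
        "tas": {"units": "degC", "years": (1950, 2100)},
    },
}


def validate_request(catalog, service_category, variable, year):
    """Return a list of error messages; an empty list means the request is valid."""
    if service_category not in catalog:
        return [f"Unknown service_category '{service_category}'. "
                f"Valid options: {sorted(catalog)}"]
    variables = catalog[service_category]
    if variable not in variables:
        return [f"Unknown variable '{variable}'. Valid options: {sorted(variables)}"]
    first, last = variables[variable]["years"]
    if not first <= year <= last:
        return [f"year {year} is outside the available range "
                f"{first}-{last} for '{variable}'"]
    return []


# "Copy/paste" a record, then revise its name and data range.
new_record = copy.deepcopy(CATALOG["climate"]["tas"])
new_record["years"] = (1980, 2020)
CATALOG["climate"]["swe"] = new_record

print(validate_request(CATALOG, "climate", "swe", 2000))  # valid: empty list
print(validate_request(CATALOG, "climate", "swe", 2050))  # out-of-range message
```

Nothing in `validate_request` names a variable or a range, so the new `swe` record is queryable as soon as it is in the catalog.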

This setup should dramatically reduce effort in bringing new resources online (or taking old ones offline), and reduce the overall number of endpoints in the API. In a way, the effort would be transferred to the maintenance of coverage metadata instead.

As for documentation, we can see how having the application translated into the OpenAPI JSON schema allows for automatic generation of API documentation pages. We could consider building our HTML documentation directly from the application's OpenAPI JSON schema in a similar way, which would also reduce effort when we update our holdings.
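One possible shape for "cooking up our own" documentation is sketched below: a function that walks an OpenAPI schema dictionary and emits a bare-bones HTML endpoint list. The function name `render_docs` and the example schema are made up for illustration. In a FastAPI application the dictionary could come from `app.openapi()` or from the served JSON schema.

```python
from html import escape

# Keys of an OpenAPI path item that represent operations.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def render_docs(schema: dict) -> str:
    """Render a minimal HTML endpoint list from an OpenAPI schema dict."""
    title = escape(schema.get("info", {}).get("title", "API"))
    lines = [f"<h1>{title}</h1>", "<ul>"]
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method, operation in operations.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            summary = escape(operation.get("summary", ""))
            lines.append(
                f"<li><code>{method.upper()} {escape(path)}</code> {summary}</li>"
            )
    lines.append("</ul>")
    return "\n".join(lines)


# Hypothetical schema fragment standing in for the application's real one.
EXAMPLE_SCHEMA = {
    "openapi": "3.1.0",
    "info": {"title": "Demo API", "version": "0.1.0"},
    "paths": {"/data/{variable}": {"get": {"summary": "Fetch data"}}},
}

print(render_docs(EXAMPLE_SCHEMA))
```

Because the page is derived from the schema, and the schema is derived from the code, an endpoint change would flow through to the documentation without manual edits.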