Merge pull request #477 from ACCESS-Hive/dev/sven/intake

shorter MED intake catalog description with showcase
ACCESS-NRI · Jul 28, 2023 · e3964bf
2 parents 7c1a3fe + 112f829
commit e3964bf
Showing 3 changed files with 33 additions and 45 deletions.
diff --git a/docs/assets/model_evaluation/accessnri_intake.png b/docs/assets/model_evaluation/accessnri_intake.png
diff --git a/docs/assets/model_evaluation/intake_example.png b/docs/assets/model_evaluation/intake_example.png
diff --git a/docs/model_evaluation/model_evaluation_model_catalogs/index.md b/docs/model_evaluation/model_evaluation_model_catalogs/index.md
@@ -1,59 +1,47 @@
 # ACCESS-NRI intake Model Catalog
 
-ACCESS-NRI is hosting a number of calculated models for you through National Computational Infrastructure (NCI) storage.
+The ACCESS-NRI intake catalog aims to provide a way for Python users to discover and load data across a broad range of climate data products available on the Australian NCI supercomputer Gadi. For detailed information, tutorials and more, please go to the
+<div class="card-container">
+    <a href="https://access-nri-intake-catalog.readthedocs.io/en/latest/index.html" class="aspect1to2-card default-text-color">
+        <div class="squared-card-image-container">
+            <img src="../../assets/model_evaluation/accessnri_intake.png" alt="ACCESS-NRI intake catalog documentation"></img>
+        </div>
+        <div class="squared-card-text-container bold">Documentation</div>
+    </a>
+</div>
 
-We have set up an [ACCESS-NRI intake Catalog](https://github.com/ACCESS-NRI/access-nri-intake-catalog) package that allows you to easily search and load the model data on this storage.
-The premise of this ACCESS-NRI intake Catalog is to provide a ("meta") catalog of intake-esm ("sub") catalogs, which each correspond to different "experiments".
+## What is the ACCESS-NRI intake Model Catalog?
 
-## The ACCESS-NRI intake catalog
+The ACCESS-NRI catalog is essentially a table of climate data products that exist on Gadi. Each entry in the table corresponds to a different product, and the columns contain attributes associated with each product–things like the models, frequencies and variables available. Users can search on the attributes to find the products that might be useful to them. For example, a user might want to know which data products contain variables X, Y and Z at monthly frequency. The ACCESS-NRI catalog enables users to find products that satisfy their query and to subsequently load their data without having to know the location and structure of the underlying files.
 
-To have the huge amount of data from different experiments on the NCI storage at the palm of your hand, we provide a ("meta") catalog for you to query via python as part of the `#!python intake` package with our curated catalog plugin `#!python intake.cat.access_nri` .
+## Showcase: use intake to easily find, load and plot data
 
-``` py
-import intake
-access_nri_catalog_sections = intake.cat.access_nri
-```
-
-To use this catalog, you need access to NCI's Gadi. Check out our [Get Started with ACCESS at NCI](../model_evaluation_getting_started/index.md)   guide on how to get access.
+In this showcase, we'll demonstrate one of the simplest use-cases of the ACCESS-NRI intake catalog: a user wants to plot a timeseries of a variable from a specific data product. Here, the variable is a scalar ocean variable called "temp_global_ave" and the product is an ACCESS-OM2 run called "025deg_jra55_iaf_omip2_cycle1".
 
-Once logged in to Gadi, you will need to add the `#!python access-nri-catalog` to your `#!python conda` environments and start an [ARE JupyterLab Session](https://are.nci.org.au/pun/sys/dashboard). Check out our [ACCESS-NRI Intake Catalog](https://github.com/ACCESS-NRI/access-nri-intake-catalog/blob/main/docs/getting_started/index.rst) guide  for the specific setup (note that you can only read in data from specific experiments if they are loaded through the *Storage* keyword).
+First we load the catalog using
 
-Once your JupyterLab session started, you can access the `#!python intake` catalog to load the data. Take a look at this [Tutorial](https://github.com/ACCESS-NRI/access-nri-intake-catalog/blob/main/docs/how_tos/example_usage.ipynb) .
-
-## Example Search with our intake catalog
-
-``` py
-# Import packages for searching/loading/plotting
+```python
 import intake
-from distributed import Client
-import matplotlib.pyplot as plt
-
-# The search process is a 2-step one
-# Comparable with searching for a book in a library:
-# 1) You look for the right book/catalog sections
-# 2) You look for the right book/catalog in these sections
-
-# Load the ACCESS-NRI list of catalogs for available experiment data
-# Similar to an overview of library section
-access_nri_catalog_sections = intake.cat.access_nri
+catalog = intake.cat.access_nri
+```
 
-# Perform a search for names, models, variables etc.
-example_section_search = access_nri_catalog_sections.search(name="cmip6_oi10")
+Now we can load and plot available datasets of the variable "temp_global_ave" from the product "025deg_jra55_iaf_omip2_cycle1" using
 
-# Once you are sufficiently happy with your search, you can load the "section"
-catalog_sections = access_nri_catalog_sections.search(name="025deg_jra55_iaf_omip2_cycle1").to_source()
-# and start looking for the right catalogs of interest
-catalogs_of_interest = catalog_sections.search(filename="ocean_scalar.*")
+```python
+import matplotlib.pyplot as plt
 
-# Call the client that allows us to load the data efficiently
-client = Client(threads_per_worker=1)
-client.dashboard_link
+dataset_dict = catalog["025deg_jra55_iaf_omip2_cycle1"].search(
+    variable="temp_global_ave"
+).to_dataset_dict()
 
-# Actually load the data
-experiment_data = catalogs_of_interest.to_dataset_dict(progressbar=False)
+# `dataset_dict` contains two xarray Datasets, one at daily frequency and one at monthly
+dataset_dict["ocean_scalar_snapshot.1day"]["temp_global_ave"].plot(label="daily")
+dataset_dict["ocean_scalar.1mon"]["temp_global_ave"].plot(label="monthly")
+plt.title("")
+plt.legend()
+plt.grid()
+```
 
-# Et voilà, you have loaded the data and can start plotting
-experiment_data["ocean_scalar_snapshot.1day"]["temp_global_ave"].plot(label="daily")
-experiment_data["ocean_scalar.1mon"]["temp_global_ave"].plot(label="monthly")
-_ = plt.legend()
-```
+<div style="text-align: center;">
+    <img src="../../assets/model_evaluation/intake_example.png" alt="Plot of timeseries of global average temperatures" width="50%"/>
+</div>
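
Note on the new "What is the ACCESS-NRI intake Model Catalog?" text: the "table of products searched on attributes" idea can be illustrated without access to Gadi. The following is only a conceptual sketch in plain Python, not the package's API. The product names, models, and the `find_products` helper are hypothetical, chosen purely to mirror the "variables X, Y and Z at monthly frequency" example in the added paragraph.

```python
# Hypothetical toy catalog: one row per data product, one key per attribute.
# None of these names or values are real catalog entries.
toy_catalog = [
    {"name": "product_a", "model": "MODEL-1",
     "frequency": {"1mon"}, "variable": {"X", "Y", "Z"}},
    {"name": "product_b", "model": "MODEL-1",
     "frequency": {"1day", "1mon"}, "variable": {"X", "Y"}},
    {"name": "product_c", "model": "MODEL-2",
     "frequency": {"1day"}, "variable": {"X", "Y", "Z"}},
]


def find_products(table, variables, frequency):
    """Return names of products that list ALL requested variables
    and also list the requested frequency (hypothetical helper)."""
    wanted = set(variables)
    return [
        row["name"]
        for row in table
        if wanted <= row["variable"] and frequency in row["frequency"]
    ]


# Which products contain variables X, Y and Z at monthly frequency?
matches = find_products(toy_catalog, ["X", "Y", "Z"], "1mon")
print(matches)
```

In this toy table only `product_a` satisfies the query: `product_b` lacks variable Z and `product_c` has no monthly output. The real catalog additionally resolves a matching product to its underlying files, which this sketch does not attempt.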