From 96f0080df6a1701d73fbca2bc7b6eb6b04d6dd40 Mon Sep 17 00:00:00 2001
From: BYERS Edward
Date: Thu, 22 Feb 2024 00:15:22 +0100
Subject: [PATCH 1/3] adding docs - WIP

---
 README.md                  | 19 ++++++++++-----
 doc/configuration.rst      | 24 +++++++++++++++++++
 doc/data_preprocessing.rst | 48 ++++++++++++++++++++++++++++++++++++++
 doc/processing_maps.rst    | 41 ++++++++++++++++++++++++++++++++
 doc/processing_tables.rst  | 29 +++++++++++++++++++++++
 pyproject.toml             |  1 +
 6 files changed, 156 insertions(+), 6 deletions(-)
 create mode 100644 doc/configuration.rst
 create mode 100644 doc/data_preprocessing.rst
 create mode 100644 doc/processing_maps.rst
 create mode 100644 doc/processing_tables.rst

diff --git a/README.md b/README.md
index 8e6cdd3..d2a119a 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # RIME - Rapid Impact Model Emulator
 
-2023 IIASA
+2024 IIASA
 
 [![latest](https://img.shields.io/github/last-commit/iiasa/CWatM)](https://github.com/iiasa/CWatM)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
@@ -15,6 +15,7 @@
 When accompanied by climate impacts data (table and/or maps), RIME can be used to take a global mean temperature timeseries (e.g. from an IAM or climate model like [FaIR](https://github.com/OMS-NetZero/FAIR)/[MAGICC](https://live.magicc.org/)), and return tables and maps of climate impacts through time consistent with the warming of the scenario.
 
+*** Key use cases ***
 There are two key use-cases for the RIME approach:
 1. **Post-process**: Estimating a suite of climate impacts from a global emissions or temperature scenario.
 2. **Input**: Reformulating climate impacts data to be used as an input to an integrated assessment model scenario.
@@ -49,10 +50,10 @@ Pre-processing of tabular impacts data of exposure by GWL, into netcdf datasets
 ### [`process_tabledata.py`](https://github.com/iiasa/rime/blob/main/rime/process_tabledata.py)
 Example script that takes an input table of emissions scenarios with global temperature timeseries, and outputs tables of climate impacts data in IAMC format. This can be done for multiple scenarios and indicators at a time.
 
-### [`process_maps.py`](https://github.com/iiasa/rime/blob/main/rime/process_tabledata.py)
+### [`process_maps.py`](https://github.com/iiasa/rime/blob/main/rime/process_maps.py)
 Example script that takes an input table of emissions scenarios with global temperature timeseries, and outputs maps of climate impacts through time as netCDF. The output netCDF can be specified either for 1 scenario and multiple climate impacts, or for multiple scenarios and 1 indicator.
 
-### [`pp_combined example.ipynb`](https://github.com/iiasa/rime/blob/main/rime/pp_combined_example.py)
+### [`pp_combined example.ipynb`](https://github.com/iiasa/rime/blob/main/rime/pp_combined_example.ipynb)
 Example Jupyter notebook that demonstrates methods of processing both table and map impacts data for IAM scenarios.
 
 ### [`test_map_notebook.html`](https://github.com/iiasa/rime/blob/main/rime/test_map_notebook.html)
@@ -61,15 +62,21 @@ Example html maps dashboard. CLick download in the top right corner and open loc
 
 ![image](https://github.com/iiasa/rime/assets/17701232/801e2dbe-cbe8-482f-be9b-1457c92ea23e)
 
-## Installation
+## Code and installation
 
 At command line, navigate to the directory where you want the installation, e.g. your Github folder.
 
     git clone https://github.com/iiasa/rime.git
 
-Change to the rime folder and install the package including the requirements.
+### Using a dedicated environment (Optional but recommended)
+
+Due to the dependencies, a dedicated Python environment is recommended when using RIME, to avoid conflicts during installation. Depending on your Python installation, this can be done using venv, pyenv, pipenv, (ana/mini)-conda or mamba.
+
+### Installation
+Activate the right environment, change to the rime folder (e.g. `cd c:/Github/rime`) and install the package including the requirements.
+
+    pip install .
 
-    pip install --editable .
 
 ## Further information
 This package is in a pre-release mode, currently work in progress, undergoing testing and not formally published.

diff --git a/doc/configuration.rst b/doc/configuration.rst
new file mode 100644
index 0000000..7f387e0
--- /dev/null
+++ b/doc/configuration.rst
@@ -0,0 +1,24 @@
+Configuring your RIME runs
+**************************
+
+
+process_config.py
+=================
+
+This file is designed to configure settings and working directories for the project. It acts as a central configuration module imported by the other scripts in the project, ensuring consistent settings.
+
+Key Features
+------------
+
+- **Central Configuration**: Stores and manages settings and directory paths that are used throughout the project.
+- **Easy Import**: Can be imported with ``from process_config import *``, making all configurations readily available in other scripts.
+
+Dependencies
+------------
+
+- ``os``: For interacting with the operating system's file system, used to manage file paths and directories.
+
+Usage
+-----
+
+This script is not meant to be run directly. Instead, it should be imported at the beginning of other project scripts to ensure they have access to shared configurations, settings, and directory paths.
+
diff --git a/doc/data_preprocessing.rst b/doc/data_preprocessing.rst
new file mode 100644
index 0000000..ef5eb29
--- /dev/null
+++ b/doc/data_preprocessing.rst
@@ -0,0 +1,48 @@
+Pre-processing input table data
+*********************
+
+To work with table data, some pre-processing is likely required to achieve the correct formats.
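For illustration, the kind of conversion this pre-processing performs — pivoting an IAMC-style long table onto the dimensions used in the emulation — can be sketched with pandas and xarray. The data, column names and values here are purely illustrative, not the RIME API:

```python
import pandas as pd
import xarray as xr

# Purely illustrative long-format impacts table: one row per combination of
# warming level (gmt), corresponding year, SSP and region.
df = pd.DataFrame({
    "gmt":    [1.5, 2.0, 1.5, 2.0],
    "year":   [2030, 2050, 2030, 2050],
    "ssp":    ["SSP2", "SSP2", "SSP2", "SSP2"],
    "region": ["AFR", "AFR", "EUR", "EUR"],
    "value":  [0.12, 0.34, 0.05, 0.21],
})

# Pivot the table onto the four dimensions used in the emulation; missing
# gmt/year combinations become NaN in the resulting 4-D cube.
ds = df.set_index(["gmt", "year", "ssp", "region"]).to_xarray()
```

The resulting dataset would then be written out once with something like ``ds.to_netcdf("impacts.nc")`` (filename illustrative; requires a netCDF backend such as netcdf4).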
+
+The aim is to go from typically tabular or database data, into a compressed 4-D netCDF format that is used in the emulation. For a given climate impacts dataset, this pre-processing only needs to be done once for preparation, and only if working with table data. 
+
+The output netCDF has the dimensions:
+    "gmt": for the global mean temperature / warming levels, at which impacts are calculated. (float)
+    "year": for the year to which the gmt corresponds, if relevant, for example relating to exposure of a population of land cover in year x.
+    "ssp": for the Shared Socioeconomic Pathway, SSP1, SSP2, SSP3, SSP4, SSP5. (str)
+    "region": for the spatial region for the impact relates and might be aggregated to, e.g. country, river basin, region. (str)
+
+
+Thus, the input data table should also have these dimensions, normally as columns, and additionally one for `variable`.
+
+[example picture of IAMC input file]
+
+The script `generate_aggregated_inputs.py` gives an example of this workflow, using a climate impacts dataset in table form (IAMC-wide), and converting it into a netCDF. In this case the data also has the `model` and `scenario` columns, which are not needed in the output dataset.
+
+generate_aggregated_inputs.py
+=============================
+
+
+
+Key Features
+------------
+
+- **Data Aggregation**: Combines data from multiple files or data streams.
+- **File Operations**: Uses the ``glob`` and ``os`` modules to manage file paths and directories.
+- **Data Processing**: Uses ``xarray`` for multi-dimensional arrays and ``pyam`` for IAMC-format scenario data.
+
+Dependencies
+------------
+
+- ``alive_progress``: For displaying progress bars in the terminal.
+- ``glob``: For file path pattern matching.
+- ``os``: For interacting with the operating system's file system.
+- ``pyam``: For analysis and visualization of integrated assessment models.
+- ``re``: For regular expression matching, indicating text processing.
+- ``xarray``: For working with labeled multi-dimensional arrays.
+- ``time``: For timing operations, possibly used in performance measurement.
+
+Usage
+-----
+
+While specific usage instructions are not provided, it's likely that the script reads from specified input files or directories, processes the data, and outputs aggregated results. Usage may require customization based on the specific data format and desired output.
+
diff --git a/doc/processing_maps.rst b/doc/processing_maps.rst
new file mode 100644
index 0000000..1cefcc0
--- /dev/null
+++ b/doc/processing_maps.rst
@@ -0,0 +1,41 @@
+Example script that takes input table of emissions scenarios with global temperature timeseries, and output maps of climate impacts through time as netCDF. Ouptut netCDF can be specified for either for 1 scenario and multiple climate impacts, or multiple scenarios for 1 indicator.
+
+This example script takes an input table of emissions scenarios along with global temperature time series and generates maps of climate impacts over time as NetCDF files. It exemplifies the application of the RIME framework to spatially resolved climate impact data, facilitating the visualization and analysis of geographic patterns in climate impacts.
+
+
+process_maps.py
+===============
+
+This script is likely involved in processing geographical data, given its name suggests map-related functionalities. It may involve operations related to spatial data and possibly climate or environmental data analysis.
+
+Key Features
+------------
+
+- **Geographical Data Processing**: Implied by the name, it might handle operations on map data, such as transforming, analyzing, or visualizing geographical information.
+- **Data Handling**: The script might deal with large datasets, considering the use of ``dask``, which is known for parallel computing and efficient data processing.
+
+Dependencies
+------------
+
+- ``dask``: For parallel computing in Python, needed for handling large datasets and efficient computation.
+
+Usage
+-----
+
+
+
+
+process_maps.py
+===============
+
+This example script takes an input table of emissions scenarios along with global temperature time series and generates maps of climate impacts over time as NetCDF files. It exemplifies the application of the RIME framework to spatially resolved climate impact data, facilitating the visualization and analysis of geographic patterns in climate impacts.
+
+Overview
+--------
+
+The script's flexibility allows for the specification of outputs either for a single scenario across multiple climate impacts or for multiple scenarios focused on a single indicator. This adaptability makes it a valuable tool for in-depth climate impact studies that require spatial analysis and visualization.
+
+Usage
+-----
+
+By processing emissions scenarios and associated temperature projections, ``process_maps.py`` produces NetCDF files that map climate impacts over time. These outputs are instrumental in visualizing the geographic distribution and evolution of climate impacts, aiding in the interpretation and communication of complex climate data.
diff --git a/doc/processing_tables.rst b/doc/processing_tables.rst
new file mode 100644
index 0000000..35abbb2
--- /dev/null
+++ b/doc/processing_tables.rst
@@ -0,0 +1,29 @@
+Example script that takes an input table of emissions scenarios with global temperature timeseries, and outputs tables of climate impacts data in IAMC format. This can be done for multiple scenarios and indicators at a time.
+
+
+
+
+
+process_tabledata.py
+====================
+
+This script processes table data, using Dask to handle potentially large datasets. It reads, processes, and aggregates or summarizes the table data.
+
+Key Features
+------------
+
+- **Table Data Processing**: Focuses on operations related to table data, including reading, manipulation, and analysis.
+- **Parallel Computing**: Utilizes Dask for efficient handling of large datasets.
+
+Dependencies
+------------
+
+- ``dask``: For parallel computing, particularly with ``dask.dataframe``, which is similar to pandas but with parallel computing capabilities.
+- ``dask.diagnostics``: For performance diagnostics and progress bars, providing tools for profiling and resource management during computation.
+- ``dask.distributed``: For distributed computing, allowing the script to scale across multiple nodes if necessary.
+
+Usage
+-----
+
+The script is structured to be executed directly with a ``__main__`` block. It imports configurations from ``process_config.py`` and functions from ``rime_functions.py``, so it integrates closely with the other components of the project. Users may need to customize the script to fit their specific data formats and processing requirements.
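Conceptually, the core lookup such a script performs — interpolating a pre-processed impacts table at each year's warming level in a scenario — can be sketched with plain ``numpy``/``pandas``. Values and names here are illustrative only, not the ``rime_functions`` API:

```python
import numpy as np
import pandas as pd

# Illustrative pre-processed impacts for one indicator and region:
# impact values tabulated at a set of global warming levels (GWLs).
gwls = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
impact_at_gwl = np.array([0.0, 0.8, 2.1, 4.0, 6.5])

# Illustrative scenario: global mean temperature trajectory by year.
gmt = pd.Series([1.2, 1.6, 2.2, 2.4], index=[2030, 2050, 2070, 2090])

# Linearly interpolate the impacts table at each year's warming level,
# giving an impacts timeseries consistent with the scenario's pathway.
impacts = pd.Series(np.interp(gmt.values, gwls, impact_at_gwl),
                    index=gmt.index, name="impact")
```

In the actual script, a step like this would run per scenario and indicator, with Dask used to parallelize across the many scenario/indicator combinations.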
+
diff --git a/pyproject.toml b/pyproject.toml
index 6309cee..153fa2b 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -11,6 +11,7 @@ documentation = "https://github.com/iiasa/rime"
 version = "0.1.0"
 license = "GNU CPL v3"
 readme = "README.md"
+keywords = []
 
 [tool.poetry.dependencies]
 python = ">=3.10, <3.11"

From 93f296c8eafe3a2724692118daddfb648b8680ef Mon Sep 17 00:00:00 2001
From: BYERS Edward
Date: Sat, 24 Feb 2024 20:59:38 +0100
Subject: [PATCH 2/3] edit readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d2a119a..203ec1d 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ When accompanied by climate impacts data (table and/or maps), RIME can be used t
 *** Key use cases ***
 There are two key use-cases for the RIME approach:
 1. **Post-process**: Estimating a suite of climate impacts from a global emissions or temperature scenario.
-2. **Input**: Reformulating climate impacts data to be used as an input to an integrated assessment model scenario.
+2. **Input**: Reformulating climate impacts data to be used as an input to an integrated assessment model scenario. First, the scenario is run without climate impacts, to determine the emissions and global warming trajectory. Then, RIME can be used to generate climate impact-adjusted input variables for the IAM scenario.
 
 ![RIME_use_cases](https://github.com/iiasa/rime/blob/main/assets/rime_use_cases.jpg?raw=true)

From 81502e8a0f20484dd02b993ee70f406c37490953 Mon Sep 17 00:00:00 2001
From: BYERS Edward
Date: Thu, 21 Mar 2024 00:54:25 +0100
Subject: [PATCH 3/3] minor edits to docs

---
 doc/data_preprocessing.rst | 13 ++++++-------
 doc/processing_maps.rst    | 33 +++++----------------------------
 2 files changed, 11 insertions(+), 35 deletions(-)

diff --git a/doc/data_preprocessing.rst b/doc/data_preprocessing.rst
index ef5eb29..913747d 100644
--- a/doc/data_preprocessing.rst
+++ b/doc/data_preprocessing.rst
@@ -1,12 +1,12 @@
 Pre-processing input table data
 *********************
 
-To work with table data, some pre-processing is likely required to achieve the correct formats. 
+To work with table data, some pre-processing is likely required to achieve the correct formats.
 
-The aim is to go from typically tabular or database data, into a compressed 4-D netCDF format that is used in the emulation. For a given climate impacts dataset, this pre-processing only needs to be done once for preparation, and only if working with table data. 
+The aim is to go from typically tabular or database data, into a compressed 4-D netCDF format that is used in the emulation. For a given climate impacts dataset, this pre-processing only needs to be done once for preparation, and only if working with table data. Depending on the input dataset size, this can take some time.
 
 The output netCDF has the dimensions:
-    "gmt": for the global mean temperature / warming levels, at which impacts are calculated. (float)
+    "gwl": for the global warming levels, at which impacts are calculated. (float)
     "year": for the year to which the gmt corresponds, if relevant, for example relating to exposure of a population of land cover in year x.
     "ssp": for the Shared Socioeconomic Pathway, SSP1, SSP2, SSP3, SSP4, SSP5. (str)
     "region": for the spatial region for the impact relates and might be aggregated to, e.g. country, river basin, region. (str)
@@ -16,13 +16,12 @@ Thus, the input data table should also have these dimensions, normally as column
 
 [example picture of IAMC input file]
 
-The script `generate_aggregated_inputs.py` gives an example of this workflow, using a climate impacts dataset in table form (IAMC-wide), and converting it into a netCDF. In this case the data also has the `model` and `scenario` columns, which are not needed in the output dataset.
+The script `generate_aggregated_inputs.py` gives an example of this workflow, using a climate impacts dataset in table form (IAMC-wide), and converting it into a netCDF, primarily using the function `loop_inteprolate_gwl()`. In this case the data also has the `model` and `scenario` columns, which are not needed in the output dataset.
 
 generate_aggregated_inputs.py
 =============================
 
-
 Key Features
 ------------
 
@@ -39,10 +38,10 @@ Dependencies
 ------------
 
 - ``pyam``: For analysis and visualization of integrated assessment models.
 - ``re``: For regular expression matching, indicating text processing.
 - ``xarray``: For working with labeled multi-dimensional arrays.
-- ``time``: For timing operations, possibly used in performance measurement.
+- ``time``: For timing operations.
 
 Usage
 -----
 
-While specific usage instructions are not provided, it's likely that the script reads from specified input files or directories, processes the data, and outputs aggregated results. Usage may require customization based on the specific data format and desired output.
+Based on the test data, the intention here is to read in a file like `table_output_cdd_R10.xlsx` and output a file that looks like `cdd_R10.nc`.

diff --git a/doc/processing_maps.rst b/doc/processing_maps.rst
index 1cefcc0..60c910a 100644
--- a/doc/processing_maps.rst
+++ b/doc/processing_maps.rst
@@ -1,41 +1,18 @@
-Example script that takes input table of emissions scenarios with global temperature timeseries, and output maps of climate impacts through time as netCDF. Ouptut netCDF can be specified for either for 1 scenario and multiple climate impacts, or multiple scenarios for 1 indicator.
-
-This example script takes an input table of emissions scenarios along with global temperature time series and generates maps of climate impacts over time as NetCDF files. It exemplifies the application of the RIME framework to spatially resolved climate impact data, facilitating the visualization and analysis of geographic patterns in climate impacts.
-
 process_maps.py
 ===============
 
-This script is likely involved in processing geographical data, given its name suggests map-related functionalities. It may involve operations related to spatial data and possibly climate or environmental data analysis.
-
-Key Features
-------------
-
-- **Geographical Data Processing**: Implied by the name, it might handle operations on map data, such as transforming, analyzing, or visualizing geographical information.
-- **Data Handling**: The script might deal with large datasets, considering the use of ``dask``, which is known for parallel computing and efficient data processing.
-
-Dependencies
-------------
-
-- ``dask``: For parallel computing in Python, needed for handling large datasets and efficient computation.
-
-Usage
------
-
-
-
-
-process_maps.py
-===============
-
-This example script takes an input table of emissions scenarios along with global temperature time series and generates maps of climate impacts over time as NetCDF files. It exemplifies the application of the RIME framework to spatially resolved climate impact data, facilitating the visualization and analysis of geographic patterns in climate impacts.
-
 Overview
 --------
 
-The script's flexibility allows for the specification of outputs either for a single scenario across multiple climate impacts or for multiple scenarios focused on a single indicator. This adaptability makes it a valuable tool for in-depth climate impact studies that require spatial analysis and visualization.
+This example script takes an input table of emissions scenarios along with global temperature time series (`emissions_temp_AR6_small.xlsx`) and gridded climate impacts data by global warming levels (e.g. `ISIMIP2b_dri_qtot_ssp2_2p0_abs.nc`), and generates maps of climate impacts over time as NetCDF files. It exemplifies the application of the RIME framework to spatially resolved climate impact data, remapping climate impacts data by global warming level to a trajectory of global mean temperature.
 
 Usage
 -----
 
-By processing emissions scenarios and associated temperature projections, ``process_maps.py`` produces NetCDF files that map climate impacts over time. These outputs are instrumental in visualizing the geographic distribution and evolution of climate impacts, aiding in the interpretation and communication of complex climate data.
+The script's flexibility allows for the specification of outputs either for a single scenario across multiple climate impacts or for multiple scenarios focused on a single indicator.
+
+
+By processing emissions scenarios and associated temperature projections, ``process_maps.py`` produces NetCDF files that map climate impacts over time.
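The remapping step described above — gridded impacts indexed by global warming level, interpolated onto a scenario's temperature trajectory — can be sketched with ``xarray``. The toy data below is purely illustrative (not the actual `process_maps.py` code), and ``interp`` requires scipy:

```python
import numpy as np
import xarray as xr

# Illustrative gridded impacts tabulated by global warming level (gwl):
# a tiny 2x2 lat/lon field at three warming levels, standing in for data
# such as ISIMIP2b_dri_qtot_ssp2_2p0_abs.nc.
impacts = xr.DataArray(
    np.arange(12, dtype=float).reshape(3, 2, 2),
    dims=("gwl", "lat", "lon"),
    coords={"gwl": [1.5, 2.0, 2.5], "lat": [0.0, 0.5], "lon": [10.0, 10.5]},
    name="impact",
)

# Illustrative scenario: global mean temperature by year.
gmt = xr.DataArray([1.6, 1.9, 2.3], dims="year",
                   coords={"year": [2030, 2050, 2070]})

# Remap: interpolate along the gwl dimension at each year's warming level,
# producing a (year, lat, lon) cube of impacts through time.
maps = impacts.interp(gwl=gmt)
```

The resulting cube could then be written out with ``maps.to_netcdf(...)``, one file per scenario or per indicator depending on how the outputs are specified.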