Metsi forestry simulator is a Python based forest growth and maintenance operation simulator developed in Natural Resources Institute Finland.
The simulator is a alternative state simulator operating upon forest state data. The state data is manipulated by simulator operations over a progression of time steps. The event and branching structure for simulator operations is declared as functional declaration. This declaration is used to generate a simulation event tree holding the full branching possibilities for the simulation. The event tree is evaluated with the simulator engine to produce alternative end results.
This repository is licensed under the Mozilla Public License 2.0 (MPL-2.0).
MetsiGrow is not part of this license. It is distributed under a separate Source Available – Non-Commercial license.
- Metsi (this repository): Licensed under MPL-2.0
- MetsiGrow: Licensed under a custom Non-Commercial license (see its LICENSE)
To use features that rely on MetsiGrow, you must comply with MetsiGrows’s license terms.
To get started:
- Install Python >= 3.12 for your platform.
- Install git for your platform.
- Ensure that the commands
python,pipandgitare available in your command line interface (CLI). We assume a UNIX-like shell CLI such as Git Bash for Windows users. - Initialize the project with the commands below.
git clone https://github.com/lukefi/metsi
cd metsi
pip install .
This installs the project into the site-packages of your Python deployment, using the project's pyproject.toml file.
The program is then usable from command line by simply invoking metsi.
For developer usage, application entry point is the file lukefi/metsi/app/metsi.py or the namespace package
lukefi.metsi.app.metsi.
To obtain the latest changes use the command git pull.
To be able to use forestry operations depending on R modules
- Install R runtime version >=4.1 for your platform.
- Ensure that the
Rcommand is available in CLI. - Install the
rpy2Python module with the commands below.
pip install .[rpy2]
To be able to use forestry operations depending on C sources
- Compile c sources by with
makeMakefilelocated inlukefi/metsi/forestry/c- c sources located in
c/source - Running the
Makefilegenerates ac/libfolder which includes the.dllfiles used by the operations
To access the external source code of MetsiGrow use the git submodules functionality.
git submodules init
git submodules update
MetsiGrow/LICENSE for the full license text.
This code project is divided into following python packages in the lukefi.metsi namespace.
| package | description |
|---|---|
| l.m.app | Application entry points. Side-effectful program logic. |
| l.m.sim | Simulator engine. |
| l.m.domain | Operations for the forest development simulation. |
| l.m.tests | Unit test suites for above packages. |
Dependency libraries for this project are listed in pyproject.toml.
The project contains a single application entry point. This is the lukefi/metsi/app/metsi.py.
The application implements a 5 phase pipeline. These phases are: preprocess, export_preprocess, simulate, postprocess and export. Each of
the phases can be run independetly or as sequences, as long as their logical order for input data structuring is
preserved. Each application phase is inclued in python configuration file (default control.py).
| phase | description |
|---|---|
| preprocess | Operations for filtering or modifying the input data set |
| export preprocessed | Operations for filtering or modifying the input data set |
| simulate | Discrete time-step simulation and data collection with given event tree and operations |
| postprocess | Operations for further deriving and marshalling of data from the simulation results |
| export | File output into different formats based on simulation or post-processing results |
Input for preprocess and simulate phases is a forest data file. This file is a list of forest stand objects with lists of associated reference trees and tree strata. The file can be of following types and formats:
- a .json file or .pickle file containing Forest Data Model type source data.
- a .dat file containing VMI12 or VMI13 type source data
- a .xml file containing Forest Centre type source data
- a .gpkg file containing Forest Centre type source data
Input for postprocess and export phases is a directory produced by the simulate phase.
There are several example input files in the project test resources tests/resources/file_io_test and tests/data/resources directories.
Preprocessing extracts and runs conversions for the input data. To export prerocessed data use the export_prepro pipeline component to generates different formats such as csv, pickle or json file with the computational units as a list. The data is always in FDM
format. The file is written into the configured output directory as preprocessing-result.{csv,pickle,json}.
Simulate collects a nested data structure containing the final states for each produced alternatives of each computational unit. A dictionary data structure for computed data during the simulation is included for each such alternative. This data structure can be outputted as a directory structure into the configured output directory.
Post-processing utilizes exactly same nested data structure as produced by simulation. It will derive further data within and across the produced alternatives and stores them within the derived data structure. This data can be outputted as a directory structure into the configured output directory.
Exporting uses the nested data structure above. It will select partial data sets as configured. These data sets may be written to any supported and compatible file containers. Such support must be implemented on a case-by-case basis as modules into the exporting functionality.
When phases are run in order on the same run, intermediate result files do not need to be written out. Data is kept and propagated in-memory.
Despite other output formats the computational units can be outputted as a RST format which is in itself a special format used only by the Natural Resource Institutes MELA simulator. The RST format is only genereted as preprocessing output and contains the reference tree information of forest stands. Along with the RST format an RSTS format file is genereted which contains the stratum information of the computationals units.
Use the following command to output simulator application help menu.
python -m lukefi.metsi.app.metsi --help
Input path, output path and the control file path must be supplied as CLI positional arguments. All other application level configuration
parameters in commands below should be set in the control.py file under the app_configuration key. Control file settings
override program defaults (see app/app_io.py MetsiConfiguration class).
At the time of writing this, there are no post-processing operations ready to be used. Post-processing
will do_nothing
Main documentation resource for all configuration is inclued in the examples directory and the control.py file.
All examples below default to control.py in the working directory as the control file source unless otherwise
specified in the command line.
Preprocessing, simulation and post-processing phases do not produce output files by default. Configuration for preprocessing output container, state output container and derived data output container need to be set to produce output files. The default mode of operation is to run the full pipeline and only the export phase will create files as configured.
See table below for a quick reference of forestry operations usable in control.py.
| operation | description | source | model library |
|---|---|---|---|
| do_nothing | This operation is no-op utility operation to simulate rest | native | |
| grow_acta | A simple ReferenceTree diameter and height growth operation | Acta Forestalia Fennica 163 | metsi-forestry |
| first_thinning | An operation reducing the stem count of ReferenceTrees as a first thinning for a forest | Reijo Mykkänen | metsi-forestry |
| thinning_from_below | An operation reducing the stem count of ReferenceTrees weighing small trees before large trees | Reijo Mykkänen | metsi-forestry |
| thinning_from_above | An operation reducing the stem count of ReferenceTrees weighing large trees before small trees | Reijo Mykkänen | metsi-forestry |
| even_thinning | An operation reducing the stem count of ReferenceTrees evenly regardless of tree size | Reijo Mykkänen | metsi-forestry |
| report_collectives | Save the values of collective variables | native | |
| calculate_biomass | Calculate biomass accruals from the current reference tree properties | Laura Jaakkola | native |
| report_state | save the values of state variables at the current time point | native | |
| filter | Filter stands, trees and strata | native | |
| cross_cut_felled_trees | Perform cross cut operation to results of previous thinning operations | Annika Kangas | metsi-forestry |
| cross_cut_standing_trees | Perform cross cut operation to all standing trees on a stand | Annika Kangas | metsi-forestry |
| calculate_npv | Calculate net present value of stand and harvest revenues subtracted by renewal operation costs. | Urho Niemelä | native |
| calculate_npv | Calculate net present value of stand and harvest revenues subtracted by renewal operation costs. | Urho Niemelä | native |
| clearcutting | Clear the stand of reference trees and produce data for cross-cutting. | Laura Jaakkola | native |
| planting | Plant sapling reference trees on an empty stand. | Laura Jaakkola | native |
Performes renewal operation planting which plants saplings on a empty stand.
| parameter name | type | description | location in control.py | notes |
|---|---|---|---|---|
| planting_instructions | string | In file rows represent site types, first column contains the tree species, second column the stems per hectar value and third column is the soil preparation type. | operation_file_params | optional |
| tree_count | int | Number of sapling trees to be plante. | operation_params | optional, default 10 |
The output is written into the derived data container with two keys renewal and regeneration.
Output of renewal is a regeneration description python dictionary containing following values regeneration key, soil prepration type, tree species and stem count.
Output of regeneration is is a list of PriceableOperationInfo objects:
| attribute name | type | description |
|---|---|---|
| operation | string | renewal operation name |
| units | float | stand area |
| time_point | int | time point of operation execution timepoint |
Performs even thinning which removes stems evenly from all reference tree classes. Removal bounds are defined by basal area.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| thinning_limits | float | operation_file_params | optional parameter |
| e | float | operation_params | residue constant |
| thinning_factor | float | operation_params | removal intensity |
Operation outputs a list of CrossCuttableTree objects
Object attributes:
| attribute name | type | description |
|---|---|---|
| stems_per_ha | float | number of removed stems |
| species | TreeSpecies | tree species of removed reference tree |
| breast_height_diameter | float | trees diameter at breast height |
| height | float | trees height |
| source | string | standing or harveste |
| operation | string | operation that produced such output |
| time_point | int | time point of operation execution |
| cross_cut_done | bool | cross cut operation executed |
- parameter e is a residue constant so that the removal ratio would not go under the lower limit.
- For example e=0.2
Performs thinning from below which primarily removes trees with a smaller diameter. Removal bounds are defined by basal area.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| thinning_limits | float | operation_file_params | optional parameter |
| e | float | operation_params | residue constant |
| thinning_factor | float | operation_params | removal intensity |
Operation outputs a list of CrossCuttableTree objects
Object attributes:
| attribute name | type | description |
|---|---|---|
| stems_per_ha | float | number of removed stems |
| species | TreeSpecies | tree species of removed reference tree |
| breast_height_diameter | float | trees diameter at breast height |
| height | float | trees height |
| source | string | standing or harveste |
| operation | string | operation that produced such output |
| time_point | int | time point of operation execution |
| cross_cut_done | bool | cross cut operation executed |
- parameter e is a residue constant so that the removal ratio would not go under the lower limit.
- For example e=0.2
Performs thinning from above which primarily removes trees with a larger diameter. Removal bounds are defined by basal area.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| thinning_limits | float | operation_file_params | optional parameter |
| e | float | operation_params | residue constant |
| thinning_factor | float | operation_params | removal intensity |
Operation outputs a list of CrossCuttableTree objects
Object attributes:
| attribute name | type | description |
|---|---|---|
| stems_per_ha | float | number of removed stems |
| species | TreeSpecies | tree species of removed reference tree |
| breast_height_diameter | float | trees diameter at breast height |
| height | float | trees height |
| source | string | standing or harveste |
| operation | string | operation that produced such output |
| time_point | int | time point of operation execution |
| cross_cut_done | bool | cross cut operation executed |
- parameter e is a residue constant so that the removal ratio would not go under the lower limit.
- For example e=0.2
Performs first thinning which primarily removes trees with a smaller diameter. Removal bounds are defined by number of stems.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| dominant_height_lower_bound | float | operation_params | |
| dominant_height_upper_bound | float | operation_params | |
| e | float | operation_params | residue constant |
| thinning_factor | float | operation_params | removal intensity |
Operation outputs a list of CrossCuttableTree objects
Object attributes:
| attribute name | type | description |
|---|---|---|
| stems_per_ha | float | number of removed stems |
| species | TreeSpecies | tree species of removed reference tree |
| breast_height_diameter | float | trees diameter at breast height |
| height | float | trees height |
| source | string | standing or harveste |
| operation | string | operation that produced such output |
| time_point | int | time point of operation execution |
| cross_cut_done | bool | cross cut operation executed |
- parameter e is a residue constant so that the removal ratio would not go under the lower limit.
- For example e=0.2
Compute total biomass tonnages of a single forest stand.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| model_set | int | operation_params | accepted values: 1 and 2 |
- model_set accepts following values 1, 2, 3 or 4
- if value is 1 wood, bark, living and dead branches, foliage, stumps and roots are collected with model set Y
- if value is 2 wood, bark, living and dead branches, foliage, stumps and roots are collected with model set X
Attributes of the BiomassData object
| attribute name | type |
|---|---|
| stem_wood | float |
| stem_bark | float |
| stem_waste | float |
| living_branches | float |
| dead_branches | float |
| foliage | float |
| stumps | float |
| roots | float |
Enables collecting the states of user-defined variables at the time of the operation call.
The parameters passed to the operation are the variables that the user wants to report. The parameters are key-value pairs, where the key defines the name of the variable, and the value defines how the variable is constructed.
The operation makes available a set of collections that can be used in the definition of desired variables. These collections are:
| name | description | class whose attributes are available |
|---|---|---|
| state | forest stand | ForestStand |
| reference_trees | stand's reference trees | ReferenceTrees |
| felled_trees | trees that have been thinned/clearcut | CrossCuttableTree |
| cross_cutting | results of cross cutting | CrossCutResult |
| renewal | results of renewal operations | PriceableOperationInfo |
| net_present_value | results of net present value calculations | NPVResult |
For example, to get the total stems per hectare in the years the operation is defined for, one would
define report_state's operation parameters as:
"report_state": [
{
"total_stems_per_ha": "reference_trees.stems_per_ha"
}
]The stand's reference trees are stored under the name reference_trees, and the attributes defined for that name can
be used to get values. The returned total_stems_per_ha is the sum of the stand's trees' stems_per_has.
However, often one needs more detailed information about the state, and therefore filter only certain variables. For example, to get the stems per hectare of pines:
{
"report_state": [
{
"total_stems_per_ha": "reference_trees.stems_per_ha[reference_trees.species == 1]"
}
]
}or, to be even more fine-grained, get the stems_per_ha of pines that are not saplings:
{
"report_state": [
{
"total_stems_per_ha": "reference_trees.stems_per_ha[(reference_trees.species == 1) & (reference_trees.sapling == False)]"
}
]
}Notice the parentheses around the filter conditions, when using multiple conditions.
Operation makes it possible to collect sums of derived data collections between periods.
For example in a 30 year long simulation one could define three 10-year periods by defining report_period on time points 10, 20 and 30. As the simulation executes the report_period operation collects user-defined derived data between years 0-9, 10-19 and 20-29.
See documentation of report_state operation for definition and usage.
Clears the stands reference tree list and stores them as a list of CrossCuttableTree objects into derived data with
the felled_trees keyword.
| parameter name | type | description | location in control.py | notes |
|---|---|---|---|---|
| clearcutting_limits_ages | string | In file the rows represent site type values and colums represent tree species and the values represent the smallest possible age enabled. | operation_file_params | example file data/parameter_files/renewal_ages_southernFI.txt |
| clearcutting_limits_diameters | string | In file the rows represent site type values and colums represent tree species and the values represent smallest possible breast height diameters enabled. | operation_file_params | example file data/parameter_files/renewal_diameters_southernFI.txt |
| minimum_time_interval | int | After the operation is executed it will not be executed again until the minimum_time_interval is reached. |
run_constrains |
Output object attributes:
| attribute name | type | description or value |
|---|---|---|
| stems_per_ha | float | number of removed stems |
| species | TreeSpecies | tree species of removed reference tree |
| breast_height_diameter | float | trees diameter at breast height |
| height | float | trees height |
| source | string | value: harvested |
| operation | string | value: clearcutting |
| time_point | int | time point of operation execution |
| cross_cut_done | bool | cross cut operation executed or not |
Calculates the volume and value of harvested trees using Annika Kangas' cross cutting algorithm. Whenever this operation is called, it cross-cuts all thinning and clearcutting results that have been produced before it, but have not yet been cross-cut. Given this, it is enough to call this operation once before cross cutting results are needed.
The time_point attribute of the resulting CrossCutResult objects will be determined by the tree's havest year, not the
year when this operation is called.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| timber_price_table | file (csv) | operation_file_params | timber grades must be given as integers |
| implementation | str | operation_params | py and lupa (lua) implementations available |
Attributes of the CrossCutResult object
| attribute name | type |
|---|---|
| species | TreeSpecies |
| timber_grade | int |
| volume_per_ha | float |
| value_per_ha | float |
| stand_area | float |
| source | str (either "harvested" or "standing") |
| operation | str (operation tag, or '' if source == "standing") |
| time_point | int |
Calculates the volume and value of standing trees using Annika Kangas' cross cutting algorithm at the time of the operation call. This operation does not actually harvest the stand, but rather evaluates the the volume and value of its trees if they were cross cut. Therefore, this operation is different from clearcutting.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| timber_price_table | file (csv) | operation_file_params | timber grades must be given as integers |
| implementation | str | operation_params | py and lupa (lua) implementations available |
Attributes of the CrossCutResult object
| attribute name | type |
|---|---|
| species | TreeSpecies |
| timber_grade | int |
| volume_per_ha | float |
| value_per_ha | float |
| stand_area | float |
| source | str (either "felled" or "standing") |
| operation | str |
| time_point | int |
Calculates the Net Present Value (NPV) of a given schedule.
| parameter name | type | location in control.py | notes |
|---|---|---|---|
| interest_rates | list of int | operation_params | e.g. [3], where 3 stands for 3% |
| land_values | file (json) | operation_file_params | |
| renewal_costs | file (csv) | operation_file_params |
- This operation expects that
cross_cut_felled_treeshas been called previously to cross cut any previous thinning output. - This operation expects that
cross_cut_standing_treeshas been called in the same time point, so that the present value of the standing trees can be evaluated correctly.
$$ NPV = \underbrace{\sum_{t=0}^T \frac{h_ta}{(1+r)^t}}\text{(1)}+ \underbrace{\frac{S_Ta}{(1+r)^T}}\text{(2)}- \underbrace{\sum_{t=0}^T \frac{c_ta}{(1+r)^t}}\text{(3)}+ \underbrace{LV}\text{(4)}
$$
where:
-
$h_t$ is the per-hectare harvest revenue from the stand at time$t$ -
$a$ is the stand's area in hectares -
$r$ is the interest rate -
$S_T$ is the value of standing tree stock at the final time point$T$ -
$c_t$ is the per-hectare costs of stand treatment at time$t$ -
$LV$ is the bare land value of the stand, calculated for the interest rate$r$ .
(1) harvest revenues originate from cross_cut_felled_trees
(2) stand value originates from cross_cut_standing_trees
(3) currently, costs originate only from renewal operations
(4) Bare land values are passed in as a file parameter for the NPV operation. These values are already discounted with the given interest rate, so no discounting happens here. See MELA 2016 reference manual p.175-176 for more information about this.
To run unit test suites, run in the project root
python -m pytest
You can also use python internal module unittest
python -m unittest <test suite module.class path>
A run is declared in the control file control.py.
- Application configuration in
app_configurationdictionary.state_formatspecifies the data format of the input computational unitsfdmis the standard Forest Data Model.vmi12andvmi13denote the VMI data format and container.forest_centredenotes the Forest Centre XML data format and container.geo_packagedenotes the Forest Centre GPKG data format and container.
state_input_containeris the file type forfdmdata format. This may becsv,pickleorjson.state_output_containeris the file type for outputting thefdmformatted state of individual computational units during and after the simulation. This may becsv,pickleorjsonor commented out for no output.derived_data_output_containeris the file type for outputting derived data during and after the simulation. This may bepickleorjsonor commented out for no output.run_modesMetsi pipeline considers two conceptual parts. The data conversion and the simulation. From which first one is defined with thepreprocessandexport_preproand the second one withsimulate,postprocessandexport.strategyis the simulation event tree formation strategy. Can bepartialorfull.measured_treesinstructs thevmi12andvmi13data converters to choose reference trees from the source.TrueorFalse.stratainstructs thevmi12andvmi13data converters strata from the source.TrueorFalse.strata_origininstructs theforest_centreconverter to choose only strata with certain origin to the result.1,2or3.multiprocessinginstructs the application to parallelizes the computation to available CPU cores in the system.TrueorFalse.
- Operaton run constrains in the object
run_constraints - Operation parameters in the object
operation_params. Operation parameters may be declared as a list of 1 or more parameter sets (objects). Operations within analternativesblock are expanded as further alternatives for each parameter set. Multiple parameter sets may not be declared for operations within anysequenceblock. - List of
simulation_events, where each object represents a single set of operations and a set of time points for those operations to be run.time_pointsis a list of integers which assign this set of operations to simulation time pointsgeneratorsis a list of chained generator functions (see section on simulation event generators)sequencea list of operations to be executed as a chainalternativesa list of operations which represent alternative branches
- Preprocessing operations can be passed as a list of strings under
preprocessing_operations, and their (optional) arguments underpreprocessing_paramsas key-value pairs. - Operation parameters that exist in files can be passed in
operation_file_paramsas demonstated below:
operation_file_params = {
"first_thinning": {
"thinning_limits": "/path/to/file/thinning-limits.txt"
},
"cross_cut_felled_trees": {
"timber_price_table": "/path/to/file/timber-prices.csv"
}
}Note, that it is the user's responsibility to provide the file in a valid format for each operation.
7. Post-processing is controlled in the post_processing section of the file
1. operation_params section sets key-value pairs to be passed as parameters to named post-processing operations
2. post_processing section lists a non-branching list of post-processing operations to be run in sequence for the
given data
8. Export is controlled in the export section of the file (TODO: structure is in works)
The following example declares a simulation, which runs four event cycles at time points 0, 5, 10 and 15. Images below describe the simulation as an event tree, and further as the computation chains that are generated from the tree.
- At time point 0,
reportingof the simulation state is done. - At time point 5, the
growoperation is done on the simulation state and the simulation is branched by 3. One branch does not modify the forest state data withdo_nothing, another performs athinningoperation on the forest state data with parameter set 1, and anotherthinningoperation with parameter set 2. - At time point 10, the 3 branches from time point 5 are extended separately with a
growoperation, then branched again withdo_nothingandthinningoperations with two parameter sets, resulting in 9 branches. - At time point 15,
reportingis done on the 9 individual state branches.
# example of operation run constraints
# minimum time interval constraint between thinnings is 10 years
run_constraints = {
"thinning": {
"minimum_time_interval": 10
}
}
# example of operation parameters
# reporting operation gets one parameter set
# thinning operation gets two parameter sets
operation_params = {
"reporting": [
{"level": 1}
],
"thinning": [
{"thinning_factor": 0.7, "e": 0.2},
{"thinning_factor": 0.9, "e": 0.1}
]
}
# simulation_events are a collection of operations meant to be executed at
# the specified time_points
simulation_events = [
{
"time_points": [5, 10],
"generators": [
{
"sequence": [
"grow"
]
},
{
"alternatives": [
"do_nothing",
"thinning"
]
}
]
},
{
"time_points": [0, 15],
"generators": [
{
"sequence": [
"reporting"
]
}
]
}
]Event tree from declaration above
Operation chains from event tree above, as produced by the full tree formation strategy. See below for the partial tree strategy.
The report_collectives operation and data export make use of collective variables. A collective variable is a python
expression that is evaluated on a ForestStand. For example the expression year will collect ForestStand.year.
You can collect tree variables by typing reference_trees.variable_name. You can also slice the variable,
eg. reference_trees.volume[reference_trees.species==1] will collect the total volume of pine. The collective arrays
are just numpy arrays, so all numpy slicing operations are supported.
For export data collection you can index the collection array by collection year, so Vpine would export the
collective Vpine from all periods, and Vpine[0,5] would export it from years 0 and 5.
You can use the filter operation to control what data to keep or remove during pre-processing.
It takes a dict of action: expression where action is of the form select|remove[ stands|trees|strata]
and expression is a Python expression.
select and remove are aliases for select stands and remove stands. select removes all objects for
which expression evaluates to a falsy value. remove removes objects for which expression evaluates to a truthy
value.
The following example removes all sapling trees, trees without stem count and stands without reference trees:
preprocessing_params = {
"filter": [
{
"remove trees": "sapling or stems_per_ha == 0",
"remove": "not reference_trees"
}
]
}Evaluation order is the order of parameters, so the example would first remove trees and then stands.
You can also reuse filters with named filters. A named filter is an expression given in the named parameter, and it
can be used in other expressions (including other named filters). The following example is equivalent to the previous
one:
preprocessing_params = {
"filter": [
{
"named": {
"nostems": "stems_per_ha == 0",
"notrees": "not reference_trees"
},
"remove trees": "sapling or nostems",
"remove": "notrees"
}
]
}The three important concepts in the sim package are operations, processor, event tree and
event generators.
An operation is a function whose responsiblities are 1) to trigger manipulation of simulation state and 2) to compute
derived data about simulation state before and/or after state manipulation. For the purposes of the simulator, the
operation is a partially applied function from the domain package (forestry) such that it will take only one argument.
They are produced as lambda functions based on the control.py declaration.
As an example, a single operation such as grow would receive a single argument of type ForestStand upon which it
operates and finally returns a ForestStand for the modified/new state.
The processor is a function wrapper which handles running a prepared operation (see above). The parameter is
an OperationPayload instance. The OperationPayload object is the container for simulation state data, along with a
record of simulation run history and operation run constraints. Responsibilities of the processor function are as
follows:
- Determine if run constraints apply to the operation to be run. Abort and raise an exception if so.
- Execute the operation function with simulation state data.
- Create a record of the run in simulation run history.
- Pack results as a new
OperationPayloadand return it.
Processor functions are produced as lambda functions based on the control.py declaration.
The event tree is a tree data structure where each individual node represents a prepared simulation operation. It is
generated based on the control.py declaration. Unique operation chains are generated based on the event tree for
individual chain runs, or the event tree can be evaluated by depth-first walkthrough. This is controlled by the
evaluation strategy.
sequence and alternatives are functions which produce EventTree instances for given input functions and as
successors of previous EventTree instances. For the simulation purposes, these input functions are the prepared
processors (see above), but the simulator implementation literally does not care what these functions are. Sequences are
linear chains of event. Alternatives are branching events. EventTree instances are generated and bound to earlier
EventTree leaf nodes as branches.
The generators are chainable and nestable such that they can expand the event tree in formation based on the results of
a previous generator's results. The NestableGenerator represents a tree structure for nested generator declarations.
It is constructed from the simulation_events structure given from a configuration source. A SimConfiguration
structure, likewise populated from a configuration source is used as a template for binding the created generator
functions with prepared domain operation functions. The control source is an application's control.py file's dict
structure or another compatible source.
compose_nested function executes the given NestableGenerator which in turn utilizes its prepared sequence
and alternatives calls to build a complete simulation event tree.
The forestry package contains the operations necessary to represent the simulation state data and operations acting
upon that data.
The class ForestStand and the ReferenceTree and TreeStratum instances it refers to. A single ForestStand
instance fully represents a forestry simulation state.
Operations are functions which take two arguments
- A tuple of a
ForestStandinstance and aCollectedDatacontaining derived/computed data during the simulation run - Python
dictcontaining parameters for this operation
By convention (since Python as a language does not allow us to properly enforce this), these functions must remain pure
and not trigger side-effectful program logic. Operations may do in-place mutation of the argument tuple. Operations may
not mutate the operation parameter dict
sim.runners module has two functions of interest:
evaluate_sequenceexecutes a prepared chain of functions (from the event tree), returning the final simulated stated or raising an execption upon any failure.run_chains_iterativelyis a simple iterator for given chain of prepared operations (from the event tree)
Note that this package is a simple run-testing implementation. In the future we wish to expand upon this to allow for distributed run scenarios using Dask.
The simulator strategies determine how the program traverses down the simulation tree. The full strategy
executes run_full_tree_strategy, whereas the partial strategy executes run_partial_tree_strategy.
It should be noted that while the full strategy is simpler to understand conceptually, it carries a significant memory
and runtime overhead for large simulation trees, and therefore the partial strategy should be focused on as the
performant solution.
Your primary responsibility is to write the functionality that acts upon simulation state data based on parameters of your choosing. For the forestry case, the state data is a single instance of a ForestStand, which always has a member list of ReferenceTree instances.
You write an operation function, which is an entry point to your work.
- It shall take in a single argument, a tuple of (ForestStand, CollectedData).
- It shall return a tuple of (ForestStand, CollectedData), or it shall raise an Exception when the operation can not complete its work for any reason.
The operation function can internally be whatever you require it to be. Write out as many other functions you need for the underlying scientific models. Consider developing these under the metsi-forestry library.
Keep your work functionally pure. This means that the implementations must never access input and output (API calls, file access, etc.). This also means that all data must be passed as function arguments, return values and exceptions. Do not use shared memory outside of the scope of the functions you write. Python does not allow us to be strict about this, so it is up to you! Being strict about it is crucial for producing safe, testable, provable and deterministic implementations.
Implement unit tests for functions that you write. Unit tests allow you to develop your functions completely independently from the rest of the system. You do not need to run the simulation to test your work, but use a test to ensure that your function returns what is expected and behaves like it's intended.
- Coordinate with simulator developers when some need arises that you feel can not be addressed with the model described above. A solution that doesn't require breaking functional purity can most certainly be found by developing the simulator and operations interface structure.
- Coordinate with other developers when the work you do and models you write can be shared with other operations.
- Coordinate with simulator creators about the parameters names and structures that can be passed from control.py file.
TODO: this section is no longer current and will need to be rewritten when a new implementation for R wrapping is made. report_volume no longer exists and the R codes are in metsi-forestry.
To run operations using the R functions you need R available in the local environment.
You must additionally install the Python rpy2 package for the necessary API.
For convenience, this can be installed via the optional-requirements.txt.
pip install --user -r optional-requirements.txt
The r_utils.py module contains functions and examples of how to bind into R functions.
The r directory houses related R script and data files.
An example implementation lmfor_volume exists for forest stand volume calculation.
Currently, this can be taken into use with the report_volume operation function in the control.py, following the
example below
operation_params:
report_volume:
lmfor_volume: true
The project contains a DESCRIPTION file which must be used to declare R library dependencies for R scripts. This is
not necessary for running the R scripts locally, but is required for dependency resolution in the GitHub Actions test
runs pipeline. Local dependencies are handled by the script files as exemplified by the beginning of
the lmfor_volume.R. It will install its dependencies on run if they are not found from the local environment R
libraries.
library_requirements <- c("lmfor")
if(!all(library_requirements %in% installed.packages()[, "Package"]))
install.packages(repos="https://cran.r-project.org", dependencies=TRUE, library_requirements)
library(lmfor)
To understand the partial tree strategy better, consider a simulator instruction such as:
simulation_events = [
{
"time_points": [0, 5],
"generators": [
{
"sequence": [
"grow"
]
},
{
"alternatives": [
"do_nothing",
"thinning"
]
}
]
}
]which will produce a simulator tree as below:
the full strategy would create operation chains (one for each possible path in the tree) and run them independently
from one another. In this case, we would have four separate chains, each chain having four operations.
On the other hand, the partial strategy would proceed as follows:
- create operation chains from the nodes in the first time point, and run them:
Here you'll notice that the first period's grow operation is executed twice, whereas the full tree strategy would
have executed it four times.
- For all successful results from the first period, create operation chains from the nodes in the second period:
Now, let's assume that the second chain from time point 0 would not complete successfully, e.g. due to a constraint set
on the thinning operation. At time point 5, The partial tree strategy would then only create operation chains
for output 1, i.e. the two chains on the left in the above diagram. This logic reduces the unnecessary computation the
simulator has to make, compared to the full strategy, and this is true especially for large simulation trees.
"Partial" in the strategy name refers to the fact that the strategy creates trees from only the nodes (EventTrees) in
one time point at a time (i.e. partial trees/subtrees) and traverses those partial trees (with the same post-order
traversal algorithm) to create partial (or sub-) chains of operations. Therefore, the strategy can be thought to
operate depth-first within time points, but breadth-first across time points.
Use the control.py structure declaration to control the simulation structure, domain operations and parameters that are to be used.




