This code solves for the Highest Above Nearest Drainage (HAND) values of cells from a Digital Elevation Model (DEM). Data processing is done in two major stems:
- Searching for all drainage connected to a node.
- Selecting the drainage node where the HAND value will be derived from for a given starting node.
The HAND values are the differences in the elevation of the starting node and the selected node.
The input data required is the network data in JSON format. With the following sample content:
{
"0": {
"row": 0,
"col": 0,
"accum": 1.3729063722245636,
"elev": 650.0,
"neighbors": [[1, 1], [0, 1]]
},
"1": {
"row": 0,
"col": 1,
"accum": 3.609570510679968,
"elev": 641.0,
"neighbors": [[1, 2], [0, 2]]
},
"2": {
"row": 0,
"col": 2,
"accum": 7.949967548702894,
"elev": 629.0,
"neighbors": [[1, 3], [0, 3]]
}
//-- snip --
}
The keys represent the node id which is generated from the rows and columns of the current node with a simple function:
# node_id - the resulting single number id
# node_row - the row of the node in the DEM
# node_col - the column of the node in the DEM
# dem_cols - the total number of columns in the DEM
node_id = node_row * dem_cols + node_col
The data must be arrange in order of increasing node_ids (nodes going left to right from the topmost row and repeat the same pattern in the next lower rows).
The neighbors field contains the rows and and columns of the neighbors or specifically the cells identified during the flow accumulation which receives flow the current node.
The accumulation level and elevations are contained in the aptly named fields.
Running the program with:
hand_calculate --help
provides the following help text:
Calculates the Heighest Above Nearest Drainage (HAND) values give the network data. The network data is a json file contaning the node_id incrementing from 0 to the data length. Each node must contain the accumulation value, row and column location of the node,the flow accumulation value and the list of neighbor nodes.
Options:
-i, --input-file input file name
-r, --rows number of DEM rows
-c, --cols number of DEM columns
-t, --drainage-threshold
minimum accumulation ammount to clasify a node as drainage
-d, --max-drainage
maximum number of connected drainage (default = 5)
-l, --max-path-length
maximum number of connected drainage (default = 30)
-a, --alpha path length bias over average accumulation value (range 0 to 1) (default = 0.9)
-p, --plot plot the result or not
--help display usage information
Make sure that the hand_calculate.exe is in the root folder together with the accumulations folder. Running the executable for example can be done as,
hand_calculate -i "accumulations\cells.json" -r 1047 -c 1613 -t 1000 -d 5 -l 40 -a 0.9 -p
The output files are the HAND numpy array,a JSON file containing the connected neighbors per node_id and another JSON file where each node is paired to the selected drainage node.
The search_drainage function uses Breadth-First Search algorithm to find the closest drainage to each starting node. The search is slightly modified to prevent exploration from drainage cells (as identified using the drainage threshold value) to its co-drainage cells.
The search returns multiple values if the flow-accumulation algorithm used a Multiple-Flow-Direction (MFD) approach. The user provided max_drainage value limits the number of connected drainage nodes. Nodes are sorted from least to greatest Manhattan Distance from the starting node where,
The find_all_paths function is a wrapper for the dfs_recursive function which returns all possible paths from a starting node to a drainage node. The latter function uses a recursive approach while employing Depth-First Search (DFS) traversal. During the recursion, a copy of the path-so-far is being kept and a depth first search is recursively apply to extend the path until a drainage is found. A list of explored nodes is referenced to prevent a possible possible cyclic paths.
To improve searching speed, the user provied max_path_length parameter terminates the search if the current path length is over the said value. This is okay for nodes that have multiple paths leading up into a drainage but this would also affect nodes that are further away from drainage nodes (areas in relatively higher elevation with large slopes) where no path is going to be returned. This however, would not affect the resulting HAND calculations that much since high risk areas are generally closer to streams.
Once all of the possible paths for all connected drainage nodes are identified, each will be ranked according to a certain score controlled by the alpha
Where,
To select the drainage node for the current node, we let,
The final node selection will be given as,
In case of a tie the final node will be chosen as,
Where,
This all boils down with the highest path score and in case of tie, choose the neighbor with the higher elevation.
Once the drainage node
Both drainage search and path finding algorithms were implemented via multithreading to improve runtime speed. This result into magnitudes of improvements over a single threaded operation. Proper data structures were also selected such as employing hashsets to check whether or not a node is already explored. Double ended queues were used for operations that pops elements from the front of a vector. Some vectors were also pre-allocatted at some estimated capacity.