Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi level hydrofabric #497

Closed
wants to merge 44 commits into from

Conversation

donaldwj
Copy link
Contributor

@donaldwj donaldwj commented Mar 7, 2023

[Short description explaining the high-level reason for the pull request]

Additions

  • Support for multiple processing levels
  • Support for separate models runtimes for each level

Removals

  • Old time loop has been replaced

Changes

  • Added Layer Objects to Ngen driver
  • Layer objects now manage model realizations

Testing

Manual Test of 5 catchment data set

Screenshots

Notes

Todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist (automated report can be put here)

Target Environment support

  • Linux

@donaldwj donaldwj marked this pull request as ready for review March 7, 2023 19:02
@donaldwj donaldwj marked this pull request as draft March 7, 2023 19:05
@donaldwj donaldwj requested a review from mattw-nws March 7, 2023 19:05
src/NGen.cpp Show resolved Hide resolved
src/NGen.cpp Outdated Show resolved Hide resolved
@donaldwj donaldwj marked this pull request as ready for review April 21, 2023 17:08
Copy link
Member

@hellkite500 hellkite500 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial pass through this code shows a few things that need to be addressed, and some potentially larger design an integration discussions/decisions. I think functionally it will serve the purpose we need, and can likely be lightly refactored to help with the code verbosity and visibility of certain components. As I write this, one thing that I'm thinking about heavily is the complete abstraction of the feature updates behind the layer interface update_models and whether moving that logic that far "out of sight" of the driver/entrypoint is going to be good or if it will make understanding the flow of information and computation difficult to understand. Just some musings...

Most of the comments here are addressable by themselves, and we can think about the bigger picture things as this progresses. I definitely want to expand on some testing and test cases as well.

.gitignore Outdated Show resolved Hide resolved
include/core/HY_Features.hpp Show resolved Hide resolved
include/core/HY_Features.hpp Show resolved Hide resolved
src/core/HY_Features.cpp Show resolved Hide resolved
src/core/HY_Features.cpp Show resolved Hide resolved
cat_id = "terminal";
else
{
layer_min_next_time = prev_layer_time = layer->current_timestep_epoch_time();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment explaining why this happens, especially in the driver/entrypoint, its good to be as descriptive as possible.

src/NGen.cpp Outdated
#endif
//std::cout<<"\tNexus "<<id<<" has "<<contribution_at_t<<" m^3/s"<<std::endl;
} //done levels
} while( layer_min_next_time < next_time );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be the first do...while loop structure in this code base haha

protected:

MultiLayerParserTest() :
nexus_data_path("../data/multilayer/nexus_data.geojson"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should implement this using the find file helper functions as seen in other tests. See #519 for some relevant discussion as to why this useful for different test interfaces/uses

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this file just doesn't exist... nor does the directory. Missed commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it existed only in my old working folder might be able to find it still, It could be gone if I removed the old directory after moving to local storage for the builds

include/core/Partition_Parser.hpp Show resolved Hide resolved
include/core/Layer.hpp Show resolved Hide resolved
@hellkite500
Copy link
Member

I also suggest a local update of your branch to fix conflicts:

git fetch upstream
git rebase upstream/master
#fix conflicts
git add <conflicted file>
git rebase --continue
git push -f origin multi_level_hydrofabric

@donaldwj donaldwj force-pushed the multi_level_hydrofabric branch from 5f70b86 to c093a33 Compare May 5, 2023 14:47
Comment on lines +54 to +55
std::unordered_map<int,LayerDescription> stored_layers;
std::vector<int> keys;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an unordered_map and then a separate collection of the keys is redundant. We can just use std::map or even a sorted std::vector, and make LayerDataStorage directly iterable.

If this is all just used in initialization logic, then the concern is less salient

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The map will be accessed a lot. Unorderedmap is O(1) vs O(lg(N)) of map, particularly when there will be a small number of keys. The use of vector is also not ideal because not all layer slots will be filled.

Currently layers range in id from -20 (lowest ground water) to 80 (upper atmosphere) but few ids will be used at any time. This means allowing retrieving a layer with its id requires a map (or an index array), since the list of keys is both small and will be needed by the main execution loop there is no reason not to build the key list while layers are added, and then store it for reference.

Comment on lines +104 to +110
//TODO put this somewhere else. For now, just trying to ensure we get m^3/s into nexus output
try{
response *= (catchment_data->get_feature(id)->get_property("areasqkm").as_real_number() * 1000000);
}catch(std::invalid_argument &e)
{
response *= (catchment_data->get_feature(id)->get_property("area_sqkm").as_real_number() * 1000000);
}
Copy link
Contributor

@PhilMiller PhilMiller Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if the actual variation between the two arms (the key name) were isolated from the repeated logic.

Perhaps this whole thing should actually be encapsulated inside a member catchment_data::get_area() (cf Law of Demeter), and then it can be made to cache it so we don't incur repeated lookup costs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance concern holds doubly so because catchment_data is a (should-be-const) member of this class, and so the area can be looked up once outside any timestep logic.

Comment on lines +136 to +142
LayerDescription description;
std::vector<std::string> processing_units;
Simulation_Time simulation_time;
feature_type& features;
geojson::GeoJSON catchment_data;
std::shared_ptr<pdm03_struct> pdm_et_data;
long output_time_index;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whichever of these members can be const ought to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not many of them can be constant although in most cases the accessors could be. Catchment_data probably could be but I would have to check to see if the GeoJSON class works correctly with a const pointer.

response *= (catchment_data->get_feature(id)->get_property("area_sqkm").as_real_number() * 1000000);
}
//TODO put this somewhere else as well, for now, an implicit assumption is that a modules get_response returns
//m/timestep
Copy link
Contributor

@PhilMiller PhilMiller Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use different temporary variables to reflect each step of dimension conversion and scaling, so that any one variable can be understood as having fixed dimension and units. If they're named accordingly, then comments like this will be much less load-bearing for someone like me trying to understand what the code is doing.

return next_timestep_index(current_date_time_epoch);
}

inline time_t next_timestep_epoch_time(int epoch_time_seconds){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time_t wasn't previously part of the interface of Simulation_Time other than what's passed in through the simulation_time_params constructor. Could we use the better specified std::chrono types and routines for this instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments that name a duration with a unit should just be std::chrono::duration, too

time_t get_current_epoc_time()
{
return current_date_time_epoch;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mark all methods that ought to be const as such, to ease understanding and guard against mistakes.

// layer.

// this is the time that layers are trying to reach (or get as close as possible)
auto next_time = manager->Simulation_Time_Object->next_timestep_epoch_time();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like manager->Simulation_Time_Object never has its current_date_time_epoch advanced in the revised code. That was (ludicrously) happening inside get_timestamp, which is no longer being called.

Is the current testing of this PR sufficient to see whether time is actually advancing appropriately, and the simulation state remains consistent with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be looked at with a multi time interval test. I dont think the standard tests include this.

@PhilMiller
Copy link
Contributor

I've gotten as much as I think I'm going to out of this PR in its present state. I think my last comment about advancing the simulation time stamp is of greatest concern.

}

std::cerr << "Requesting water from nexus, id = " << id << " at time = " <<current_time_index << ", percent = 100, destination = " << cat_id << std::endl;
double contribution_at_t = features.nexus_at(id)->get_downstream_flow(cat_id, current_time_index, 100.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double contribution_at_t = features.nexus_at(id)->get_downstream_flow(cat_id, current_time_index, 100.0);
// The call to `get_downstream_flow` updates the accounting of water flow out of the queried nexus - should be named something more like `withdraw_water`
double contribution_at_t = features.nexus_at(id)->get_downstream_flow(cat_id, current_time_index, 100.0);

@mattw-nws
Copy link
Contributor

mattw-nws commented Aug 17, 2023

I did a PR to rebase/bring multi_level_hydrofabric up to date with master. It's a bit much to look at. It may make more sense to force-push my rebase to your branch than to try to merge the PR, if we're satisfied that the changes are correct. However I also can't run the new multilayer tests because ./data/multilayer/* isn't in the branch...but I think everything works.

@mattw-nws mattw-nws mentioned this pull request Aug 18, 2023
11 tasks
@mattw-nws
Copy link
Contributor

Closing in favor of #615, though the history here is worth referring to.

@mattw-nws mattw-nws closed this Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants