[EPIC] Fixes, Enhancements and Improvements for CatFIM #1182
Some of these items were fixed in PR 1165, which is the rebuild of CatFIM. However, this issue will remain open because there is more to cover, with items at different importance levels.
Notes from 9/11/24 meeting:
1 & 2: These issues will probably go away when we move CatFIM to AWS.
4: Should be quick.
5: A simple reorganization task.
6: A larger reorganization task, but will probably not be needed when we move CatFIM to AWS.
7 & 8: Need to make an issue; we have decided to maintain these functions. It would be good to clarify their function in the docs and add comments in the code.
9 & 10: Choose one because they're very similar. I think we will choose #10 because it's closer to finished.
11: Do it!
12: More of a rewrite task, lower priority.
13: Important! But it doesn't need to be its own issue. Rather, this documentation is something that we can and should be fixing as we go.
14: Big! Out-of-date hard coding could be causing a lot of the issues noted in the subsequent bullet points.
A note from Carson: It would be good to add "and" statements to make sure that the known issue is true before applying a hard-coded fix. Pseudocode example:
It would be good to add the AHPS restricted sites list to the catfim folder of our online inundation-mapping repo so that the field can check and provide feedback on whether certain sites can be removed from the list. It would be good to sort the LIDs by RFC and then in alphabetical order.
15: New issue: Make a CatFIM folder to store the CatFIM
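Carson's suggestion about "and" statements could look something like the minimal sketch below. The site ID, field name, and values are hypothetical placeholders, not real CatFIM overrides, and the helper names are invented for illustration. The only point is that the hard-coded fix is applied when the known issue is still present in the incoming data, and skipped otherwise.

```python
# Hypothetical example: apply a hard-coded override only while the known
# issue is still true in the data. All names and values are placeholders.

KNOWN_FIXES = {
    # lid: (field, known_bad_value, corrected_value)
    "abcd1": ("datum", -9999.0, 123.4),
}


def apply_known_fix(lid, record):
    """Return a copy of record with the hard-coded fix applied if still needed."""
    fixed = dict(record)
    if lid in KNOWN_FIXES:
        field, bad_value, good_value = KNOWN_FIXES[lid]
        # The "and" check: the site is on the list AND the bad value is still there.
        if fixed.get(field) == bad_value:
            fixed[field] = good_value
    return fixed
```

With this shape, a site whose upstream data has since been corrected passes through untouched, so a stale entry in the fix list does no harm.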
For |
This might be something we tackle after addressing all of this technical debt, but I wanted to track it here since it's a CatFIM task... It's been suggested that we map Stage-Based CatFIM for sites that are not forecast sites. There are some AHPS points that are observation-only but still have flood thresholds that we could map. Even more sites for Stage-based (yay!).
Added separate card for future enhancement: 1308 Add lake and levee protected area masking
Previous version of this card before it was reorganized on 12/4/24:
Examples of how huc_messages turn into status and end up in the UI.
Note from Derek:
27). [Issue 1353](https://github.com//issues/1353) Ensure CatFIM key column names are consistent.
Archive - Completed or Tabled in CatFIM v2.1 :
Note from Derek:
This is an EPIC card. As items from this list are addressed, their active cards will be linked. For issues addressed in CatFIM 2.1, see the CatFIM 2.1 Archive comment.
Current CatFIM Fixes and Upgrades - CatFIM v2.2
(Lower priority, should be fixed by AWS shift) Improve efficiency for calling the WRDS API. The get_thresholds function in generate_categorical_fim_flows.py calls WRDS to get data for an nws_lid, and it passes in only LID data, not HUC data. However, our process_generate_flow function calls get_thresholds for every HUC and every AHPS site, which is a massive amount of duplication. We could instead call WRDS once for each AHPS site, or even for all AHPS sites via the specific WRDS API call, save the result to disk, and use it from there. We can likely even load it in certain parts of the code and pass that dataframe around to other functions, including into multi-proc. This would be a huge performance gain.
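As a rough illustration of the "call once, save to disk, reuse" idea, here is a minimal sketch. It does not use the real WRDS API: fetch_fn is a stand-in for whatever function actually queries WRDS, and load_thresholds and the JSON cache file are assumptions made for this example.

```python
import json
from pathlib import Path


def load_thresholds(lids, cache_file, fetch_fn):
    """Return {lid: thresholds}, calling fetch_fn only for LIDs not yet cached.

    fetch_fn is a placeholder for the real WRDS query (hypothetical here).
    """
    cache_path = Path(cache_file)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}

    # dict.fromkeys drops duplicate LIDs while keeping their original order.
    missing = [lid for lid in dict.fromkeys(lids) if lid not in cache]
    for lid in missing:
        cache[lid] = fetch_fn(lid)

    # Only rewrite the cache file when something new was fetched.
    if missing:
        cache_path.write_text(json.dumps(cache, indent=2))

    return {lid: cache[lid] for lid in lids}
```

The returned dictionary could be built once up front and handed to each worker, so that no HUC-level process needs to make its own web calls.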
(Lower priority, might be fixed by AWS shift) Facilitate a pre-download option for the threshold files (similar to what we've done with the metadata files). We now have the ability to save the metadata file from WRDS in generate_categorical_fim_flows.py, but can we make a similar system for the threshold files so we can run CatFIM tools in a non-OWP server environment? This assumes that the fix has not yet been made to let AWS call the WRDS API. Can we just get a new DB or something we can put on a file server instead of making 5,000+ web API calls, which is a huge amount of overhead?
(Lower priority) Review and maybe upgrade the huc_messages system. The system creates a file per HUC/LID which says whether the LID was loaded or whether it failed and why. It later reloads all of the files in the "huc_messages" folder and maps them to each record for the output CSV, with some saying why they failed. There are huge opportunities for improvement here, but it does work, so it may not be worth the effort to fix. The current setup is "multi-proc" safe.
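For anyone reviewing the huc_messages system, the general pattern it follows can be sketched as below. This is not the actual CatFIM code: the function names and the {huc}_{lid}.txt file naming are assumptions for illustration. The reason the pattern is multi-proc safe is that each worker writes only its own file, and the files are merged in a single pass afterwards.

```python
from pathlib import Path


def write_message(messages_dir, huc, lid, status):
    """Write one small file per HUC/LID so parallel workers never share a file."""
    messages_dir = Path(messages_dir)
    messages_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one file per HUC/LID pair.
    (messages_dir / f"{huc}_{lid}.txt").write_text(status)


def collect_messages(messages_dir):
    """Reload every message file into {lid: [status, ...]} for the output CSV step."""
    statuses = {}
    for path in sorted(Path(messages_dir).glob("*.txt")):
        _huc, lid = path.stem.split("_", 1)
        statuses.setdefault(lid, []).append(path.read_text())
    return statuses
```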
(Partially completed, Issue 1286) Add an argument to remove intermediate extent files. The mapping/{huc}/{lid} folders hold a bunch of intermediate inundation extent files. Inundation is done for each branch, then merged into a single extent file, {huc}/{lid}/{lid}_{magnitude}_extent.tif. We need to keep the final extent file per magnitude, but not all of the intermediate branch files, and there are A LOT of them. Note: that system is now in, but the deletion code has been commented out for a version or two for tracing reasons. Maybe uncomment it in the next HV 2.1.8 release (in a few months). Update: Since the commented-out deletion code is already there, just make it driven by an argument to delete or not (default: delete). Keeping the files can help with debugging during dev, and most of the code is there, so this should be easy to finish.
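One possible shape for the delete-or-keep argument is sketched here. The flag name --keep-intermediate and both function names are hypothetical and not part of the existing scripts; the sketch only shows a default of deleting everything except the final extent files.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser with a hypothetical flag for keeping intermediate files."""
    parser = argparse.ArgumentParser(description="Sketch of an intermediate-file cleanup flag")
    # Omitting the flag means intermediate files are deleted (the default).
    parser.add_argument(
        "--keep-intermediate",
        action="store_true",
        help="Keep per-branch extent files for debugging instead of deleting them.",
    )
    return parser


def clean_intermediate(lid_dir, final_files, keep_intermediate=False):
    """Delete every .tif in lid_dir that is not a final extent file.

    Returns the names of the files that were removed.
    """
    removed = []
    if keep_intermediate:
        return removed
    keep = set(final_files)
    for tif in sorted(Path(lid_dir).glob("*.tif")):
        if tif.name not in keep:
            tif.unlink()
            removed.append(tif.name)
    return removed
```

Passing the parsed keep_intermediate value straight through to the cleanup step keeps the debugging path one flag away without changing normal runs.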
(Architecture overhaul) Fix duplication of processing paths for stage- and flow-based CatFIM. There is a lot of duplication of processing between stage- and flow-based as each calls functions in the code. This would be an architectural overhaul to simplify the workflow. The fix will be to implement the is_stage_based flag throughout the code.
(Architecture overhaul, do at same time as 5) Adjust and review the multi-proc job number arguments. We have three job numbers, but we have some parts that are MP, inside MP, inside MP, e.g. process_generate_categorical_fim -> (MP) iterate_through_huc_stage_based -> (MP) produce_stage_based_catfim_tifs -> (MP) produce_inundation_map_with_stage_and_feature_ids. This may not be the most efficient use of jobs or even MP, and it needs to be reviewed. There are others, such as post_process_cat_fim_for_viz -> (MP) post_process_huc_level -> (MP) reformat_inundation_maps.
(Lower priority, Issue 1283) Test whether the "past_major_interval_cap" and the "search" systems work, and repair if needed (arguments -mc and -s).
(Lower priority) Test and repair command line functionality of the generate_categorical_fim_mapping.py and generate_catfim_flow.py scripts. Rob rebuilt generate_categorical_fim_mapping.py part way through the rebuild but did not come back to finish it. The generate_catfim_flow.py script is definitely broken.
(Issue 1275) CatFIM stage site issues - jrsu1 and okfi2 (and other misc sites) in FIM 4.5.2.11
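For the multi-proc job number review above, one thing worth checking is how the job numbers combine. If every worker at one level opens its own pool at the next level, the worst-case number of leaf worker processes is the product of the job numbers. The helper names below are hypothetical, and the sketch makes no assumption about which pool implementation the scripts use.

```python
def total_processes(job_numbers):
    """Worst-case count of leaf worker processes when pools are nested.

    job_numbers lists the pool size at each nesting level, outermost first.
    """
    total = 1
    for jobs in job_numbers:
        if jobs < 1:
            raise ValueError("job numbers must be >= 1")
        total *= jobs
    return total


def fits_cpu_budget(job_numbers, cpu_count):
    """True if the nested pools cannot exceed the available CPU count."""
    return total_processes(job_numbers) <= cpu_count
```

For example, three nested levels run with 4, 3, and 2 jobs can reach 24 leaf processes, which would oversubscribe a 16-core machine.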
(Issue 1353) Ensure CatFIM key column names are consistent.
(Issue 1359, Priority) Facilitate CatFIM visualization for developers. Create and upload layer symbology presets. Add symbology notes to the CatFIM README. Assemble a CatFIM Vis Jupyter notebook.
(Issue 1356) Redo point filtering for non-CONUS regions (AK, PR, and HI). We should loosen filters to include points that are not NWM forecast points for these areas, but that will require us to adjust our point filtering downstream to prevent duplicate LIDs.
(Issue 1390) Troubleshoot CatFIM performance in Puerto Rico. Currently there is zero stage-based CatFIM in PR. Is there a location-specific issue that is preventing stage-based CatFIM from being produced in PR?
(High priority, Issue 1344) Fix overly short Alaska CatFIM (which is being caused by a lack of upstream/downstream IDs for Alaska in the WRDS API).
(Issue 1358) Add lake masks to stage-based CatFIM processing.
Flooding cuts off just downstream of site TOAP4 (Puerto Rico), inundating a stream line that is much shorter than expected.
(Issue 1384 ) Update CatFIM to accommodate sites where the flood levels are elevations, not stages. At site prdk2 in Paradise, KY, the observations, forecasts, and flood thresholds are elevations rather than stage values. This means that stage-based CatFIM is inundating that site up to 378 ft for 'Action' stage, which is creating an inundated area over 10 miles wide. Write a workaround into the stage-based CatFIM code to detect and mitigate this issue.
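A first pass at the detection half of this workaround might be a simple plausibility check, sketched below. The function name and the 100 ft cutoff are assumptions for illustration only, not validated limits; a real fix would need a cutoff agreed with the team, or a comparison against the site's datum.

```python
def looks_like_elevation(thresholds, max_plausible_stage_ft=100.0):
    """Flag threshold sets whose values are too large to plausibly be stages.

    thresholds maps category name to a value in feet (None if missing).
    The 100 ft default is an arbitrary placeholder, not a validated limit.
    """
    values = [v for v in thresholds.values() if v is not None]
    return any(v > max_plausible_stage_ft for v in values)
```

A site reporting a 378 ft 'Action' value, as described for prdk2, would be flagged by this check and could then be skipped or routed to special handling instead of being mapped as a stage.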
Remove remaining references to the LID selection functionality, because that has been officially replaced by the HUC list functionality.
(Issue 1396) Create and implement a restricted sites list for flow-based CatFIM. (From VLAB ticket #141798).
Add new issues here using (Priority tag, Issue _ )