Refactor iati_transactions code into a general-purpose reload function that could create multiple tables #772

Open
akmiller01 opened this issue Jul 19, 2022 · 2 comments

Comments

@akmiller01
Contributor

At present we have the following data update scripts:

  1. iati.sh
  2. iati_datastore.sh
  3. iati_registry_refresh.sh
  4. iati_transactions.sh
  5. iati_transactions_retry.sh

We should clean up, simplify, and refactor these scripts such that they:

  1. Run Python/iati_refresh.py, marking IATI datasets as either new, modified, or stale in the iati_registry_metadata table.
  2. Run a new script based on Python/iati_transactions.py that can update multiple IATI-based tables, one module per table, at the moment new data is loaded (following the existing IATI script naming, this would be iati_reload.py).
  3. Run auxiliary scripts such as Python/iati_rhfp.py which rely on the entire data structure and don't need to be run strictly on new/modified files.

The idea is that, instead of running iati_transactions.sh once, at which point the information about whether a dataset is new is destroyed, any number of tables can be built progressively during the reload process, just as the table produced by iati_transactions.py is built today.
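A rough sketch of what such a reload loop might look like, assuming each target table is produced by its own builder function. All names here (`Dataset`, `register_table`, `reload`, and the `iati_activities` table) are hypothetical and for illustration only; they are not taken from the existing scripts, and the database writes and the handling of stale datasets are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Dataset:
    """Hypothetical stand-in for a row in iati_registry_metadata plus its parsed content."""
    dataset_id: str
    status: str  # e.g. "new", "modified", "stale", or "unchanged" (assumed values)
    activities: List[dict]


# Maps a table name to the function that builds that table's rows for one dataset.
TableBuilder = Callable[[Dataset], List[dict]]
TABLE_BUILDERS: Dict[str, TableBuilder] = {}


def register_table(name: str) -> Callable[[TableBuilder], TableBuilder]:
    """Decorator that adds a builder to the registry under the given table name."""
    def decorator(func: TableBuilder) -> TableBuilder:
        TABLE_BUILDERS[name] = func
        return func
    return decorator


@register_table("iati_transactions")
def build_transactions(dataset: Dataset) -> List[dict]:
    """One row per transaction found in the dataset's activities."""
    rows = []
    for activity in dataset.activities:
        for transaction in activity.get("transactions", []):
            rows.append({
                "dataset_id": dataset.dataset_id,
                "activity_id": activity["id"],
                "value": transaction["value"],
            })
    return rows


@register_table("iati_activities")  # hypothetical second table
def build_activities(dataset: Dataset) -> List[dict]:
    """One row per activity in the dataset."""
    return [
        {"dataset_id": dataset.dataset_id, "activity_id": activity["id"]}
        for activity in dataset.activities
    ]


def reload(datasets: List[Dataset]) -> Dict[str, List[dict]]:
    """Run every registered builder over each new or modified dataset.

    The new/modified status is read once per dataset and shared by all
    builders, so adding a table does not require a separate pass.
    """
    output: Dict[str, List[dict]] = {name: [] for name in TABLE_BUILDERS}
    for dataset in datasets:
        if dataset.status not in ("new", "modified"):
            continue
        for name, builder in TABLE_BUILDERS.items():
            output[name].extend(builder(dataset))
    return output
```

With this shape, adding another table would mean registering one more builder function, and the status flags would only need to be cleared after every builder has run.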

@akmiller01 akmiller01 added the ETL label Jul 19, 2022
@wakibi
Contributor

wakibi commented Jul 21, 2022

Will depend on #773

@stale

stale bot commented Sep 21, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Sep 21, 2022
@stale stale bot closed this as completed Oct 1, 2022
@edwinmp edwinmp added pinned and removed wontfix This will not be worked on labels Oct 3, 2022
@edwinmp edwinmp reopened this Oct 3, 2022