Added more processing logic to find institution name, applying credits, and to export HU and BU invoices #12

QuanMPhm · 2024-04-03T16:45:02Z

Closes #6. This is a draft PR that implements most of the functionality @joachimweyl asked for, including:

add_institution() to get the institution name for all PIs in the invoice
apply_credits_new_pi() to apply the New PI credit to all projects in the invoice
export_HU_only() and export_HU_BU() to export HU and BU specific invoices

A feature worth considering is exporting the CSVs to a storage service like Google Drive, which, per @knikolla's suggestion, should be tracked in a separate issue.

Aside from implementing the processing functions that apply credits and determine institutions, as well as the export functions, this PR makes some structural changes:

The invoice field names have been placed in top-level constants, providing a single convenient place to edit them should they change in the future
The mapping between institution email domains and institution names has been moved to the configuration file institute_map.json

joachimweyl · 2024-04-04T19:49:09Z

I don't know Python well enough to check your code but the logic is sound.

knikolla

Thanks for proposing this PR and the good work.

Some suggestions, besides the inline comments. This PR adds at least 3 new features to this codebase.

Deriving the institution name from a PI's username.
Exporting two additional versions of the output: HU only, and HU/BU.
The credits system.

There's not a lot of overlap between them, and in situations like this in the future I would encourage you to break them up into several PRs. In this case I would have gone with 3.

This makes the code easier to review. You don't need to break this one up, just a suggestion for the future.

QuanMPhm · 2024-04-09T20:23:18Z

The ordering of institute_map.json no longer matters, as we are now using exact matches instead of substring matches, per @naved001's suggestion.

knikolla · 2024-04-11T12:56:35Z

process_report/process_report.py

+
+def get_institution_from_pi(pi_uname):
+
+    dir_path = os.path.dirname(__file__)


When loading a file from a program, it is good practice to have the path be relative to the current working directory, not the location of the script.
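
A minimal sketch of what this could look like. The helper name load_json_from_cwd is hypothetical and not part of the PR; the point is only that a relative file name is resolved against the current working directory, so os.path.dirname(__file__) is not needed:

```python
import json
import os


def load_json_from_cwd(filename):
    # Hypothetical helper for illustration. os.path.abspath() resolves a
    # relative name against the current working directory, which is the
    # same resolution open() performs on a relative path.
    path = os.path.abspath(filename)
    with open(path, "r") as f:
        return json.load(f)
```

With this shape, running the script from the directory that holds institute_map.json picks up that copy, regardless of where the script itself is installed.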

knikolla · 2024-04-11T12:58:08Z

process_report/process_report.py

+###
+
+
+def get_institution_from_pi(pi_uname):


This function reads and parses the file on every call. Split it into two functions: load_institution_map() -> dict and get_institution_for_pi(pi_name) -> str.
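
A rough sketch of the suggested split. The map_path parameter, its default, and passing the loaded map into the lookup function are assumptions made here for illustration; the comment above only names the two functions:

```python
import json


def load_institution_map(map_path="institute_map.json"):
    # Read and parse the JSON file once; the caller keeps the dict.
    with open(map_path, "r") as f:
        return json.load(f)


def get_institution_for_pi(institute_map, pi_uname):
    # Pure lookup with no file I/O, so it is cheap to call per row.
    # Taking the map as an argument is an assumption of this sketch.
    domain = pi_uname.split("@")[-1]
    return institute_map.get(domain, "")
```

The caller would load the map once before iterating over the invoice and reuse the same dict for every PI.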

knikolla · 2024-04-11T12:59:00Z

process_report/process_report.py

    filtered_dataframe.to_csv(output_file, index=False)

+
+def validate_billables(dataframe):
+    # Validate PI name


This comment would be unnecessary if the function were named for what it does, which is validating the PI name. I know you are writing it in preparation for validating more things, but for now that is the only thing it validates.

knikolla · 2024-04-11T13:00:42Z

process_report/process_report.py

+    for i, row in dataframe.iterrows():
+        pi_name = row[PI_FIELD]
+        if pandas.isna(pi_name): 
+            print(f"Project {row[PROJECT_FIELD]} has no PI") # Nan check


This comment is unnecessary because the function's name, isna, already says the same thing.

knikolla · 2024-04-11T13:06:59Z

process_report/process_report.py

+    with open(f'{dir_path}/institute_map.json', 'r') as f:
+        institute_map = json.load(f)
+
+    if '@' in pi_uname:


This can be rewritten as follows, since split works whether or not the string contains an @, and we always want the last entry in the list, accessible with index -1.

institution_key = pi_uname.split('@')[-1]
institution = institute_map.get(institution_key, '')
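
To illustrate why no explicit '@' check is needed, here is a small self-contained sketch. The usernames and map contents are made up for the example, and lookup is a hypothetical wrapper name:

```python
# Hypothetical map contents, for illustration only.
institute_map = {"bu.edu": "Boston University"}


def lookup(pi_uname):
    # str.split returns a one-element list when the separator is absent,
    # so [-1] is safe for usernames with or without an '@'.
    institution_key = pi_uname.split("@")[-1]
    # dict.get returns the default ('') instead of raising KeyError.
    return institute_map.get(institution_key, "")
```

Here lookup("pi@bu.edu") and lookup("bu.edu") resolve to the same entry, while an unknown domain falls back to the empty string.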

knikolla · 2024-04-11T15:58:03Z

process_report/process_report.py

+
+
+def is_old_pi(old_pi_dict, pi, invoice_month):
+    if pi in old_pi_dict and old_pi_dict[pi] != invoice_month: 


This will not work if you are processing old invoices. Say you want to process the invoice for 2024-02, which is 2 months ago. A PI whose first invoice is 2024-03 is going to appear as an old PI.

I'm okay with fixing this in a follow-up PR while introducing a mechanism to update the PI file.
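
One possible shape for such a fix, sketched under the assumption that months are zero-padded YYYY-MM strings (as in the test data, e.g. 2023-09 and 2024-03), so that plain string comparison matches chronological order. This is an illustration only, not the change made in the follow-up:

```python
def is_old_pi(old_pi_dict, pi, invoice_month):
    # A PI counts as "old" only if their first recorded invoice month is
    # strictly earlier than the month being processed. Zero-padded
    # "YYYY-MM" strings compare chronologically as plain strings.
    first_month = old_pi_dict.get(pi)
    return first_month is not None and first_month < invoice_month
```

With this comparison, a PI first invoiced in 2024-03 is not treated as old when the 2024-02 invoice is reprocessed.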

knikolla

This looks okay, though I'd like @naved001 to review as well before merging in case I missed something.

naved001 · 2024-04-11T17:12:21Z

process_report/tests/unit_tests.py

+        self.dataframe = pandas.DataFrame(data)
+        old_pi = ['PI2,2023-09', 'PI3,2024-02', 'PI4,2024-03'] # Case with old and new pi in pi file
+        old_pi_file = tempfile.NamedTemporaryFile(delete=False, mode='w', suffix='.csv')
+        for pi in old_pi: old_pi_file.write(pi + "\n")


please write this in 2 lines

@QuanMPhm after this is addressed you can merge this PR!
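
For reference, the two-line form of that loop would look roughly like this. The surrounding setup is copied from the hunk above; closing the file at the end is an addition in this sketch so the contents are flushed to disk:

```python
import tempfile

# Case with old and new PIs in the PI file
old_pi = ["PI2,2023-09", "PI3,2024-02", "PI4,2024-03"]
old_pi_file = tempfile.NamedTemporaryFile(delete=False, mode="w", suffix=".csv")
for pi in old_pi:
    old_pi_file.write(pi + "\n")
old_pi_file.close()  # added here so the data is flushed before reading
```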

QuanMPhm requested review from knikolla, naved001 and joachimweyl April 3, 2024 16:45

QuanMPhm force-pushed the 6/more_processing branch 2 times, most recently from 9bc423c to f23dd4e Compare April 4, 2024 14:21

QuanMPhm marked this pull request as ready for review April 4, 2024 18:17

QuanMPhm mentioned this pull request Apr 4, 2024

Added processing and exporting for Lenovo SU Types #14

Merged

joachimweyl reviewed Apr 4, 2024

process_report/process_report.py Outdated

joachimweyl reviewed Apr 4, 2024

process_report/process_report.py

joachimweyl reviewed Apr 4, 2024

process_report/process_report.py Outdated

joachimweyl reviewed Apr 4, 2024

process_report/tests/unit_tests.py Outdated

joachimweyl reviewed Apr 4, 2024

process_report/tests/unit_tests.py Outdated

joachimweyl reviewed Apr 4, 2024

process_report/tests/unit_tests.py

joachimweyl reviewed Apr 4, 2024

process_report/process_report.py Outdated

QuanMPhm force-pushed the 6/more_processing branch 2 times, most recently from 22170f8 to 3f135bd Compare April 4, 2024 21:03

knikolla requested changes Apr 8, 2024

QuanMPhm force-pushed the 6/more_processing branch 3 times, most recently from 2f53b15 to 916549d Compare April 9, 2024 20:21

knikolla requested changes Apr 9, 2024

process_report/process_report.py Outdated

process_report/process_report.py Outdated

process_report/process_report.py Outdated

QuanMPhm force-pushed the 6/more_processing branch 2 times, most recently from af098d6 to e233a3e Compare April 10, 2024 15:47

knikolla requested changes Apr 11, 2024

QuanMPhm force-pushed the 6/more_processing branch 2 times, most recently from 37e6a45 to 49400f9 Compare April 11, 2024 14:56

knikolla reviewed Apr 11, 2024

knikolla approved these changes Apr 11, 2024

naved001 reviewed Apr 11, 2024

Added processing to apply project credits, determine institution name…

c5ff060

… for each PI, and exporting HU and BU invoices

QuanMPhm force-pushed the 6/more_processing branch from 49400f9 to c5ff060 Compare April 11, 2024 18:51

QuanMPhm merged commit 7ba0ee0 into CCI-MOC:main Apr 11, 2024
1 check passed

knikolla mentioned this pull request May 2, 2024

Track usage of credit for a second month #37

Closed

2 tasks
