CMS Part D back from the dead #368

Draft
wants to merge 1 commit into
base: main

jrlegrand
Member

Resolves #239

Explanation

Switched to the data.cms.gov API to look up the file name.
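
A rough sketch of what a file-name lookup could look like. The PR does not say which data.cms.gov endpoint is used, so this assumes a DCAT-style catalog (a top-level `dataset` list whose entries carry `title` and `distribution[].downloadURL`) that has already been fetched and parsed into a dict. The function names and the title fragment are hypothetical.

```python
from typing import Optional


def find_download_url(catalog: dict, title_fragment: str, suffix: str = ".zip") -> Optional[str]:
    """Return the first matching download URL, or None.

    Assumes a DCAT-style catalog layout; this is an assumption, not
    something confirmed by the PR.
    """
    needle = title_fragment.lower()
    for dataset in catalog.get("dataset", []):
        if needle not in dataset.get("title", "").lower():
            continue
        for dist in dataset.get("distribution", []):
            url = dist.get("downloadURL", "")
            if url.lower().endswith(suffix):
                return url
    return None


def file_name_from_url(url: str) -> str:
    """Take the last path segment of the URL as the file name."""
    return url.rsplit("/", 1)[-1]
```

Looking the name up at run time avoids hard-coding a file name that changes with every release.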

For some reason, zipfile works fine without needing zipfile_deflate64 now.
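
One possible explanation is that the current archive no longer uses Deflate64 compression, though that is a guess. A small check like the one below, using only the standard library, could confirm it by listing any members stored with compression method 9 (Deflate64), which stock `zipfile` cannot decompress. The helper name is hypothetical.

```python
import zipfile

# ZIP compression method id for Deflate64; the standard zipfile module
# has no decompressor for it.
DEFLATE64 = 9


def deflate64_members(zip_path_or_file) -> list:
    """Return the names of archive members compressed with Deflate64.

    Only the archive's directory is read, so this works even when the
    members themselves could not be extracted.
    """
    with zipfile.ZipFile(zip_path_or_file) as zf:
        return [info.filename for info in zf.infolist() if info.compress_type == DEFLATE64]
```

An empty result would mean plain `zipfile` is enough for that archive; a non-empty one would mean `zipfile_deflate64` is still needed.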

Found all sorts of other oddities that I will have to handle before this can be automated. Lightning round:

  • a random double space in the insulin beneficiary cost file
  • various files have lost a column since the last time I ran this
  • 2 files no longer exist since the last time I ran this
  • the zip file says 2025, but each interior zip file is named 2024 Q4, which broke my DAG, so I had to hard-code some stuff for now
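
A minimal sketch of how some of these quirks might be handled without hard-coding, assuming the interior names contain a year and quarter like `2024 Q4` and that the data files have a delimited header row. The `|` default delimiter, the helper names, and the sample names in the usage note are all hypothetical.

```python
import re

# Matches a year and quarter such as "2024 Q4" or "2024_Q4".
PERIOD_RE = re.compile(r"(?P<year>\d{4})[\s_-]*Q(?P<quarter>[1-4])", re.IGNORECASE)


def collapse_spaces(name: str) -> str:
    """Collapse runs of whitespace so a stray double space does not break matching."""
    return re.sub(r"\s+", " ", name).strip()


def parse_period(name: str):
    """Return (year, quarter) read from a file name, or None if absent."""
    match = PERIOD_RE.search(name)
    if match is None:
        return None
    return int(match["year"]), int(match["quarter"])


def missing_columns(header_line: str, expected: list, delimiter: str = "|") -> list:
    """List the expected columns that are absent from a header row (case-insensitive)."""
    actual = {collapse_spaces(col).lower() for col in header_line.split(delimiter)}
    return [col for col in expected if col.lower() not in actual]
```

For example, `parse_period("example 2024 Q4.zip")` yields `(2024, 4)` regardless of the year in the outer archive name, and `missing_columns` can flag a dropped column before the load step fails.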

Rationale

CMS Part D prescription plan data.

Tests

Ran DAG.

Extract (download) takes 1 min.
Unzip takes under 5 min.
The load tasks run in parallel, each using the Postgres COPY command:

  • almost all take under 10 seconds
  • pricing takes 3 mins
  • pharmacy networks took 9 mins (I cram all 6 files into one SQL load, which might be dumb) and then errored out because my old laptop is out of disk space
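
One alternative to a single combined load is one COPY per file, with a free-space check up front. This is only a sketch: it assumes a psycopg2-style cursor (anything with `copy_expert(sql, file)`), a trusted table name, and delimited text files with a header row. The function names and the `|` default delimiter are hypothetical, and this is not necessarily how the DAG does it.

```python
import shutil


def has_free_space(directory: str, needed_bytes: int) -> bool:
    """Check that the filesystem holding `directory` has at least `needed_bytes` free."""
    return shutil.disk_usage(directory).free >= needed_bytes


def copy_files(cursor, table: str, paths, delimiter: str = "|") -> int:
    """Stream each file into `table` with its own COPY and return the file count.

    `table` is interpolated into the SQL, so it must come from trusted code,
    never from user input.
    """
    sql = (
        f"COPY {table} FROM STDIN "
        f"WITH (FORMAT csv, HEADER true, DELIMITER '{delimiter}')"
    )
    loaded = 0
    for path in paths:
        with open(path, "r", encoding="utf-8") as handle:
            cursor.copy_expert(sql, handle)
        loaded += 1
    return loaded
```

Loading file by file means a failure partway through points at a specific file rather than at the whole batch.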

Successfully merging this pull request may close these issues.

PIP additional requirements throws Docker error