Jon colorado data (boulder) #2
Some issues need to be addressed before merging this in:
--fields for politician table
----Where is the id field coming from?
IDs are generated on INSERT for pretty much every table (they are UUIDs, which aren't sequential or dependent on one another).
Got it. That has been confusing me (and is why I had a few fields commented out). If the IDs are generated on insert, they don't exist in my table yet, so I won't be able to add them to the select, right?
Correct, they aren't needed for these staging SELECTs
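To illustrate the point about IDs, here is a minimal sketch in PostgreSQL. The table and column names (`demo_politician`, `demo_staging`, `full_name`, `email`) are hypothetical stand-ins rather than the project's real schema, and it assumes the `id` column carries a UUID default such as `gen_random_uuid()` (built in from PostgreSQL 13).

```sql
-- Hypothetical target table: id has a UUID default, so it is filled in at INSERT time.
CREATE TEMP TABLE demo_politician (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    full_name text NOT NULL,
    email text
);

-- Hypothetical staging data: there is no id column here at all.
CREATE TEMP TABLE demo_staging (
    full_name text,
    email text
);

INSERT INTO demo_staging (full_name, email)
VALUES ('Jane Example', 'jane@example.com');

-- The staging SELECT lists only the business columns.
-- Each inserted row receives its own independently generated id.
INSERT INTO demo_politician (full_name, email)
SELECT
    full_name,
    email
FROM demo_staging
RETURNING id, full_name;
```

The `RETURNING` clause is only there to show that the id exists once the row has been inserted; the staging model itself never has to mention it.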
populist.code-workspace
Add this file to .gitignore.
31de5e9 to 88987fd
I created the boulder_updated_filings table from boulder_updated_filings.csv; that should be the only file needed to run the models.
ELSE 'district'
END AS election_scope,
CASE
WHEN office ILIKE '%Mayor%' THEN 'Mayor'
I'm not sure we want to set seat as "Mayor" in this case.
You'll also need to join this model to our existing public.politician table so that we can deduplicate politicians and insert the race_candidate records properly. Look at what I did in the MN intermediate model.
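As a rough sketch of that deduplication join (not the MN model itself, which isn't shown here): the tables below are hypothetical stand-ins for `public.politician` and the Boulder filings, and matching on a lowercased email is an assumption about what identifies a politician.

```sql
-- Hypothetical stand-in for the existing public.politician table.
CREATE TEMP TABLE demo_existing_politician (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    full_name text,
    email text
);

-- Hypothetical stand-in for the Boulder filings model.
CREATE TEMP TABLE demo_boulder_filings (
    full_name text,
    email text
);

INSERT INTO demo_existing_politician (full_name, email)
VALUES ('Jane Example', 'jane@example.com');

INSERT INTO demo_boulder_filings (full_name, email)
VALUES
    ('Jane Example', 'JANE@example.com'),
    ('Sam Sample', 'sam@example.com');

-- LEFT JOIN keeps every filing. politician_id is populated when the
-- candidate already exists and NULL when the candidate is new.
SELECT
    f.full_name,
    f.email,
    p.id AS politician_id,
    p.id IS NULL AS is_new_politician
FROM demo_boulder_filings AS f
LEFT JOIN demo_existing_politician AS p
    ON LOWER(f.email) = LOWER(p.email);
```

Rows with a non-null `politician_id` can reuse the existing record when building race_candidate rows, while rows flagged as new are the only ones that need a politician insert.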
LEFT JOIN transformed_filings AS tf ON f.email = tf.email
LEFT JOIN transformed_filings_1 AS tf1 ON tf.email = tf1.email
I don't think these joins are needed, as you can select from either of the CTEs in this final SELECT statement to get exactly what you need.
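A small sketch of the shape being suggested, assuming each CTE carries forward the columns of the one before it. The inline `VALUES` rows and the `email_normalized` column are made up for illustration; only the CTE name `transformed_filings` comes from the model under review.

```sql
WITH filings AS (
    -- Hypothetical inline rows standing in for the source filings table.
    SELECT *
    FROM (
        VALUES
            ('Jane Example', 'jane@example.com', 'Mayor'),
            ('Sam Sample', 'Sam@Example.com', 'City Council')
    ) AS v (candidate_name, email, office)
),

transformed_filings AS (
    -- The CTE already carries every source column plus the derived one.
    SELECT
        candidate_name,
        email,
        office,
        LOWER(email) AS email_normalized
    FROM filings
)

-- No join back to filings is needed: the CTE has everything the final SELECT uses.
SELECT
    candidate_name,
    office,
    email_normalized
FROM transformed_filings;
```

Joining a CTE back to its own source on email returns the same rows at best, and can duplicate rows if an email appears more than once.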
I've pushed up a bunch of changes and dbt run is looking good so far! Let's get this merged, then we can get the data into the public schema. You can click into my commits above to see what changes were needed to get this working.
I first created an intermediate table where I did most of the transformations, then created tables for Boulder races, offices, politicians, and race candidates. I used the existing tables as a guide, and the new ones appear to almost match. The one area where I see differences is the ID fields. I left those fields in the models, but commented out.
I was under the impression those uuid fields are generated when we insert the rows into the tables, so maybe they just need to be inserted in a specific order so that the first uuid field is generated, and the rest of the uuid fields are based on that?
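On insert order, one way to picture it (a sketch under assumptions, since the real schema isn't shown in this PR): each UUID is generated independently, but a row that references another table has to be inserted after the row it points to, and it looks up that row's id with a join. All table and column names below are hypothetical.

```sql
CREATE TEMP TABLE demo_politician (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    email text UNIQUE NOT NULL
);

CREATE TEMP TABLE demo_race (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    title text UNIQUE NOT NULL
);

-- Hypothetical link table: it stores ids generated by the two tables above.
CREATE TEMP TABLE demo_race_candidate (
    race_id uuid NOT NULL REFERENCES demo_race (id),
    candidate_id uuid NOT NULL REFERENCES demo_politician (id),
    PRIMARY KEY (race_id, candidate_id)
);

-- Step 1: insert the parent rows; their ids are generated here.
INSERT INTO demo_politician (email) VALUES ('jane@example.com');
INSERT INTO demo_race (title) VALUES ('Example Mayor Race');

-- Step 2: insert the link rows, looking up the generated ids by natural keys.
INSERT INTO demo_race_candidate (race_id, candidate_id)
SELECT
    r.id,
    p.id
FROM (
    VALUES ('Example Mayor Race', 'jane@example.com')
) AS s (race_title, email)
INNER JOIN demo_race AS r ON r.title = s.race_title
INNER JOIN demo_politician AS p ON p.email = s.email;
```

So the ids are not derived from one another; the order only matters because the link table needs the parent ids to exist before it can reference them.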
I commented out the insert statements.
I used SQLFluff to lint the models.