Updates and correspondence
We are keeping a log of the updates and changes that have been made to process/inputs/outputs. Please pay attention to the dates - if you are starting after the date heading, you can (most likely) disregard the information below it because the updates will have already been made to the template files and the documentation.
The PUMS/OSPI indicators are being updated to 2022. This has involved developing a workflow to replace existing files and visuals with the updated ones. As part of this process the template data-gen and vis scripts were revised and are available in GitHub.
Notes about updating these indicators are available in this wiki: Documentation > PUMS update process.
Running these scripts is similar to before, with some small differences. Analysts new to this process should still review the background information on the two scripts.
- The data-gen script is shorter and more efficient because it incorporates more functional programming principles. Warning messages and iterative code were added to the data review phase to make it easier for analysts to QC the data.
- The vis script includes additional code chunks at the end that need to be run (after knitting). These code chunks help to save/transfer/clean the GitHub and network folders so that the older data and outputs are saved in an archive folder, while the new, updated data and outputs are saved in the general folders. This workflow was developed in order to ensure that the webpages could continue to pull the visuals without any adjustments to the address paths.
In both scripts, code that is common across all indicators is saved in, and referenced from, a supplemental script.
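The archive step at the end of the vis script can be pictured with a minimal sketch like the one below. It is not the code from the template: the folder layout, the file name `map.html`, and the `-2021` suffix are all assumptions made for illustration, and everything runs inside a temporary directory. The point is that the updated output keeps the original file name, so existing address paths keep working.

```r
# Hypothetical folder layout, created in a temporary directory for illustration
base_dir    <- file.path(tempdir(), "indicator-outputs")
archive_dir <- file.path(base_dir, "archive")
dir.create(archive_dir, recursive = TRUE, showWarnings = FALSE)

# Pretend an older visual already exists in the general folder
current_file <- file.path(base_dir, "map.html")
writeLines("<html>old map</html>", current_file)

# 1. Move the older output into the archive folder, tagged with its data year
#    (the "-2021" suffix is an assumed naming convention)
archived_file <- file.path(archive_dir, "map-2021.html")
file.rename(current_file, archived_file)

# 2. Save the updated output under the original name so existing links still resolve
writeLines("<html>updated map</html>", current_file)
```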
There have been a series of revisions to the visuals and webpages before the soft launch. These are noted in the master spreadsheet tab "details to check" and the corresponding code edits have been made to the two template vis files. All of the vis scripts for the soft-launch indicators should be updated to reflect these adjustments. There are some edits that are specific for the webpages as well, which will be described below:
General changes:
- reorganization of themes/indicators (after extensive conversation with planning colleagues) - folder structure still reflects original organization
- regional collaboration and health > regional collaboration
- development patterns -> communities & health
- voter participation (public services > regional collaboration)
- imprisonment (regional collaboration and health > public services)
- cardiovascular disease mortality, life expectancy, health insurance, snap participation (regional collaboration and health > public services)
- simplifying indicator names (Median Household Income -> Household Income) - removing technical terms
- consistent source info/format, as described in Y:\Equity Indicators\tracker-webpage-content\source examples.docx
- removing source info from visuals - this required commenting out lines 94-96 in the equity-tracker-chart-creator.Rmd
- revisions to indicator webpage section headings and chart titles/subtitles, simplified map legend format
- visual color changes (map: purple, column chart: orange, line chart: blue)
In the vis scripts - listing specific changes to lines of code:
person-based (vis-pums-template.Rmd):
- map legend title (line 202) - simplifying indicator name
- map source info (line 268) - including new source format with table number
- column chart color (line 351) - orange
- column chart title/subtitle/source (lines 354-356) - new, more consistent formatting
- column chart source (line 395) - removing text from visual (will be on webpage instead)
- line chart color (line 525) - blue
- line chart title/subtitle/source (lines 526-528) - new, more consistent formatting
- line chart source (line 570) - removing text from visual (will be on webpage instead)
place-based (vis-tract-template.Rmd):
- map source info (line 270) - including new source format with table number
- column chart color (line 350) - orange
- column chart title/subtitle/source (lines 353-355) - new, more consistent formatting
- column chart source (line 394) - removing text from visual (will be on webpage instead)
- line chart color (line 562) - blue
- line chart title/subtitle/source (lines 563-565) - new, more consistent formatting
- line chart sources (line 607) - removing text from visual (will be on webpage instead)
On the webpages:
equity landing page
- reorder theme icons - alphabetical
theme pages
- remove photo
- reorder related indicator icons - alphabetical
- revisions to the resources section - less circular PSRC references
indicator pages
- links to external data sources (depending on data source) in intro text
- section links for navigating below intro and below each section
- renamed sections: ...Map, ...Now, ...Trend
- insights & analysis sections distinguished in green
- source(s) information moved from charts to webpage HTML
- more standardized note language below charts (depending on data source - describing quintiles, missing data)
In vis-tract-template.Rmd, we added a chunk of code at lines 516-560, before the "Create Facet Line Chart" code chunk. This ensures that the x-axis years are consistent across all facets, regardless of whether data are available. For example, data for people with a disability are not available for 2010, and data for households with limited English proficiency are not available for 2010 and 2015.
In addition to adding this code, we edited line 566 to ensure that the data frame referenced in the visual code is the one with the NULL values for the missing years. For more information, see here.
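As a rough illustration of the idea (not the chunk from the template), the sketch below pads a small trend data set so that every equity group has a row for every year. The data frame, its column names, and its values are made up, and base R is used so that it runs without extra packages. Missing values appear as `NA` in R, which is what the NULL values described above become once the data are in a data frame.

```r
# Hypothetical trend data: the disability group has no 2010 value
trend <- data.frame(
  equity_group = c("Income", "Income", "Income", "Disability", "Disability"),
  data_year    = c(2010, 2015, 2020, 2015, 2020),
  estimate     = c(78.1, 79.0, 79.4, 77.2, 77.9)
)

# Every combination of equity group and year that should appear on the x-axis
all_combos <- expand.grid(
  equity_group = unique(trend$equity_group),
  data_year    = sort(unique(trend$data_year)),
  stringsAsFactors = FALSE
)

# Left join: combinations without data keep a row, with estimate = NA
trend_full <- merge(all_combos, trend,
                    by = c("equity_group", "data_year"), all.x = TRUE)
trend_full <- trend_full[order(trend_full$equity_group, trend_full$data_year), ]
```

The padded data frame (`trend_full` here) is the one the chart code would then reference, so each facet is drawn against the same set of years.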
In data-gen-tract-template.Rmd, an additional chunk of code has been added at lines 947-996. It is labeled with the subheading Address "Low - Low Medium" quintile group and was added because of a revision to the equity tracts view (equity.v_tract_shares) in Elmer. We noticed that for Kitsap LEP (and also Pierce County LEP), the High quintile was missing because of the low numbers of LEP households in Kitsap: more than one quintile had 0%. This meant that the Low quintile had about 35-40% of the households, or in effect contained the bottom two quintiles. We changed this by adding a unique quintile group, Low - Low Medium, for any county/equity group where more than the bottom quintile has 0% households. With this addition, we have to isolate the tracts with the Low - Low Medium designation, calculate the average or median (depending on the indicator) of the values of all of those tracts, then assign that new value to both the Low and Low Medium quintiles. This means that the Low and Low Medium values for these specific county/equity groups will be the same in the visuals.
To implement this change, you will need to add the code (make sure the variable names align with your data set, like `County.Name` and `data_year`), run the data-gen script, then re-knit your vis script.
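The isolate / summarise / assign logic can be sketched as follows. This is a simplified stand-in for the template chunk, written in base R with made-up values; `County.Name` and `data_year` follow the names mentioned above, while `quintile` and `estimate` are assumed column names.

```r
# Hypothetical tract-level data for one county/equity group where the
# bottom two quintiles were combined upstream
tracts <- data.frame(
  County.Name = "Kitsap",
  data_year   = 2020,
  quintile    = c("Low - Low Medium", "Low - Low Medium", "Low - Low Medium",
                  "Medium", "Medium High", "High"),
  estimate    = c(80, 82, 84, 79, 78, 77)
)

# 1. Isolate the tracts carrying the combined designation
combined <- tracts[tracts$quintile == "Low - Low Medium", ]

# 2. Summarise them (mean here; use median if the indicator calls for it)
combined_value <- aggregate(estimate ~ County.Name + data_year,
                            data = combined, FUN = mean)

# 3. Assign the same value to both the Low and Low Medium quintiles
low_rows <- rbind(
  transform(combined_value, quintile = "Low"),
  transform(combined_value, quintile = "Low Medium")
)

# Summarise the remaining quintiles as usual and stack the results
other_rows <- aggregate(estimate ~ County.Name + data_year + quintile,
                        data = tracts[tracts$quintile != "Low - Low Medium", ],
                        FUN = mean)
quintile_values <- rbind(low_rows[, names(other_rows)], other_rows)
```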
Unfortunately, the edit to the Elmer view means that re-running any previously drafted tract-level indicators will yield charts with missing data, so eventually all tract-level indicators will need to be revised.
In the vis scripts (starting around line 280), the first Data call outs chunks included code that helped analysts calculate county- and region-level values. Because code was added to the data-gen scripts to calculate/generate the county and regional data for the maps (update from October 2, 2023), the vis script code was redundant and has been removed.
There are some small edits to the map legend in the vis-tract-template.Rmd and vis-pums-template.Rmd scripts. These updates to the map code ensure that the maximum value in the map legend includes the largest indicator value. Before this change, there were some cases where one or two tracts had slightly higher values than the maximum value displayed in the legend because of rounding defaults. To incorporate this update, there are a few steps:
- Copy/paste the code in lines 188-190 of the vis templates (starting with 'check maxiumum value of data'). When you check for the maximum value of the data using `summary()`, the output provides some descriptive stats for the indicator. Based on the `Max.` value, set the variable `max_value` to the most reasonable value for the legend. This may require some adjusting based on the data and the way in which the data is rounded, and you may need to change it after generating the map.
- Adjust the map palette by changing the domain to include the new maximum value, making sure to keep NA as part of the legend for the tracts without data. This requires changing `domain = data_tract$estimate` to `domain = c(min(data_tract$estimate, na.rm = TRUE), max_value, NA)` for the vis-tract-template.Rmd, or changing `domain = acs_data_plus_zero` to `domain = c(min(acs_data_plus_zero, na.rm = TRUE), max_value, NA)` for the vis-pums-template.Rmd. This may be more complicated depending on whether the legend for your indicator should start at zero. For example, in the case of life expectancy, which is the template indicator for the vis-tract-template.Rmd, the legend does not need to begin at zero, which is why `tract_data_plus_zero` is commented out. If you do want zero included as the minimum value on the legend, `tract_data_plus_zero` should not be commented out and the domain edits should follow the pattern shown above for the vis-pums-template.Rmd (`domain = c(min(tract_data_plus_zero, na.rm = TRUE), max_value, NA)`). Please reach out if you have questions.
- Adjust the `addLegend_decreasing()` code around line 245 in both vis templates by changing `values = data_tract$estimate` to `values = c(min(data_tract$estimate, na.rm = TRUE), max_value, NA)` for the vis-tract-template.Rmd, or changing `values = acs_data_plus_zero` to `values = c(min(acs_data_plus_zero, na.rm = TRUE), max_value, NA)` for the vis-pums-template.Rmd. Similar to the step above, this may require some adjustments based on whether you want the minimum value on the legend to include zero. Please reach out if you have questions.
This update will shift the line numbers below because of the added lines of code.
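For a sense of how the legend steps fit together, here is a small sketch with made-up tract values. Rounding the maximum up to the nearest 5 is only one possible way to pick `max_value` (an assumption for this example; in practice it is chosen by eye from the `summary()` output), and `legend_domain` is a hypothetical name for the vector that would be passed to `domain =` and `values =`.

```r
# Hypothetical tract estimates, including a tract without data
data_tract <- data.frame(estimate = c(71.3, 76.8, 80.2, 84.6, NA))

# Check the maximum value of the data
summary(data_tract$estimate)

# Round the observed maximum up to a tidy legend limit
# (nearest 5 is an arbitrary choice for this example)
max_value <- ceiling(max(data_tract$estimate, na.rm = TRUE) / 5) * 5

# Range from the data minimum to the chosen maximum, keeping NA
# so tracts without data still get a legend entry
legend_domain <- c(min(data_tract$estimate, na.rm = TRUE), max_value, NA)

# Every non-missing estimate should now fall inside the legend range
in_range <- data_tract$estimate >= legend_domain[1] &
  data_tract$estimate <= legend_domain[2]
all(in_range, na.rm = TRUE)
```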
Additional code has been developed to fix the inconsistent x-axis issue for the trend line charts for tract-level data, as described in Encountering issues.
I made an update to the trend line chart so that the lines are thicker and the circles representing the data are larger. This doesn't require any changes to the vis script because it's an adjustment to the equity-tracker-chart.R script (lines 184-5), but the vis script will need to be re-run for all of the indicators where visuals have already been produced.
Because of the many recent changes made to the scripts as a result of the visualization edits, all of the line #s below have been adjusted to reflect the current template scripts - data-gen and vis.
For reference, the three major visualization edits that have been made in the past month include:
- map legend color palette adjustments (October 10, 2023, October 9, 2023) - this impacts vis-pums and vis-tract scripts
- facet chart color palette adjustments (October 10, 2023, October 6, 2023, September 22, 2023) - this impacts vis-pums and vis-tract scripts
- map labels including regional and county context, and reliability rating (October 2, 2023) - this impacts data-gen-pums, data-gen-tract, vis-pums, and vis-tract scripts (for generating the values and for visualizing them)
- An additional adjustment to the update from September 22, 2023 and October 6, 2023.
The code was adjusted to require the number of colors and the direction of the color ramp to produce the facet charts. Previously, it was stated that these adjustments were only needed for the vis-pums-template.Rmd, but they are also required for the vis-tract-template.Rmd.
For the website visuals to reflect the reverse color ramp, `num_colors = 5` and `color_rev = FALSE` were added to lines 369 and 370 in the vis-tract-template.Rmd to set the variables for the visualization functions. Around lines 386/387, 409/410, 547/548, and 573/574 of the template script, add `num_colors = num_colors` and `color_rev = color_rev`, which reference the variables and ensure that the colors are correct in the html output (for review) and in the visuals for the webpages.
- In addition, a slight adjustment to the update from October 9, 2023.
For the map, we edited the psrc_purple_plus palette (`psrc_purple_plus <- c("#FFFFFF", "#FFFFFF", "#F6CEFC", psrc_colors$purples_inc)`) to reduce the amount of white but maintain color contrast. This has been adjusted in both the vis-pums-template.Rmd and vis-tract-template.Rmd files at line 189.
The legend of the map has been updated to include zero and display more color contrast between the range of values. This will impact both vis templates (vis-pums-template.Rmd and vis-tract-template.Rmd). This update requires adding more colors to the color ramp for more color variation, and adding zero as an item in the list of estimates so that the lower end of the legend range starts there.
In the vis-pums-template.Rmd script, the changes are reflected on lines 189, 190, and 192. The new range, including zero, also replaces the old range in lines 230 and 239.
The vis-tract-template.Rmd script includes this adjusted code (lines 189, 190, 192, 233, and 243), but it is commented out because life expectancy is an exception to the general practice of having the legend begin at zero. If you have any questions for your indicators, please reach out to me to ask about how to incorporate this change in your code.
A slight adjustment to the update from September 22, 2023. Generating the visuals for the website requires the script to be knit because of the lines of code starting around 402 and 575 in vis-pums-template.Rmd (or around 390 and 553 in the vis-tract-template.Rmd).
For the website visuals to reflect the reverse color ramp, `num_colors = 2` and `color_rev = TRUE` were added to lines 381 and 382 in the vis-pums-template.Rmd to set the variables for the visualization functions. Around lines 398/399, 421/422, 567/568, and 595/596, add `num_colors = num_colors` and `color_rev = color_rev`, which reference the variables and ensure that the colors are correct in the html output (for review) and in the visuals for the webpages.
Suzanne has added functionality to `get_acs_recs()` that provides the coefficient of variation and the corresponding data reliability, based on PSRC's internal guidance. This is helpful because it provides a more easily accessible way to add this information to the map labels for PUMS/person-based indicators, without each analyst having to calculate it themselves.
In addition to adding data reliability to the map for PUMS/person-based indicators, we are also adding the regional and county based values for easy comparison. The region and county additions impact both the data-gen-pums and the data-gen-tract scripts as well as the vis-pums and vis-tract scripts.
These edits change portions of the map code and the resulting labels will be slightly different depending on data availability.
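For background on what the reliability field represents, the sketch below computes a coefficient of variation (CV) by hand from an estimate and its 90% margin of error. It is not the psrccensus implementation. The data are made up, and the cut-offs and labels are placeholders chosen for illustration, not PSRC's internal guidance.

```r
# Hypothetical estimates with 90% margins of error
acs <- data.frame(
  county   = c("King", "Kitsap", "Pierce"),
  estimate = c(52000, 4100, 900),
  moe      = c(3100, 1400, 700)
)

# ACS margins of error are published at the 90% confidence level (z = 1.645)
acs$se <- acs$moe / 1.645

# Coefficient of variation: standard error relative to the estimate
acs$cv <- acs$se / acs$estimate

# Placeholder cut-offs and labels for illustration only;
# substitute the thresholds from PSRC's internal guidance
acs$reliability <- cut(
  acs$cv,
  breaks = c(-Inf, 0.15, 0.30, Inf),
  labels = c("good", "fair", "use with caution")
)
```

A smaller CV means the estimate is more precise relative to its size, which is why small-population estimates tend to land in the weaker categories.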
- Re-install tidycensus (`install.packages("tidycensus")`) and psrccensus (`devtools::install_github('psrc/psrccensus')`)
data-gen template
- Add the reliability field to the `dplyr::select` code around line 483 (pums) when filtering for fields of interest. This mostly applies to pums-based data because it is not always possible to access/calculate reliability data when data sets are available by tract (reference available here).
- After merging census tract-based data to the spatial file, add code that calculates region and county average values around lines 496-515 (pums) or 617-641 (tract). This could be as simple as group_by/summarise, or it may require pulling data from other sources depending on the indicator.
- Generate and save the .rda for the map
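In the simplest case, the county and region averages mentioned in the data-gen steps could look something like the sketch below. It uses base R's `aggregate()` as a stand-in for group_by/summarise so that it runs without extra packages. The tract records, column names, and values are invented, and an unweighted mean of tract values is an assumption that will not suit every indicator.

```r
# Hypothetical tract-level estimates after merging to the spatial file
tract_data <- data.frame(
  geoid       = c("53033000100", "53033000200", "53035080100", "53035080200"),
  county_name = c("King", "King", "Kitsap", "Kitsap"),
  estimate    = c(80, 84, 78, 79)
)

# County averages (base R equivalent of group_by(county_name) + summarise())
county_avg <- aggregate(estimate ~ county_name, data = tract_data, FUN = mean)
names(county_avg)[2] <- "county_estimate"

# Regional average across all tracts
region_estimate <- mean(tract_data$estimate, na.rm = TRUE)

# Attach both values to each tract so they can be shown in the map labels
map_data <- merge(tract_data, county_avg, by = "county_name")
map_data$region_estimate <- region_estimate
```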
vis-gen template
- Add adjusted label code around lines 194-210 (pums) or 200-206 (tract), which includes the region, county, and data reliability information
Chris updated the PUMS view in Elmer by adding NULL rows for every `data_year`, `county`, `indicator_type`, `focus_type`, and `focus_attribute` combination that didn't previously have data associated with it. One widespread example is every `county` and `indicator_type` where `data_year == 2011`, `focus_type == Disability_cat`, and `focus_attribute == With disability` or `focus_attribute == Without disability`. This is because PUMS didn't collect data on disability in 2011. We added the NULL rows to ensure that the x-axis of the facet line charts is consistent across `focus_type` (same years).
This means that some of the data-gen script needs to be revised. The data-gen-pums-template.Rmd has the updated changes - commented out lines 99 and 103, so that the full data set is available. Additional changes occur when exploring the data for NULLs - around line 150, where analysts start to explore where NULLs occur in the data set.
If you have already run the analysis and generated visuals, there is no need to revise after line 150 because you have already completed the data exploration part of the script. The revisions around line 150 are for analysts who are beginning the process after this update.
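For analysts starting after this update, the exploration step amounts to finding which year and focus-type combinations contain NULLs (shown as `NA` once the data are in R). A minimal base R sketch, using an invented extract, might look like this. `data_year`, `focus_type`, and `focus_attribute` follow the view's column names, while `fact_value`, the `POC_cat` rows, and all values are assumptions.

```r
# Hypothetical extract of the PUMS view, with placeholder rows for 2011 disability data
pums <- data.frame(
  data_year       = c(2011, 2011, 2016, 2016, 2011, 2016),
  focus_type      = c("Disability_cat", "Disability_cat", "Disability_cat",
                      "Disability_cat", "POC_cat", "POC_cat"),
  focus_attribute = c("With disability", "Without disability", "With disability",
                      "Without disability", "POC", "POC"),
  fact_value      = c(NA, NA, 0.41, 0.62, 0.55, 0.58)
)

# Count missing values by year and focus type to see where the gaps are
pums$is_missing <- is.na(pums$fact_value)
null_summary <- aggregate(is_missing ~ data_year + focus_type, data = pums, FUN = sum)

# Keep only the combinations that contain at least one missing value
null_gaps <- null_summary[null_summary$is_missing > 0, ]
null_gaps
```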
Because the two charts - the facet bar with most recent data (`equity_tracker_column_facet()`) and the facet trend line (`equity_tracker_line_facet()`) - use opposite color ramps depending on the data type (PUMS/person-based or tract/place-based), we have adjusted these chart functions to include new arguments. The code has been adjusted with default settings, so if the additional arguments are not included, the quintile column bars will have all colors in increasing darkness from left to right.
This means that the code for the PUMS/person-based data will require some adjustments to the chart functions. Two additional arguments, `num_colors=2` and `color_rev=TRUE`, ensure that the focus equity group is the darker color and that the two columns will have colors that are distinct enough.
For reference, these edits have been added to the vis-pums-template script (around lines 398-9 and 567-8) and these adjustments will need to be applied to all existing vis scripts that have been created prior to this date.
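To illustrate how two arguments like these can change a color ramp, here is a hypothetical helper. It is not the code inside `equity_tracker_column_facet()` or `equity_tracker_line_facet()`: the function name `pick_facet_colors()`, its defaults, and the stand-in hex colors are all assumptions.

```r
# Hypothetical helper, not the actual chart-function code
pick_facet_colors <- function(palette, num_colors = length(palette), color_rev = FALSE) {
  # Spread the requested number of colors evenly across the light-to-dark ramp
  idx <- unique(round(seq(1, length(palette), length.out = num_colors)))
  colors <- palette[idx]
  # Optionally flip the order so the darkest color comes first
  if (color_rev) rev(colors) else colors
}

# Stand-in light-to-dark ramp (not the PSRC palette)
ramp <- c("#FBD6C9", "#F7A489", "#F4835E", "#F05A28", "#9F3913")

# Defaults: all five colors in increasing darkness (quintile charts)
pick_facet_colors(ramp)

# Two well-separated colors, darkest first (person-based charts)
pick_facet_colors(ramp, num_colors = 2, color_rev = TRUE)
```

Assuming the focus equity group is plotted first, reversing the ramp gives it the darker color, and picking the two ends of the ramp keeps the pair visually distinct.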
We are adding a section to the vis script so that the analyst can provide a general explanation about the definition of their indicator and its significance to this project. After data review, this text will be added to the top of the indicator webpage so that users are familiar with the measurement and its connection to equity.
The addition is at lines #59-60 of the vis template script, for both PUMS and tract-level data.
There is an additional general review form for the thematic landing page on the website. The additional information is here. You do not need to do anything, as this has been created and saved in each thematic subfolder.
With the addition of this review file, there have been revisions to the:
- Planning Review section, as part of the Review Process and Webpage Development page
- Y drive and More Detailed Review Process sections, as part of the Planning Review Process page
There have been updates to the visualization functions (fixing the positions of elements in the charts, or z-index). As a result, any .html outputs that were created before today will need to be re-created. This will require re-knitting the vis script (to HTML) and then re-running the chunks at the bottom of the script that transfer the files to their new locations. This change is explained in more detail here in the Review Process and Webpage Development instructions. If you are encountering issues, review this.
An addition to the Data Generation and Visualization page, 1. Set up the network folders and subfolders (file explorer):
"In addition to creating the subfolders, you will also need to include the form that the planning reviewer will use to review the draft webpage (much later in the process). The easiest way to do this is to copy the X##-webpage-review.docx file in the 'Y:\Equity Indicators\tracker-webpage-content' folder and paste it into your indicator folder. Rename the file to reflect the correct alpha-numeric code for your theme and indicator. In the document, you can fill in the theme, indicator, and analyst information at the top of the form."
This edit ensures that the form for the planning reviewer will be ready for their step in the process.
After doing the first review round with Brian, I’m editing a few things in the templates that you should edit in your scripts. If you need a reference, these changes will be in the template scripts on GitHub (both pums/ospi and tract versions):
-
Add source information under the map – the basic structure of this is data source, year, data type [example: U.S. Census Bureau, American Community Survey (ACS) 2021 5-Year Public Use Microdata Sample (PUMS)] – you are combining the data (from ACS or wherever) with the shapefile so there will be two sources to list:
a. vis-pums-template (line 262) – EXAMPLE: Sources: U.S. Census Bureau, American Community Survey (ACS) 2021 5-Year Estimates; U.S. Census Bureau, Geography Division 2020 TIGER/Line Shapefiles
b. vis-tract-template (line 266) – EXAMPLE: Sources: U.S. Environmental Protection Agency (EPA) 2021 Toxic Release Inventory (TRI) Program; U.S. Census Bureau, Geography Division 2020 TIGER/Line Shapefiles
-
Change ‘People with Limited English Proficiency’ to ‘Households with Limited English Proficiency’
a. data-gen-pums-template (lines 258 and 279) – “People with Limited English Proficiency” should be changed to “Households with Limited English Proficiency”
b. data-gen-tract-template (lines 961 and 972) – “People with Limited English Proficiency” should be changed to “Households with Limited English Proficiency”
-
A note on numbers/rounding:
a. if you’re using % - round to the nearest ‘whole’ percent (30% instead of 30.4%) - no decimals (this has been changed in the vis-template)
b. if you’re using $ - round to the nearest hundred ($106,700 instead of $106,748)
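The two rounding rules can be expressed in a couple of lines of base R. The values below are made up, and the label formatting is only one way to do it, not necessarily how the vis template does.

```r
pct_values    <- c(0.304, 0.123)   # shares stored as proportions (hypothetical)
dollar_values <- c(106748, 58312)  # dollar amounts (hypothetical)

# Percent: nearest whole percent, no decimals
pct_labels <- paste0(round(pct_values * 100), "%")

# Dollars: nearest hundred (digits = -2), with a thousands separator
dollar_labels <- paste0(
  "$", formatC(round(dollar_values, -2), format = "d", big.mark = ",")
)
```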
Other notes:
- The weird line issue I was encountering has mostly been resolved, but needs to be fixed in the code – this is more relevant for the pums data because of the missing pums disability data in 2011 – I am working with Michael to create a solution, so you may get another update email for the data-gen-pums script
- I’ve added some notes about incorporating edits from the review into the vis script (GitHub wiki), if it’s needed (if the visuals need to be changed). This is probably not as relevant yet as your outputs haven’t been reviewed, but will be useful in the future once the review process picks up.