Relative reliability calculations
For most of the PUMS indicators we are generating maps at the tract-level using the corresponding 5-year ACS data tables. As part of the map we are including the relative reliability ratings to express how much confidence we have in the estimates. More information about PSRC's internal statistical significance guidance is available here.
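As a rough illustration of how an MOE can be turned into a reliability rating, the sketch below converts a 90% MOE into a coefficient of variation (CV) and bins it. It assumes the rating is CV-based. It is written in Python rather than R so that it stays self-contained. The function names are hypothetical, and the cutoffs (15%, 30%, 50%) are placeholder assumptions, not PSRC's documented thresholds, so substitute the values from the internal guidance.

```python
# ACS margins of error are published at the 90% confidence level.
Z_90 = 1.645

# ASSUMPTION: placeholder cutoffs for illustration only, not PSRC's official values.
DEFAULT_CUTOFFS = (
    (0.15, "good"),
    (0.30, "fair"),
    (0.50, "use with caution"),
)


def coefficient_of_variation(estimate, moe):
    """Return the CV: standard error (MOE / 1.645) divided by the estimate."""
    if estimate == 0:
        return float("inf")  # a zero estimate cannot be rated
    return (moe / Z_90) / estimate


def reliability_label(cv, cutoffs=DEFAULT_CUTOFFS):
    """Map a CV to the first label whose upper bound it does not exceed."""
    for upper, label in cutoffs:
        if cv <= upper:
            return label
    return "use with extreme caution"


# Hypothetical estimate and MOE, chosen only to show the call pattern.
cv = coefficient_of_variation(estimate=100, moe=16.45)
print(reliability_label(cv))
```

Keeping the cutoffs as a parameter makes it easy to swap in the agency's actual thresholds without touching the logic.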
When accessing ACS data using `get_acs_recs()`, the margin of error (MOE) and relative reliability are generated automatically by the function, and we display them on the map through the hover labels. If the ACS data provides the indicator in the correct form, such as median gross rent, nothing additional needs to be done to prepare the data set for visualization. When the ACS data requires additional aggregations, transformations, or calculations, the relative reliability values need to be recalculated. This is the case for indicators such as housing cost burden (renters), homeownership, and overcrowding, all of which require calculating percentages of a (sub)population.
- Housing cost burden (renters): the percentage by tract is calculated by dividing the number of renter households that are housing cost burdened by the total number of renter households (with the 'Not computed' households subtracted from the total).
- Homeownership: the percentage by tract is calculated by dividing the owner-occupied housing units by the total number of occupied housing units.
- Overcrowding (renters): the percentage of renters by county who are overcrowded (defined as 1.5+ persons per room) is calculated by dividing the number of households that are 'overcrowded' by the total number of renter households. This variable may or may not be included in the Equity Tracker because of data reliability issues (described below).
Although these three examples each represent a proportion of the total population of households, we generate the relative reliability values using `moe_sum()` instead of `moe_prop()` from the `tidycensus` R package. This means that we are more interested in the reliability of the numerator value, which in some of these instances is extremely small and unreliable, than in the relationship between the small numerator and the larger, more reliable denominator. This decision was based on a few tracts where the reliability rating seemed to improve when using `moe_prop()` even though the underlying values were unreliable. See the example below:
Raw data for two census tracts (2021)
When the rent_burden categories are aggregated to distinguish between rent burdened (>=30%) and not rent burdened (<30%), we can roughly calculate the share of rent-burdened households in census tract 53033000101 by adding the rent-burdened estimates (107+96+143+391 = 737) and dividing by the total number of renter households with the 'not_computed' households subtracted (1561-74 = 1487). This returns 0.4956288, or 49.56% of renter households in that census tract being cost-burdened. The same calculation for census tract 53033000102 gives (207+34+260+270)/(1339-114) = 0.6293878, or 62.94%.
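The same arithmetic can be checked in a few lines. This is a minimal Python sketch that uses only the estimates quoted above; the dictionary layout and the function name are illustrative and not part of the project code.

```python
# Estimates quoted in the text for the two test tracts (2021).
tracts = {
    "53033000101": {"burdened": [107, 96, 143, 391], "total": 1561, "not_computed": 74},
    "53033000102": {"burdened": [207, 34, 260, 270], "total": 1339, "not_computed": 114},
}


def rent_burden_share(burdened, total, not_computed):
    """Share of rent-burdened households, excluding 'not_computed' from the total."""
    return sum(burdened) / (total - not_computed)


shares = {geoid: rent_burden_share(**row) for geoid, row in tracts.items()}

for geoid, share in shares.items():
    print(f"{geoid}: {share:.7f} ({share:.2%})")
```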
For each of the rent_burden categories (before aggregation), the numbers are so small that the reliability is low (53033000101: 2 categories rated as 'use with caution,' 1 rated as 'use with extreme caution,' and one that is 'fair'; 53033000102: 3 rated as 'use with caution,' 1 rated as 'use with extreme caution').
Using `moe_prop()`:
When we tried using `moe_prop()`, the resulting relative reliability score improved to 'good' for both of these test census tracts. The estimates for census tracts 53033000101 and 53033000102 match the calculations from above.
Using `moe_sum()`:
Using `moe_sum()` also improved the relative reliability score, but only to 'fair', which seems more reasonable. The estimates for census tracts 53033000101 and 53033000102 match the calculations from above.
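For reference, the two approximations compared above can be sketched directly. The Python below reimplements the standard Census Bureau formulas that `moe_sum()` and `moe_prop()` are based on. It is an illustration, not the tidycensus source, and it omits the special handling of zero estimates that `moe_sum()` can apply. The MOE values in the example are hypothetical placeholders, because the raw MOEs are not reproduced in the text.

```python
import math


def moe_sum(moes):
    """MOE of a sum of estimates: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))


def moe_prop(num, denom, moe_num, moe_denom):
    """MOE of a proportion num/denom, where num is a subset of denom."""
    p = num / denom
    radicand = moe_num ** 2 - (p ** 2) * (moe_denom ** 2)
    if radicand < 0:
        # Census guidance: fall back to the ratio formula when the
        # value under the square root would be negative.
        radicand = moe_num ** 2 + (p ** 2) * (moe_denom ** 2)
    return math.sqrt(radicand) / denom


# Estimates come from tract 53033000101 in the text; every MOE is HYPOTHETICAL.
num, denom = 107 + 96 + 143 + 391, 1561 - 74
category_moes = [60.0, 55.0, 70.0, 110.0]  # hypothetical MOEs of the four categories
denom_moe = 150.0                          # hypothetical MOE of the denominator

numerator_moe = moe_sum(category_moes)
share_moe = moe_prop(num, denom, numerator_moe, denom_moe)

print(f"numerator MOE (moe_sum): {numerator_moe:.1f} households")
print(f"share MOE (moe_prop):    {share_moe:.4f}")
```

Because the proportion formula subtracts a term that grows with the denominator's MOE, the MOE of the share is never larger than the numerator's MOE divided by the denominator (outside the fallback case). This is consistent with the better-looking rating that `moe_prop()` produced.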
Raw data for three census tracts (2021)
To calculate the share of home ownership per census tract, we divide the number of owner-occupied housing units by the total number of occupied housing units.
- 53033000101: 130/1691 = 0.07687759, or 8% (relative reliability: 'good' and 'use with caution')
- 53033000102: 807/2146 = 0.3760485, or 38% (relative reliability: 'good')
- 53033000201: 1139/2326 = 0.4896819, or 49% (relative reliability: 'good' and 'fair')
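A quick check of these shares, as a minimal Python sketch that uses the counts listed above (the variable names are illustrative):

```python
# (owner-occupied units, occupied housing units) per tract, from the list above.
units = {
    "53033000101": (130, 1691),
    "53033000102": (807, 2146),
    "53033000201": (1139, 2326),
}

# Homeownership share = owner-occupied units / occupied housing units.
ownership_share = {tract: owned / occupied for tract, (owned, occupied) in units.items()}

for tract, share in ownership_share.items():
    print(f"{tract}: {share:.7f} ({share:.0%})")
```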
Using `moe_prop()`:
Using `moe_sum()`:
Both `moe_prop()` and `moe_sum()` produce relative reliability scores that match the numerator (the number of owner-occupied units), which seems appropriate. The estimates for census tracts 53033000101, 53033000102, and 53033000201 match the calculations from above. In this case, the outputs from the two functions are likely the same because generating home ownership values didn't require calculating a new denominator like it did for housing cost burden.
The exceptions to including relative reliability calculations in the map are kindergarten readiness and overcrowding (renters), which use alternative geographies and/or ways of generating the values for the map.
- Kindergarten readiness data is accessed through OSPI, not PUMS, which means that no MOE data are available. In addition, the geographies are at the school district level instead of the more common census geographies.
- Overcrowding (renters) is complicated. The data are available through ACS like the other PUMS indicators, so MOE data are included. However, the indicator is defined differently in the available ACS table than in the data we are pulling from PUMS by the six equity groups. Through PUMS we use the number of people per bedroom, with a threshold of 1.5+ persons/bedroom, but the ACS table is based on the number of persons per room in the house. This is described in more detail in a separate document (currently in the Project folder: Y:\Equity Indicators\tracker-webpage-content\e-housing\e04-overcrowding\overcrowding_challenge.docx). Before realizing that the ACS table did not line up with the PUMS measurement (rooms vs. bedrooms), we were already concerned about data reliability at the tract, PUMA, and place levels. Because of those concerns, we decided that visualizing the data at any geographic level finer than the county would be misleading. In addition, because of the difference between the data available through ACS tables and the data available through PUMS, we decided to be consistent and visualize PUMS data at the county level. No ACS table provides persons per bedroom by tenure; the two closest options are tables B25042 (Tenure by Bedrooms) and B25014 (Tenure by Occupants per Room). This is the only indicator so far that uses county-level data from PUMS.