DataViz on Australian COVID-19 Vaccinations and related statistics

Mike-Honey/covid-19-au-vaccinations

covid-19-au-vaccinations

DataViz of Australian COVID-19 Vaccinations and other related statistics

Statistics by geography page - pick a stat (any stat) for a time series

Link to interactive dataviz

Click to view and interact with the report

Statistics by geography - log scale page

Link to interactive dataviz

Click to view and interact with the report

Statistics by geography - with forecast page

Link to interactive dataviz

Click to view and interact with the report

Cases, hospitalisation, ICU, deaths page

Link to interactive dataviz

Click to view and interact with the report

7-day cases, deaths, testing, vaccinations page

Link to interactive dataviz

Click to view and interact with the report

Weekly cases, deaths, testing, vaccinations page

Link to interactive dataviz

Click to view and interact with the report

Cases, Reff & deaths snapshot page

Link to interactive dataviz

Click to view and interact with the report

Risk Analysis page

Link to interactive dataviz

Click to view and interact with the report

Death Toll page

Link to interactive dataviz

Click to view and interact with the report

Death Toll - long page

Link to interactive dataviz

Click to view and interact with the report

Infographic on Vaccinations

Link to interactive dataviz

Click to view and interact with the report

Infographic page

Link to interactive dataviz

Click to view and interact with the report

Daily Doses page

Link to interactive dataviz

Click to view and interact with the report

Cases and hospitalisation vs CDC Community Levels page

Link to interactive dataviz

Click to view and interact with the report

Days Since page

An analysis based on the most recent date on which the CMOs/CHOs etc in each jurisdiction held a press conference or similar and took questions from the media on COVID.

Link to interactive dataviz

Click to view and interact with the report

Reference:

Data on Australian COVID-19 statistics from health.gov.au, via a crucial citizen data science effort to convert that into useful data by dbRaevn.

Data on Australian Vaccinations and other COVID-19 statistics from covidlive.com.au. This is the source for Australian numbers presented in Our World In Data. Unfortunately concise government sources are not available, so there is a significant citizen data science effort to collate these figures in a consistent way.

For Victoria, a further citizen data effort is used to provide the daily key statistics from September 2022 until June 2024. https://github.com/dbRaevn/covid19

Data on Australian Population from abs.gov.au. The summarised population tables are unsuitable, as their age bands do not align with the "Adult = 16+" definition used by Health Departments. So the method used is to get the most detailed time series spreadsheets, e.g. "Population - Victoria". Each column represents a single year of age, by Gender. The columns for the "Persons" Gender are selected. "Adult" is derived by summing the columns for ages 16+ (note: these are split across 2 sheets). The latest row available (dated June 2020) is selected.
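As a rough illustration of the "Adult = 16+" derivation, here is a minimal sketch in Python. It assumes the "Persons" columns have already been read into a mapping of single year of age to population count; the function name and the sample numbers are made up for illustration and are not ABS figures.

```python
def adult_population(persons_by_age, adult_from=16):
    """Sum the single-year-of-age 'Persons' counts for ages >= adult_from."""
    return sum(count for age, count in persons_by_age.items() if age >= adult_from)


# Made-up counts for illustration only (not ABS data).
sample = {14: 1000, 15: 1100, 16: 1200, 17: 1300, 18: 1400}
adults = adult_population(sample)
print(adults)
```

In the real spreadsheets the age columns are split across 2 sheets, so the mapping would need to be built from both before summing.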

Data on Australian Mortality from COVID-19 for the Death Toll pages is from the Australian Bureau of Statistics - Provisional Mortality Statistics. The weekly data provided is spread randomly among days of the week. It is updated monthly, but due to the death certification process it typically lags by 4 months or so.

Data is scraped from the PDF reports presented on the page COVID-19 outbreaks in Australian residential aged care facilities, using the python notebook shown in the health-aged-care folder. The current targets are the stats on Molnupiravir and Paxlovid prescriptions, and the data from all tables. Due to file format changes, these are currently extracted from 1 April 2024 onwards. The results are collated into a file: health-aged-care.xlsx, stored in that sub-folder.

Reff is calculated following the parameters used by Professor Adrian Esterman - first smooth the daily cases using a 7-day average, then divide the latest day's cases (7-day avg) by the cases (7-day avg) from 4 days prior. It shows the momentum in the rise or fall of cases, with a Reff of 1.0 meaning cases are neither rising nor falling.
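The Reff calculation described above can be sketched as follows. This is a plain Python illustration, not the Power BI measure used in the report; the function names are hypothetical and a trailing (not centred) 7-day average is assumed.

```python
def rolling_mean(values, window=7):
    """Trailing average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def reff(daily_cases, window=7, lag=4):
    """Latest 7-day average divided by the 7-day average from 4 days prior."""
    if len(daily_cases) < window + lag:
        return None  # not enough history for both averages
    smoothed = rolling_mean(daily_cases, window)
    latest, prior = smoothed[-1], smoothed[-1 - lag]
    if prior == 0:
        return None  # avoid dividing by zero when there were no prior cases
    return latest / prior


# Made-up series: flat at 100 cases a day, then 200 a day for the last 4 days.
print(reff([100] * 7 + [200] * 4))
```

A flat series gives a Reff of 1.0; the made-up rising series above gives a value above 1.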

The Death Toll charts were inspired by a NY Times visualisation, showing each death as a small black point. The data source is the Australian Mortality data from the ABS. To provide contrast to the Date axis, a random spread is introduced on the other axis.
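To show the idea of one point per death with a random spread, here is a minimal Python sketch. The report itself builds this in Deneb/Vega-Lite; this version only illustrates the data shaping. It assumes weekly counts keyed by the week's start date, assigns each death a random day within that week, and gives it a random position between 0 and 1 on the other axis. The names and sample counts are made up.

```python
import random
from datetime import date, timedelta


def death_points(weekly_deaths, seed=0):
    """Expand weekly counts into one (date, y) point per death."""
    rng = random.Random(seed)  # fixed seed so the layout is repeatable
    points = []
    for week_start, count in weekly_deaths.items():
        for _ in range(count):
            day = week_start + timedelta(days=rng.randrange(7))  # random day in the week
            points.append((day, rng.random()))  # random spread on the other axis
    return points


# Made-up counts for illustration only.
sample = {date(2022, 1, 3): 3, date(2022, 1, 10): 2}
points = death_points(sample)
print(len(points))
```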

Infographic was inspired by the visualisation design of Marta Fioni, as featured on the UK government "equivalent" dashboard.

The cases, deaths, testing, vaccinations analyses were inspired by the visualisation design of John Burn-Murdoch, as featured in the Financial Times.

The Cases, hospitalisation, ICU, deaths analysis was inspired by the visualisation design of Louis Rossouw.

The Cases and hospitalisation vs CDC Community Levels analysis compares case and hospitalisation rates with Community Levels set by the US CDC.

The first metric they look at is "Cases per 100,000 people for the past 7 days", being fewer than or more than 200. All regions of Australia have been well over that mark for the Omicron outbreak. So I'm focussing on the last 2 rows in the CDC table.

Their next metric is Admissions per 100K (7-day total). Australia doesn't publish that data, but you can see from the first 2 charts on this OWID page the "in Hospital" numbers are a very close proxy, for the US and other comparable countries. https://ourworldindata.org/covid-hospitalizations

The last CDC metric is % of staffed inpatient beds occupied. I don't have data on hand for that for Australia. But as Australia is well into the "High" zone on the first 2 indicators, and the CDC instruction is "Use the Highest Level that applies", that metric is not needed.
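A minimal sketch of the "Use the Highest Level that applies" rule, for regions already at 200 or more cases per 100,000 over 7 days. The thresholds below (10 admissions per 100K, 10% of staffed inpatient beds) are my transcription of the CDC Community Levels table and should be checked against the CDC source before use; the function name is hypothetical.

```python
LEVELS = ["Low", "Medium", "High"]


def community_level(admissions_per_100k, beds_pct=None):
    """Level for a region with >= 200 cases per 100K over the past 7 days.

    Thresholds are assumptions transcribed from the CDC table.
    """
    by_admissions = "High" if admissions_per_100k >= 10 else "Medium"
    if beds_pct is None:
        return by_admissions  # bed occupancy not available
    by_beds = "High" if beds_pct >= 10 else "Medium"
    # "Use the Highest Level that applies"
    return max(by_admissions, by_beds, key=LEVELS.index)


print(community_level(12.0))
```

Once the admissions proxy is already "High", the bed occupancy figure cannot change the result, which is why that metric is not needed here.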

The Days Since analysis is based on crowdsourced data for the most recent date on which the CMOs/CHOs etc in each jurisdiction held a press conference or similar and took questions from the media on COVID. The number of days since that event is derived and used to present a "league table", along with some key COVID statistics that have been reported from that jurisdiction in that period (after the Last Media Date).
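The "league table" derivation can be sketched in a few lines of Python. The jurisdiction names and dates below are placeholders, not the crowdsourced data, and the function name is hypothetical.

```python
from datetime import date


def days_since_table(last_media_dates, today):
    """Rank jurisdictions by days since their Last Media Date, longest first."""
    rows = [(name, (today - last).days) for name, last in last_media_dates.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder dates for illustration only.
sample = {
    "Jurisdiction A": date(2023, 1, 10),
    "Jurisdiction B": date(2022, 11, 1),
}
table = days_since_table(sample, today=date(2023, 1, 31))
print(table)
```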

Summary

The Statistics by Geography page shows a time series chart with a line for each Geography (states & territories, plus the national aggregate). A wide range of statistics are available, covering cases, hospitalisations, ICU, deaths, testing, vaccinations. I favour ratios per X population for comparison of the Geographies, as their population varies widely. The raw numbers are also available. You can choose which Geographies to include, and the time period.
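For reference, a ratio per X population is just the raw count scaled by the population. A minimal Python sketch, with made-up numbers and a hypothetical function name:

```python
def per_population(count, population, per=100_000):
    """Scale a raw count to a rate per `per` people."""
    return count * per / population


# Made-up figures: the same raw count reads very differently at different populations.
print(per_population(50, 1_000_000))
print(per_population(50, 500_000))
```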

The Statistics by Geography - log axis page shows a time series chart with a line for each selected statistic, for a selected Geography (states & territories, plus the national aggregate). The default statistics selected are cases, hospitalisations, ICU & deaths. You can deselect any of those and select any other statistics you want to compare. This presentation helps compare growth trends and align the timing of momentum changes.

The Statistics by Geography - forecast page shows a time series chart with a line for a selected statistic and Geography (states & territories, plus the national aggregate). A 10-day forecast is shown (where possible) using a logistic regression algorithm. Grey bands show the 95% Confidence Intervals.

The cases, deaths, Reff snapshot page compresses the recent trends on cases and deaths into a dense infographic. The latest number in each category (by Geography) is supported by a sparkline to show the recent trends.

The cases, deaths, testing, vaccinations pages contrast two key statistics - Cases vs Deaths. Testing % positive is shown as this informs the understanding of reported cases - when % positive is high, the true infection count is much higher than reported cases. Vaccinations are reported as % of the population (outcomes perspective, not work done).

The Cases, hospitalisation, ICU, deaths analysis page shows multiple key statistics as time-series on a single line chart. A log-scale Y-Axis helps comparison of growth for metrics that are at widely varying scales. Testing % positive is shown as this informs the understanding of reported cases - when % positive is high, the true infection count is much higher than reported cases.

The infographic seeks to boil down the many figures on vaccinations into an easily digested picture. I'd been looking for inspiration on this for quite a while - the key questions in my mind are:

  • How many people are protected?
  • How many to go?

The "waffle chart" visual chosen by Marta presents the vaccination data, standardised by the population of various Geographies. Geographies of wildly differing populations (e.g. New South Wales vs Tasmania) can be easily compared.
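The standardisation behind a waffle chart can be sketched in text form. This assumes a 10 x 10 grid where each cell stands for 1% of the population; the grid size, symbols and function name are my assumptions for illustration, not details of the original design.

```python
def waffle(pct, filled="#", empty=".", width=10):
    """Return a 100-cell grid as rows of text, one cell per 1% of population."""
    cells = max(0, min(100, round(pct)))  # clamp to the grid
    grid = filled * cells + empty * (100 - cells)
    return [grid[i : i + width] for i in range(0, 100, width)]


# Made-up percentage for illustration only.
for row in waffle(73):
    print(row)
```

Because every Geography is drawn on the same 100-cell grid, a large and a small population can be compared directly.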

Risk Analysis

The Risk Analysis page estimates the % of the Australian population Currently Infectious, based on Aged Care Staff Cases. The (somewhat heroic) assumption is that this data series has been consistent across the time period, with data shared for all states and territories, with the same data collection and testing methods used in every jurisdiction and over time, and with the same relative relationship to population cases.

Starting from that assumption, the total of Aged Care Staff Cases was translated into population level infections using this method:

Starting from the Kirby seroprevalence surveys:

  • Between Round 3 and 4, seroprevalence increased by 19%
  • Scale that up by 20% to allow for the limits of seroprevalence testing (maxes out at 80%): 19% x 1.2 = ~23% infected
  • 23% of the Australian population of ~26M = 6M
  • Between the end date for Round 3 (2 Sep 2022) and Round 4 (13 Dec 2022), around 7,600 Australian Aged Care Staff cases were reported
  • Therefore, each Aged Care Staff Case represents ~800 infections in the broader population (6M / 7,600 = 789)
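The arithmetic in the steps above can be reproduced with a short Python sketch. The inputs are the figures quoted in the list; the rounding steps (to 23% and to 6M) mirror the rounding in the text, and the function name is hypothetical.

```python
def infections_per_staff_case(
    seroprevalence_rise=0.19,   # increase between Round 3 and Round 4
    uplift=1.20,                # +20% for the limits of seroprevalence testing
    population=26_000_000,      # ~26M
    staff_cases=7_600,          # Aged Care Staff cases between the two rounds
):
    infected_share = round(seroprevalence_rise * uplift, 2)    # 0.228 -> 0.23
    infected_people = round(infected_share * population, -6)   # 5.98M -> 6M
    return infected_people / staff_cases


ratio = infections_per_staff_case()
print(round(ratio))
```

Without the intermediate rounding the ratio is a little lower, but it still rounds to roughly 800 infections per Aged Care Staff Case.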

The last 6 months are shown. A median Infectious Period of 10 days is used to get from daily cases to the percentage Currently Infected.
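One way to get from daily cases to the percentage Currently Infectious is sketched below. It assumes that everyone infected in the last 10 days counts as currently infectious, and uses the ~800 multiplier and ~26M population from above; the exact formula used in the report may differ, and the function name and sample data are made up.

```python
def pct_currently_infectious(
    daily_staff_cases,
    multiplier=800,          # infections per Aged Care Staff Case (approx.)
    infectious_days=10,      # median Infectious Period
    population=26_000_000,   # ~26M
):
    """Estimated % of the population infectious on the latest day."""
    recent = daily_staff_cases[-infectious_days:]
    return 100 * sum(recent) * multiplier / population


# Made-up series: 65 Aged Care Staff Cases a day.
print(pct_currently_infectious([65] * 30))
```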

The estimated total number of people infected across that period is calculated, both as a % of the Australian population and as the number of people. This ignores re-infections during the period, which are less likely over a 6-month window.

The data is presented in an interactive data visualisation tool: Power BI. This allows interactive filtering of the data (e.g. by Geography or Date), and includes supporting charts and data tables.

Death Toll

The Death Toll charts use the Deneb Custom visual, which wraps the Vega-Lite grammar for interactive graphics in a convenient container for Power BI authors.

The Deneb Templates for the two styles of charts used for the Death Toll are available:

The dataviz is refreshed automatically, every day. The various data sources are updated somewhat unpredictably.

I have a similar project running for other countries globally:

  • World (select country, countries, continents or global summary)

THIS REPORT IS NOT HEALTH ADVICE - REFER TO YOUR LOCAL HEALTH AUTHORITY.

Note: Prior to 1 June 2021, the figure shown for % 1st dose (and the pale green boxes representing that figure) was overstated - I misunderstood the meaning of the total 1st dose count as excluding people who received a 2nd dose, when in fact they are included.

🤝 Support

Contributions, issues, feature requests and sponsorship are all welcome!

Give a ⭐️ if you like this project!
