covid-19-genomes

DataViz projects on the topic of COVID-19 genomic sequencing.

Most pages show Australia by default; other countries are available for selection.

Most pages now show lineages assigned by nextclade.org, an alternative lineage classification tool. Alternates of most pages are also available showing the original GISAID lineages if preferred, but most experts recommend nextclade for quicker and more precise calls, especially for newer lineages. Pages with (nextclade) in the page title show nextclade lineages; all other pages show GISAID lineages. The page navigation is at bottom-centre, e.g. < 2 of 30 >.

gisaid.org with nextclade lineages - top Lineages by Country/Location

Link to interactive DataViz

gisaid.org with nextclade lineages - top Countries for a selected Lineage

Link to interactive DataViz

gisaid.org with nextclade lineages - top Locations for a selected Lineage

Link to interactive DataViz

gisaid.org with nextclade lineages - Lineage growth comparison (log)

Link to interactive DataViz

gisaid.org with nextclade lineages - map

Visualise the geographical spread of a selected lineage. Use the play control at bottom for an animated view of the spread.

Locations are approximate - typically by reporting state/province or country. Bubble sizes are driven by the % of the total set of samples selected.

Link to interactive DataViz

gisaid.org with nextclade lineages - sankey

Rolls up the evolutionary tree of lineages from the highest level ancestors (far left) to the most evolved descendants (far right). Each segment of a vertical column shows the count for that lineage plus all its descendants. Slicers for the range of Levels and Minimum # of Samples can be used to produce a more focussed output.

The rollup logic is a bit heavy, so please be patient with this page.
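As a rough illustration of the rollup idea (not the Power BI logic used in the report), the sketch below adds each lineage's own sample count to every one of its ancestors. The function name `rollup_counts`, the `parent_of` mapping and the sample counts are all hypothetical; only the ancestry chain BA.2 > BA.2.86 > BA.2.86.1 > JN.1 reflects real Pango naming.

```python
from collections import defaultdict


def rollup_counts(parent_of, own_counts):
    """Return {lineage: own count + counts of all its descendants}.

    parent_of  -- hypothetical mapping of lineage -> parent lineage (None for a root)
    own_counts -- samples assigned directly to each lineage
    """
    totals = defaultdict(int)
    for lineage, count in own_counts.items():
        node = lineage
        seen = set()  # guards against an accidental cycle in parent_of
        while node is not None and node not in seen:
            seen.add(node)
            totals[node] += count  # credit this lineage and each ancestor in turn
            node = parent_of.get(node)
    return dict(totals)


# Illustrative tree and made-up counts.
parent_of = {
    "BA.2": None,
    "BA.2.86": "BA.2",
    "BA.2.86.1": "BA.2.86",
    "JN.1": "BA.2.86.1",
}
own_counts = {"BA.2": 5, "BA.2.86": 10, "BA.2.86.1": 3, "JN.1": 40}

totals = rollup_counts(parent_of, own_counts)
print(totals["BA.2.86"])  # own 10 plus descendants 3 and 40
```

Walking up from every lineage costs time proportional to the depth of the tree for each lineage, which hints at why a rollup over the full lineage tree can be slow.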

Jeff Gilchrist wrote an excellent thread explaining how to drive this Sankey page.

Link to interactive DataViz

gisaid.org with nextclade lineages - geography frequencies

Track the weekly progress of a selected lineage for any combination of Continents and Countries. Shows the count of that lineage against the overall total for each collection week, and also as a percentage.
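The weekly figures described here could be computed along the following lines. This is a minimal sketch, assuming the samples have already been filtered to the chosen Continents and Countries, and assuming weeks start on Monday; the function names and the sample data are hypothetical and are not taken from the report.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing d (the Monday start is an assumption)."""
    return d - timedelta(days=d.weekday())


def weekly_share(samples, lineage):
    """Return [(week, lineage count, total count, percent)] sorted by week.

    samples -- iterable of (collection date, lineage) pairs
    """
    totals = Counter()
    hits = Counter()
    for collected, lin in samples:
        wk = week_start(collected)
        totals[wk] += 1
        if lin == lineage:
            hits[wk] += 1
    return [
        (wk, hits[wk], totals[wk], 100.0 * hits[wk] / totals[wk])
        for wk in sorted(totals)
    ]


# Made-up observations for illustration only.
samples = [
    (date(2024, 1, 1), "JN.1"),
    (date(2024, 1, 3), "EG.5.1"),
    (date(2024, 1, 8), "JN.1"),
    (date(2024, 1, 10), "JN.1"),
]
for row in weekly_share(samples, "JN.1"):
    print(row)
```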

Link to interactive DataViz

gisaid.org - archive

Link to interactive DataViz

Reference:

International data on COVID-19 genomic sequencing, for analysis and reporting on variant prevalence by country, by region and even globally.

Global data gathered from GISAID. Sequence data is processed through the Nextclade CLI to produce the generally preferred Nextclade Lineage classifications.

I'm mainly following the visualisation style I first saw presented by Trevor Bedford. The main features are clean, simple line charts, filtered by default to the top 7 series in the selected data. For each chart point, the frequency of that lineage in the last 7 days is calculated, always comparing to all the sequencing data available for that country/location.
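The frequency calculation and the top-7 filter can be sketched as below. This is an illustration under stated assumptions, not the report's actual measure: the window is taken to be the 7 days ending on the chart point (inclusive), the samples are assumed to be already limited to one country/location, and the names `rolling_frequency` and `top_lineages` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def rolling_frequency(samples, lineage, as_of, window_days=7):
    """Share of `lineage` among ALL samples collected in the window ending at as_of.

    samples -- list of (collection date, lineage) pairs for one country/location
    Returns None when no samples fall inside the window.
    """
    start = as_of - timedelta(days=window_days - 1)
    in_window = [lin for collected, lin in samples if start <= collected <= as_of]
    if not in_window:
        return None
    return in_window.count(lineage) / len(in_window)


def top_lineages(samples, n=7):
    """The n most frequently observed lineages in the selected data."""
    counts = Counter(lin for _, lin in samples)
    return [lin for lin, _ in counts.most_common(n)]


# Made-up observations for illustration only.
samples = [
    (date(2024, 1, 1), "JN.1"),
    (date(2024, 1, 2), "JN.1"),
    (date(2024, 1, 3), "EG.5.1"),
    (date(2024, 1, 8), "JN.1"),
    (date(2024, 1, 9), "BA.2.86"),
]
for lineage in top_lineages(samples):
    print(lineage, rolling_frequency(samples, lineage, date(2024, 1, 9)))
```

Keeping the denominator as all samples in the window, rather than only the lineages currently charted, means the plotted frequencies do not change when a lineage filter is applied.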

Other pages show a single Lineage by Country or by Location. The top 7 lineages in the selected Continent/Countries/Locations will be shown, with frequency calculated as above.

The Lineage growth comparison (log) page was suggested by Uffe Poulsen, based on a chart produced by Alex Selby.

The main gisaid dataset only presents data for the last few months, to save processing time. It is typically refreshed weekly. The "gisaid - archive" dataviz presents all the historical data, but is only refreshed monthly at best.

Summary

The available sites presenting data on genomic sequencing are typically limited to country or global perspectives, with limited interactivity and often using overly complex visualisations. Each site has its own visualisation style. They are each updated independently.

In this project, the data from those sources is presented in an interactive data visualisation tool: Power BI. This allows interactive filtering of the data in the table, for easier analysis.

A page is presented for each data source (now only gisaid, but formerly also microreact, nextstrain, UCSC and cdgn), and the gisaid data has alternate pages showing either the Nextclade lineage classifications, or GISAID's own lineage classifications.

Earlier lineages are translated into the commonly known variant names (e.g. Delta) following the WHO naming. More recent lineages are grouped into "clans", roughly following the work of the Variant Trackers group, e.g. T. Ryan Gregory. These are grouped using the field Lineage L2; for example, the Lineage L2 "clan" BA.2.86.* includes the BA.2.86 lineage and all its descendants. The Lineage L2 "clans" are mutually exclusive, so XBB.1.9.* excludes all of the EG.5.* lineages.
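One way to make such "clans" mutually exclusive is to expand Pango aliases and then assign each lineage to the most specific clan that contains it. The sketch below assumes that approach; it is not taken from the report. The `ALIASES` table is a one-entry subset of the real Pango alias key (EG is the alias for XBB.1.9.2), and the function names and the list of clan roots are hypothetical.

```python
# Tiny illustrative subset of the Pango alias key; the real key is far larger.
ALIASES = {"EG": "XBB.1.9.2"}


def expand(lineage):
    """Replace an aliased prefix with its full form, e.g. EG.5.1 -> XBB.1.9.2.5.1."""
    head, _, rest = lineage.partition(".")
    base = ALIASES.get(head, head)
    return f"{base}.{rest}" if rest else base


def is_within(lineage_full, root_full):
    """True if the lineage is the root itself or one of its descendants."""
    return lineage_full == root_full or lineage_full.startswith(root_full + ".")


def assign_clan(lineage, clan_roots):
    """Return the single most specific clan for a lineage, or None if none match."""
    full = expand(lineage)
    matches = [root for root in clan_roots if is_within(full, expand(root))]
    if not matches:
        return None
    # The deepest matching root wins, so EG.5.* takes priority over XBB.1.9.*.
    best = max(matches, key=lambda root: len(expand(root).split(".")))
    return best + ".*"


clan_roots = ["XBB.1.9", "EG.5", "BA.2.86"]  # hypothetical Lineage L2 roots
for lineage in ["EG.5.1", "XBB.1.9.1", "BA.2.86", "B.1"]:
    print(lineage, "->", assign_clan(lineage, clan_roots))
```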

The default country selection for most pages is Australia. As well as being where I live, Australia has a relatively high proportion of genomes sequenced vs total COVID-19 cases.

The user can choose any alternative country, and also filter the date range or Lineages included. It is possible to combine multiple countries, even all data for a continent or globally. Note, however, that the sampling in most datasets is heavily skewed to a handful of countries.

The primary visual on each page is a line chart showing the Lineage Frequency. This is calculated as a moving average over the prior 7 days, compared to all the other lineages present in the data, regardless of selections. To keep the line charts clean, only the seven most frequently occurring Lineages are shown (dynamically determined). Alternate pages compare Countries or Locations for a selected Lineage, again typically showing the top seven.

The gray inverted column chart below each line chart shows the counts of all genomes sequenced over the same period. A typical pattern is that the sample volume drops for more recent

An interactive table at the bottom right lists the individual observations presented by each dataset.

From gisaid.org we gather their EpiCoV metadata dataset. For most countries, this dataset is the most complete and up-to-date available.

Elbe, S., and Buckland-Merrett, G. (2017) Data, disease and diplomacy: GISAID’s innovative contribution to global health. Global Challenges, 1:33-46. DOI:10.1002/gch2.1018 PMCID: 31565258

From nextclade we classify the gisaid samples to obtain the nextclade pango lineage (using the nextclade CLI tool). These offer an alternative to the pango lineages presented by gisaid. New lineages are typically defined first in nextclade, and its classifications are preferred by some experts.
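For readers unfamiliar with the Nextclade CLI, a classification run can be driven from Python roughly as follows. This is a hedged sketch and not the repository's own scripts: the paths are placeholders, the helper names are hypothetical, and it assumes a SARS-CoV-2 dataset has already been downloaded (for example with `nextclade dataset get`) and that the output TSV carries the `seqName` and `Nextclade_pango` columns.

```python
import csv


def nextclade_command(dataset_dir, fasta_path, tsv_path):
    """Build the argument list for a `nextclade run` call that writes a TSV."""
    return [
        "nextclade", "run",
        "--input-dataset", dataset_dir,
        "--output-tsv", tsv_path,
        fasta_path,
    ]


def parse_lineages(lines):
    """Map sequence name -> Nextclade pango lineage from tab-separated lines.

    lines -- any iterable of text lines, e.g. an open file handle on the TSV
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["seqName"]: row["Nextclade_pango"] for row in reader}


# Placeholder paths; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = nextclade_command("data/sars-cov-2", "sequences.fasta", "nextclade.tsv")
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.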

THIS REPORT IS NOT HEALTH ADVICE - REFER TO YOUR LOCAL HEALTH AUTHORITY.

🤝 Support

Contributions, issues, feature requests and sponsorship are all welcome!

Give a ⭐️ if you like this project!

Mike-Honey/covid-19-genomes
