adding latitude and longitude range for the map #71

Open: wants to merge 21 commits into base: master
2 changes: 2 additions & 0 deletions .github/workflows/deploy.yml
@@ -3,6 +3,8 @@ on:
push:
branches:
- master
schedule:
- cron: '0 1 * * *' # run every day at 01:00 UTC
jobs:
build-and-deploy:
runs-on: ubuntu-18.04
9 changes: 6 additions & 3 deletions README.md
@@ -13,16 +13,16 @@ situation for their local environment. The ultimate goal is to influence
individual behavior, to decrease the spread.

The goal is to reach the general public, not experts familiar with graphs
and numbers. For this reason, effort is put on simplifying the
and numbers. For this reason, we put a lot of effort into simplifying the
visualization and putting it along simple text.

The predictions and the associated text should be trustworthy, hence be
solid and sober, rather than fancy and dramatic.

## Well thougt-out visualization on COVID-19
## Well thought-out visualization on COVID-19

COVID-19 is a serious issue and our visualization and data analysis need
to be thought through serious. The following is a good read:
to be approached in a thoughtful, serious manner. The following would be a good read: <br>
https://medium.com/nightingale/ten-considerations-before-you-create-another-chart-about-covid-19-27d3bd691be8

# Development workflow
@@ -70,3 +70,6 @@ The Makefile

Care is taken to have a static page, to be able to handle the load with
many visits.

A scheduled job runs each day at 1:00 am (UTC) to build the website and
update it with the latest available data.
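
For readers less used to cron syntax: the five fields of `0 1 * * *` are minute, hour, day of month, month, and day of week, so the rule matches 01:00 every day, and GitHub Actions evaluates schedules in UTC. The sketch below only illustrates that rule. `next_daily_run` is a hypothetical helper, not part of this repository, and actual scheduled runs may start later than the nominal time.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now, hour=1, minute=0):
    """Return the next daily run time at hour:minute UTC, strictly after `now`."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_daily_run(datetime.now(timezone.utc)))
```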
39 changes: 35 additions & 4 deletions make_figures.py
@@ -45,7 +45,13 @@ def make_map(df, df_fatalities, df_recovered):
df_recovered['value']],
color_continuous_scale='Plasma_r',
labels={'color': 'Active<br>cases<br>per<br>Million'})

fig.update_geos(lataxis_range=[-80, 90],
lonaxis_range=[-165, 180]
)
fig.update_layout(title='Click on map to add/remove a country',
yaxis=dict(scaleanchor='x',
scaleratio=10),
coloraxis_colorbar_tickprefix='1.e',
coloraxis_colorbar_len=0.6,
coloraxis_colorbar_title_font_size=LABEL_FONT_SIZE,
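
The `lataxis_range` and `lonaxis_range` keywords in the hunk above crop the visible part of the map. One way to keep the bounds in a single place is a small helper that builds those keyword arguments. This is only a sketch: `geo_window` is a hypothetical name, and the bounds check is a sanity check on conventional latitude and longitude values, not something Plotly is known to require.

```python
def geo_window(lat_range=(-80, 90), lon_range=(-165, 180)):
    """Build keyword arguments for fig.update_geos() that crop the map."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    # Assumed sanity check: conventional latitude/longitude bounds.
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError("latitude range must lie within [-90, 90]")
    if not (-180 <= lon_min < lon_max <= 180):
        raise ValueError("longitude range must lie within [-180, 180]")
    return {
        "lataxis_range": [lat_min, lat_max],
        "lonaxis_range": [lon_min, lon_max],
    }


kwargs = geo_window()
# With a Plotly figure in hand, this would be applied as:
#     fig.update_geos(**kwargs)
print(kwargs)
```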
@@ -144,12 +150,18 @@ def make_timeplot(df_measure, df_prediction):
method="update",
),
dict(
args=["yaxis", {'type':'log'}],
args=[{'yaxis': {'type':'log'},
"legend": {'x':0.65, 'y':0.1,
"font":{"size":18},
}}],
label="log",
method="relayout",
),
dict(
args=["yaxis", {'type':'linear'}],
args=[{'yaxis': {'type':'linear'},
"legend": {'x':0.05, 'y':0.8,
"font":{"size":18},
}}],
label="lin",
method="relayout",
),
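
The two `relayout` buttons above differ only in the axis type and the legend position, so they could be generated by one function. The sketch below mirrors the values in this diff. `scale_button` is a hypothetical helper name, and the default font size of 18 is copied from the new lines rather than taken from `LABEL_FONT_SIZE`.

```python
def scale_button(label, axis_type, legend_x, legend_y, font_size=18):
    """Build one updatemenu button that switches the y-axis type and moves the legend."""
    return dict(
        label=label,
        method="relayout",
        # A single dict argument lets one click update several layout keys.
        args=[{
            "yaxis": {"type": axis_type},
            "legend": {"x": legend_x, "y": legend_y, "font": {"size": font_size}},
        }],
    )


buttons = [
    scale_button("log", "log", 0.65, 0.1),
    scale_button("lin", "linear", 0.05, 0.8),
]
print(buttons)
```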
@@ -171,8 +183,27 @@ def make_timeplot(df_measure, df_prediction):
# The legend position + font size
# See https://plot.ly/python/legend/#style-legend
legend=dict(x=.05, y=.8, font_size=LABEL_FONT_SIZE,
title="Active cases in"),
)
)
)
fig.add_annotation(
x=0.1,
y=0.95,
xref='paper',
yref='paper',
showarrow=False,
font_size=LABEL_FONT_SIZE,
text="Active cases")
fig.add_annotation(
x=1,
y=-0.13,
xref='paper',
yref='paper',
showarrow=False,
font_size=LABEL_FONT_SIZE - 6,
font_color="DarkSlateGray",
text="Drag handles below to change time window",
align="right")

return fig


20 changes: 12 additions & 8 deletions text_block.md
@@ -2,23 +2,23 @@

These visualizations give predictions about the future number of active COVID-19 cases. The predictions are based on extrapolating the growth observed in a given country over the last two weeks.
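
To make "extrapolating the growth" concrete, here is a minimal sketch of one common approach: fit a straight line to the logarithm of the last two weeks of counts, then extend that line a few days ahead. This is an assumption about the general technique, not necessarily the model used by this project. The function names and the sample series are made up for illustration.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = intercept + slope * day (counts must be > 0)."""
    days = range(len(counts))
    logs = [math.log(c) for c in counts]
    n = len(counts)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def extrapolate(counts, days_ahead):
    """Extend the fitted exponential trend `days_ahead` days past the last observation."""
    intercept, slope = fit_exponential(counts)
    last_day = len(counts) - 1
    return [math.exp(intercept + slope * (last_day + k))
            for k in range(1, days_ahead + 1)]


# Made-up series: 100 cases growing by 10% a day for 14 days.
observed = [100 * 1.1 ** d for d in range(14)]
forecast = extrapolate(observed, 3)
print([round(value) for value in forecast])
```

Because the sample series is perfectly exponential, the fit reproduces it exactly. Real data are noisier, which is one reason the extrapolation is only trusted for a few days.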

These predictions are only short-term extrapolation: predicting the future is hard, and epidemic dynamics will change with changes in public health measures, social interaction patterns, or even weather. Please also keep in mind that each data point in these visualizations represents a person who has suffered or lost their life to this disease.
These predictions are only short-term extrapolation: predicting the future is hard, and epidemic dynamics will change with changes in public health measures, social interaction patterns, or even weather. Please also keep in mind that each data point in these visualizations represents a person who has suffered or lost their lives to this disease.

## Understanding exponential growth

In their early stages, outbreaks display *exponential growth*: the number of cases grows as a multiple of itself. Let's say that Patient Zero infects two people, and then each of those infects two more people, and so on. The number of infected people will grow by a larger amount each day -- two on the first day, four on the second day, eight on the third day, and so on. This is what we call exponential growth, because the number of cases on each day is some number raised to the power of the number of days.
In their early stages, outbreaks display *exponential growth*: the number of cases grows as a multiple of itself. Let's say that Patient Zero infects two people, and then each of those infects two more people, and so on. The number of infected people will grow by a larger amount each day -- two on the first day, four on the second day, eight on the third day, and so on. This is what we call exponential growth because the number of cases on each day is some number raised to the power of the number of days.
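
The doubling example can be written out in a few lines of Python. This is only an illustration of the arithmetic in the paragraph above, with a hypothetical function name. Real outbreaks do not double exactly once per day.

```python
def cases_on_day(day, factor=2):
    """Number of cases on a given day when each day multiplies the count by `factor`."""
    return factor ** day


for day in range(1, 6):
    increase = cases_on_day(day) - cases_on_day(day - 1)
    print(f"day {day}: {cases_on_day(day)} cases (+{increase})")
```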

For a deeper explanation of how exponential growth relates to epidemics, see [this video](https://www.youtube.com/watch?v=Kas0tIxDvrg).

### The growth rate is not only a property of the virus

The local growth of an outbreaks is related to how likely one infected individual is to transmit the disease to another person. It is related to properties of the virus (such as how long it can stay on a surface), but also to how much people interact with each other, and public health measures such hand washing.
The local growth of an outbreak is related to how likely one infected individual is to transmit the disease to another person. It is related to properties of the virus (such as how long it can stay on a surface), but also to how much people interact with each other, and public health measures such as hand washing.

### Plotting in log scale

The plot of cases over time includes two different options: the linear plot shows the actual count of cases, while the log plot shows the *logarithm* of the number of cases, which is roughly the number of times one has to multiply the number 10 by itself to get the number of cases. This logarithmic view has a direct relationship with the exponential growth of the epidemic: in such a view, exponential growth appears as a straight line. You can think of the logarithm as the opposite of the exponential.

In addition, the log plot lets us more easily see the relationships between trends over time when the actual numbers are very different. Because the logarithm increasingly compresses large numbers, it makes it easier to see whether the rate of increase is similar between two countries, even when one has many more cases than the other.
Besides, the log plot lets us more easily see the relationships between trends over time when the actual numbers are very different. Because the logarithm increasingly compresses large numbers, it makes it easier to see whether the rate of increase is similar between two countries, even when one has many more cases than the other.
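
A short numerical check of both claims, using made-up numbers: when cases double each day, the day-to-day steps keep growing on a linear scale but are constant on a log scale (a straight line), and a country with 100 times more cases shows exactly the same log steps.

```python
import math

# Made-up series: two countries doubling daily, one with 100 times more cases.
small = [10 * 2 ** day for day in range(6)]
large = [1000 * 2 ** day for day in range(6)]


def steps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


linear_steps = steps(small)
log_steps_small = steps([math.log10(c) for c in small])
log_steps_large = steps([math.log10(c) for c in large])

print(linear_steps)
print([round(s, 3) for s in log_steps_small])
print([round(s, 3) for s in log_steps_large])
```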


# Where do the data come from?
@@ -36,22 +36,26 @@ Those who want to know more details about how the estimates are computed can fin

## How can you be sure that the forecast is accurate?

We cannot. We are simply using the data to project further growth. However, you can see that the model has done well at predicting the growth rate over the last two weeks. The model should be relatively accurate for the next few days, but becomes less accurate for farther-out days.
We cannot. We are simply using the data to project further growth. However, you can see that the model has done well at predicting the growth rate over the last two weeks. The model should be relatively accurate for the next few days but becomes less accurate for farther-out days.


# What are the potential biases in the data?

Accurate measurements of health across populations are difficult. There are many sources of bias in the data.

## Reporting biases
Perhaps the greatest bias is that cases can only be counted if they seek out medical care or are tested. COVID-19 appears to cause mild or no symptoms in a sizeable proportion of people, which means that the reported counts underestimate the true total number of infected persons. This could also cause biases between countries --- for example, if people are told to stay home unless their disease worsens, then fewer cases will be detected than if people are told to seek medical care for mild symptoms and receive testing for the virus. In addition, some countries test systematically many individuals, while other countries only test individuals with severe symptoms. This testing strategy, as well as well as the diagnostic criteria, may vary across time in a given country.
Perhaps the greatest bias is that cases can only be counted if people seek out medical care or are tested. COVID-19 appears to cause mild or no symptoms in a sizeable proportion of people, which means that the reported counts underestimate the true total number of infected persons. This could also cause biases between countries: for example, if people are told to stay home unless their disease worsens, then fewer cases will be detected than if people are told to seek medical care for mild symptoms and receive testing for the virus. Also, some countries systematically test many individuals, while other countries test only individuals with severe symptoms. This testing strategy, as well as the diagnostic criteria, may vary across time in a given country.

## Test accuracy

A perfect diagnostic test would provide a positive result for every infected person, and a negative result for every non-infected person. Unfortunately, it is almost impossible to create such a perfect test, so all diagnostic tests will result in some errors. These can either be *false positive* errors (that is, saying that someone is infected when they are not), or a *false negative* error (saying that a person is not infected when they actually are). For example, the commonly used rapid tests for flu viruses have false negative rates of 30-70% and false positive rates of about 10%. We don't yet know the error rates for the various testing methods in use for SARS-CoV-2, but we have already seen that the that test intially developed by the US Centers for Disease Control [had high rates of false positive results](https://www.propublica.org/article/cdc-coronavirus-covid-19-test).
A perfect diagnostic test would provide a positive result for every infected person, and a negative result for every non-infected person. Unfortunately, it is almost impossible to create such a perfect test, so all diagnostic tests will result in some errors. These can be either *false positive* errors (that is, saying that someone is infected when they are not) or *false negative* errors (saying that a person is not infected when they actually are). For example, the commonly used rapid tests for flu viruses have false negative rates of 30-70% and false positive rates of about 10%. We don't yet know the error rates for the various testing methods in use for SARS-CoV-2, but we have already seen that the test initially developed by the US Centers for Disease Control [had high rates of false positive results](https://www.propublica.org/article/cdc-coronavirus-covid-19-test).
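
To see how such error rates distort reported counts, consider a hypothetical group of 1,000 tested people of whom 100 are infected, with a 50% false negative rate and a 10% false positive rate (values within the flu rapid-test ranges quoted above, not measured rates for SARS-CoV-2). The group size, the infection share, and the function name are assumptions made for illustration.

```python
def expected_test_results(n_infected, n_healthy,
                          false_negative_rate, false_positive_rate):
    """Expected outcome counts when everyone in the group is tested once."""
    true_positives = n_infected * (1 - false_negative_rate)
    false_negatives = n_infected * false_negative_rate
    false_positives = n_healthy * false_positive_rate
    true_negatives = n_healthy * (1 - false_positive_rate)
    reported = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
        "reported_positives": reported,
        # Share of positive results that correspond to a real infection.
        "share_correct_positives": true_positives / reported,
    }


result = expected_test_results(100, 900, 0.5, 0.1)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this made-up scenario the reported positives exceed the true number of infected people, yet half of the real infections are missed, so the reported count can be wrong in both directions at once.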

## Population differences
There are differences between populations within and across countries that could affect the spread of the disease. For example, the prevalence of chronic lung diseases (which increase the risk of severe COVID-19 infection) [vary between countries and between urban and rural environments](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4693508). Differences in population density and in local customs (such as hand-shaking or face-kissing greetings) could also affect the rates of disease transmission between different countries. In addition, the age distribution varies across countries, and as a consequence a larger fraction of the population is at risk in certain countries compared to others.
There are differences between populations within and across countries that could affect the spread of the disease. For example, the prevalence of chronic lung diseases (which increase the risk of severe COVID-19 infection) [varies between countries and between urban and rural environments](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4693508). Differences in population density and in local customs (such as hand-shaking or face-kissing greetings) could also affect the rates of disease transmission between different countries. In addition, the age distribution varies across countries, and as a consequence, a larger fraction of the population is at risk in certain countries compared to others.

# More detailed data

An even more detailed visualization of the Coronavirus situation can be found here: [Coronavirus Disease (COVID-19) – Statistics and Research](https://ourworldindata.org/coronavirus)

<!-- Below is a "microformat" to give information to facebook, twitter.. -->
<div class="h-feed">