-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create a vignette for Kaplan Meier #41
base: main
Are you sure you want to change the base?
Conversation
adding extras codes to the KM ggsurvfit
updated the example of creation of variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, this file should be added to another branch, linked to another issue and another PR.
Otherwise, we won't be able to merge this PR about the KM vignette before completing the AE vignette.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not quite sure why the AE vignette is within the KM vignette.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not "within" it is "beside.
On GH Desktop you are on the KM branch, 6-vignette-kaplan-meier
.
If you want to work on another vignette, you should switch your branch to another one, preferably one created from #40 (there is a "create branch" link somewhere).
Then in GH Desktop you can change the branch you are working on easily.
This is a bit of gymnastics, but it helps keeping things clean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much Livia, this is really coming together 😁😁
death_eos = status, | ||
date_progression = date_enrol + time*runif(n(), 0.6, 0.8), | ||
date_progression = if_else(runif(n())>0.2, NA, date_progression), | ||
date_first_visit = t0_v2 + abs(rnorm(n(), 500, 20)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure about the variable renaming? Survival is usually calculated from enrollment or randomization, even if there is a visit (for screening, clinical checking, etc.) that happened before and would be the "first visit".
I agree with you on date_of_last_visit
though, much better.
However, it would be better to stick to one naming convention: either date_of_xxx
or date_xxx
, as you wish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understood is not always the same name for every project/data. So that first date and last date might be more understandable. However, when Ill present it to our team members in our next meeting i'll ask them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't follow_up_start
and follow_up_end
be more correct then? (or something similar)
vignettes/kaplan_meier.Rmd
Outdated
print(km.model_PFS) | ||
# il y a beaucoup de NA, c pas genial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Que veux-tu dire ?
Si tu parles de la médiane qui est NA, c'est normal : quand la médiane n'est pas atteinte (il y a moins d'événements que la moitié des patients), on ne peut calculer que la borne inf de l'intervalle de confiance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on peut voir ca ensemble?
vignettes/kaplan_meier.Rmd
Outdated
print(km.model_PFS) | ||
# il y a beaucoup de NA, c pas genial | ||
summary(km.model_PFS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In summary()
, the output is unreadable without the times
argument as it describes every time in the database.
Also, I prefer using tidy_survfit()
from ggsurvfit
, it has the broom::tidy()
vibe, although you have to use select()
on the output to reduce the clutter.
What do you think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you think it is nice to explain the times
argument? Clinicians often want to see the "x-Year survival" and this is how I calculate it. Do you do it in a different way?
tidy_survfit(km.model_PFS, times=c(0.05, 0.10, 0.20)) %>%
select(strata, time, n.risk, n.event, estimate, conf.high, conf.low)
status == 0 ~ pmax(date_of_last_visit,na.rm = T) | ||
)) %>% | ||
mutate(time_days=status_date-date_first_visit ) %>% | ||
mutate(time_years=time_days/365.25 ) | ||
mutate(time_months=time_days/30.5 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our times are very low, aren't they?
In ggsurvfit::df_colon
, times range from 0.02 to 9 years, which is a rather natural scale.
I'm not sure where this comes from, but maybe we can cheat by rescaling when creating data_surv
?
Also, don't you think creating both time_os
and time_pfs
(and both KM curves) would be educationally valuable? I think juniors can struggle with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point! Alexi should work on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so Alexis should work on this vignette before we publish it?
@@ -106,7 +126,7 @@ Kaplan-Meier by Daniel D. Sjoberg, Mark Baillie, Charlotta Fruechtenicht, Steven | |||
|
|||
```{r} | |||
|
|||
km = survfit2(Surv(time, status) ~ surg, data = df_colon) %>% | |||
km = survfit2(Surv(time_days, status) ~ surg, data = data_surv_v2) %>% | |||
ggsurvfit(linetype_aes = TRUE) + | |||
add_confidence_interval() + | |||
add_risktable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was an error in the ggsurvfit vignette, we should use this theme:
theme(legend.position="inside", legend.position.inside = c(0.85, 0.85))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it for your code. It works but when looking at the html output, the plots do not look great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
vignettes/kaplan_meier.Rmd
Outdated
@@ -106,7 +126,7 @@ Kaplan-Meier by Daniel D. Sjoberg, Mark Baillie, Charlotta Fruechtenicht, Steven | |||
|
|||
```{r} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is a vignette an not a standard HTML document, figures are tiny by default 😕
Try {r fig.width = 10}
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that answers to my presvious comment. Fabulous!
guidelines how to construct a KM at GR