Using xgboost with crankcompositor/responsecompositor #366
-
Hi, please consult the crankcompose docs. Practically you will do something like:
But note that in general in survival analysis, there are issues when trying to compose the
-
@fa1999abdi question covered?
-
@bblodfon, thank you so much for your response.
-
You should use
-
@fa1999abdi I am going to split the xgboost objectives/learners soon (Cox and AFT are very different), and for the Cox, the
-
But it didn't work:

``` r
tsk_s <- as_task_surv(tb, time = "time_to_death", event = "status", type = "right")
pipe = po("imputehist") %>>%
  ppl("crankcompositor", learner = lrn("surv.xgboost"), response = TRUE, method = "sum_haz")
pipe$train(tsk_s)
#> $compose_crank.output
#> NULL
p = pipe$predict(tsk_s)[[1]] # p will have a response (survival time) now
#> Error: Assertion on 'distr' failed: FALSE.
#> This happened PipeOp compose_crank's $predict()
```
-
Yes, you need to estimate the

``` r
library(mlr3proba)
#> Loading required package: mlr3
library(mlr3pipelines)
library(mlr3extralearners)

task = tsk("rats")
learner =
  po("encode", method = "treatment") %>>%
  ppl("crankcompositor",
    # crank needs a distr prediction type, xgboost doesn't have one, so we have to estimate it:
    learner = ppl("distrcompositor", learner = lrn("surv.xgboost", nrounds = 10),
      estimator = "breslow", overwrite = FALSE),
    response = TRUE, method = "sum_haz", overwrite = FALSE) |>
  as_learner()

learner$train(task)
p = learner$predict(task)
p
#> <PredictionSurv> for 300 observations:
#>  row_ids time status      crank         lp response     distr
#>        1  101  FALSE -0.5318943 -0.5318943 3.987942 <list[1]>
#>        2   49   TRUE -0.9984229 -0.9984229 2.501140 <list[1]>
#>        3  104  FALSE -0.9984229 -0.9984229 2.501140 <list[1]>
#>      ---
#>      298   92  FALSE -1.0661759 -1.0661759 2.337293 <list[1]>
#>      299  104  FALSE -0.8688244 -0.8688244 2.847226 <list[1]>
#>      300  102  FALSE -0.8688244 -0.8688244 2.847226 <list[1]>

p$score(msr("surv.cindex")) # uses lp prediction type
#> surv.cindex
#>   0.8984875
p$score(msr("surv.rmse")) # uses response prediction type
#> surv.rmse
#>  61.24336
p$score(msr("surv.brier")) # uses distr prediction type
#>  surv.graf
#> 0.03333211
```

Created on 2024-02-10 with reprex v2.0.2
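For readers unfamiliar with what the C-index scored above measures, here is a rough base-R sketch of the plain Harrell variant: among pairs where the earlier time is an observed event, it is the share of pairs in which the earlier-failing observation got the higher risk score. The function name `harrell_c` and the toy data are made up for illustration. This is not the mlr3proba implementation, which offers further options that may give different numbers.

``` r
# Hypothetical helper (not part of mlr3proba): plain Harrell's C-index.
# time:   observed times
# status: TRUE if the event was observed, FALSE if censored
# risk:   risk score, higher means earlier expected event (like crank/lp)
harrell_c = function(time, status, risk) {
  concordant = 0
  comparable = 0
  n = length(time)
  for (i in seq_len(n)) {
    # only an observed event can be the earlier member of a comparable pair
    if (!status[i]) next
    for (j in seq_len(n)) {
      if (time[i] < time[j]) {
        comparable = comparable + 1
        if (risk[i] > risk[j]) {
          concordant = concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant = concordant + 0.5 # ties in risk count as half
        }
      }
    }
  }
  concordant / comparable
}

# toy data: 4 observations, the third one is censored
toy_time   = c(5, 10, 15, 20)
toy_status = c(TRUE, TRUE, FALSE, TRUE)
toy_risk   = c(2.0, 0.1, 0.2, 0.5)

harrell_c(toy_time, toy_status, toy_risk)
#> [1] 0.6
```

In the toy data there are 5 comparable pairs, and 3 of them are ordered correctly by the risk score, hence 0.6.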
-
@bblodfon thanks so much for your help.
-
FYI, even though you can do the above and get a
-
@fa1999abdi we now have the xgboost Cox learner with

``` r
library(mlr3extralearners)
library(mlr3pipelines)
library(mlr3proba)
#> Loading required package: mlr3

task = tsk("rats")
learner =
  po("encode", method = "treatment") %>>%
  ppl("crankcompositor",
    learner = lrn("surv.xgboost.cox", nrounds = 10),
    response = TRUE, method = "sum_haz", overwrite = FALSE) |>
  as_learner()

learner$train(task)
p = learner$predict(task)
p
#> <PredictionSurv> for 300 observations:
#>  row_ids time status      crank         lp response     distr
#>        1  101  FALSE -0.5318943 -0.5318943 3.987942 <list[1]>
#>        2   49   TRUE -0.9984229 -0.9984229 2.501140 <list[1]>
#>        3  104  FALSE -0.9984229 -0.9984229 2.501140 <list[1]>
#>      ---
#>      298   92  FALSE -1.0661759 -1.0661759 2.337293 <list[1]>
#>      299  104  FALSE -0.8688244 -0.8688244 2.847226 <list[1]>
#>      300  102  FALSE -0.8688244 -0.8688244 2.847226 <list[1]>
```

Created on 2024-04-12 with reprex v2.0.2
-
Hi @fa1999abdi, we now have a new pipeop (

An example using

``` r
library(mlr3extralearners)
library(mlr3pipelines)
library(mlr3proba)
#> Loading required package: mlr3

task = tsk("lung")
xgb = lrn("surv.xgboost.cox", nrounds = 10)
grlrn =
  po("encode", method = "treatment") %>>%
  ppl("responsecompositor", learner = xgb, method = "rmst") |>
  as_learner()

p = grlrn$train(task)$predict(task)
p
#> <PredictionSurv> for 168 observations:
#>  row_ids time status      crank         lp response     distr
#>        1  455   TRUE -0.6479679 -0.6479679 357.4828 <list[1]>
#>        2  210   TRUE  0.7804120  0.7804120 179.7284 <list[1]>
#>        3 1022  FALSE -2.2456281 -2.2456281 692.1024 <list[1]>
#>      ---
#>      166  105  FALSE -0.1915968 -0.1915968 289.1797 <list[1]>
#>      167  174  FALSE -0.4396934 -0.4396934 324.7442 <list[1]>
#>      168  177  FALSE -0.6972836 -0.6972836 365.6514 <list[1]>
```

Created on 2024-08-17 with reprex v2.1.1
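To give an intuition for the `method = "rmst"` option used above: the restricted mean survival time is the area under the survival curve S(t) from 0 up to a cutoff tau. The following is a minimal base-R sketch for a step-function survival curve. `rmst_step` and the toy curve are hypothetical names and values for illustration only, and the actual computation inside the pipeop may differ in details such as the choice of cutoff.

``` r
# Hypothetical helper (not part of mlr3proba): area under a right-continuous
# step survival curve from 0 to tau.
# times: sorted time points where S(t) steps down
# surv:  value of S(t) from each time point onwards
# tau:   cutoff (restriction) time, defaults to the last time point
rmst_step = function(times, surv, tau = max(times)) {
  stopifnot(length(times) == length(surv), !is.unsorted(times), tau > 0)
  keep = times < tau
  # interval boundaries: 0, the step times before tau, and tau itself
  grid = c(0, times[keep], tau)
  # S(t) equals 1 before the first step, then the value set at each step
  heights = c(1, surv[keep])
  sum(diff(grid) * heights)
}

# toy curve: S drops to 0.8 at t = 100, to 0.5 at t = 200, to 0.2 at t = 400
toy_times = c(100, 200, 400)
toy_surv  = c(0.8, 0.5, 0.2)

rmst_step(toy_times, toy_surv)            # 100 * 1 + 100 * 0.8 + 200 * 0.5
#> [1] 280
rmst_step(toy_times, toy_surv, tau = 300) # 100 * 1 + 100 * 0.8 + 100 * 0.5
#> [1] 230
```

Because the area is restricted to [0, tau], the result is always finite, even when the survival curve never reaches 0 (as is typical with heavy censoring).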
-
Hi John, thanks for the update on the new pipeop! The new pipeop for composing survival time with RMST looks interesting. I also appreciate the example with xgboost; it will be helpful for my analyses. I tried to install version 0.6.7, but I'm having some issues and can't get it to install properly. Have you experienced any similar problems, or do you have any suggestions on how to resolve them?

``` r
library(mlr3extralearners)
library(mlr3pipelines)
library(mlr3proba)
#> Loading required package: mlr3
```
``` r
packageVersion("mlr3pipelines")
#> [1] '0.6.0'
```
``` r
remotes::install_version("mlr3pipelines", version = "0.6.7")
#> Error in download_version_url(package, version, repos, type): version '0.6.7' is invalid for package 'mlr3pipelines'
```
``` r
#> Loading required package: mlr3
task = tsk("lung")
xgb = lrn("surv.xgboost.cox", nrounds = 10)
grlrn =
po("encode", method = "treatment") %>>%
ppl("responsecompositor", learner = xgb, method = "rmst") |>
as_learner()
#> Error: Element with key 'responsecompositor' not found in DictionaryGraph!
```
``` r
p = grlrn$train(task)$predict(task)
#> Error in eval(expr, envir, enclos): object 'grlrn' not found
```
``` r
p
#> Error in eval(expr, envir, enclos): object 'p' not found
```
<sup>Created on 2024-08-18 with [reprex v2.1.0](https://reprex.tidyverse.org)</sup>
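One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: the error says 0.6.7 is not a valid version of mlr3pipelines, so the version number may belong to a different package. The survival pipelines used in this thread (`crankcompositor`, `distrcompositor`) are registered by mlr3proba, so the new one may ship with an mlr3proba update instead. The sketch below checks whether a pipeline key is available in the current installation; `has_pipeline` is a hypothetical helper, and the commented-out install line assumes the development version on GitHub is the one you want.

``` r
# Hypothetical helper: is a given key registered in the ppl() dictionary?
# Returns NA when the required packages are not installed.
has_pipeline = function(key) {
  if (!requireNamespace("mlr3proba", quietly = TRUE) ||
      !requireNamespace("mlr3pipelines", quietly = TRUE)) {
    return(NA)
  }
  # mlr_graphs is the dictionary that ppl() looks keys up in
  key %in% mlr3pipelines::mlr_graphs$keys()
}

has_pipeline("responsecompositor")

# If this returns FALSE, compare the installed version against the release notes:
# packageVersion("mlr3proba")
# and, assuming an update is needed, one option is the GitHub version:
# remotes::install_github("mlr-org/mlr3proba")
```

If the key is reported as missing, restarting the R session after updating is advisable so that the newly installed package version is the one that gets loaded.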
-
Hello,
I'm doing an ML survival study using the {mlr3proba} package with three learners: "surv.rfsrc", "surv.xgboost" and "surv.penalized". I want to predict the survival time for each individual and compare the three learners (using RMSE and C-index). Could you please explain how I can use {mlr3pipelines} with distrcompositor and crankcompositor to do that?
The following is my code:
Created on 2024-02-06 with reprex v2.1.0