Saving and Loading Boosters #189

Closed
gyansinha opened this issue Aug 30, 2023 · 13 comments
gyansinha commented Aug 30, 2023

I'd like to save a trained booster and transport it to another machine (estimation on Linux, prediction on Windows). I know the save function writes it out to a JSON file, like this:

```julia
bst         = xgboost(dtrain;  watchlist=watchlist, params...)
model_fname = "xgboost_$(target)_$(replace(string(now()), ":" => "_"))_model.json"
XGBoost.save(bst, model_fname)
```

How do I load the model JSON file on the other machine without also transporting the dtrain object? The load (and load!) methods, as well as the Booster constructor, all seem to expect a DMatrix (or Array). This behaviour seems inconsistent with how the Python and R wrappers work.

Thanks for your help.

@gyansinha (Author)

FWIW, the example snippet from tests:

```julia
model_fname = "model.json"
bst2 = Booster(DMatrix[])
XGBoost.load!(bst2, model_fname)
```

results in an empty Booster when I load the JSON downloaded from Linux to Windows.

@ExpandingMan (Collaborator)

You should be able to load models saved with XGBoost.save with XGBoost.load. I was under the impression that save would dump the object in a binary format, not a JSON.

Note also that whether the exported model is compatible across versions is up to libxgboost, and I don't know what compatibility it claims; you can try checking the docs. If you are still having problems, first verify that a model exported with save can be retrieved with load on the same runtime. If that does not work, it may be a bug. If it works within the same program but not when exported to Windows, it is likely that libxgboost doesn't support exports across subtly different versions.
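A same-runtime round-trip check along these lines would rule out the export step (a sketch, assuming `dtrain` and `dtest` exist as in the snippets in this thread; `num_round=10` is an arbitrary choice):

```julia
using XGBoost

# Train and save as in the original snippet
bst = xgboost(dtrain; num_round=10)
XGBoost.save(bst, "model.json")

# Reload into a fresh Booster that carries no training data
bst2 = XGBoost.load!(Booster(DMatrix[]), "model.json")

# Verify the round trip by comparing predictions rather than metadata
@assert predict(bst, dtest) ≈ predict(bst2, dtest)
```

If the assertion passes locally but the same file fails on the other machine, the problem is in the transport or version step, not in save/load itself.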

@gyansinha (Author)

I went through a process of elimination, and the save/load procedure does work properly on the same runtime. It does not work when transporting from Linux to Windows: the booster remains empty after the load! command.

@ExpandingMan (Collaborator)

The silent failure doesn't seem like good behavior to me; it might be worth opening an issue with xgboost. I was not able to find anything in the documentation about what it considers a valid version difference, and I am confused by the fact that there seem to be three different methods for dumping a model, only one of which is explicitly documented as not intended for re-constructing the model.

I am re-opening this because this should not silently fail, it should either work or throw an error. As of right now we can't implement that in XGBoost.jl because I don't even know how to check whether libxgboost thinks it's a compatible version.

ExpandingMan reopened this Aug 30, 2023
ExpandingMan commented Aug 30, 2023

Also, if the xgboost version is exactly the same between the Linux and Windows machines, this might be a Windows-specific bug.

If it's working in the Python and R wrappers, it ought to work here, I just don't know what's going on without knowing what those wrappers are doing.

@gyansinha (Author)

Actually, I did some more digging and am no longer confident that save/load! works in the Linux runtime with an empty DMatrix. For example:

```julia
bst         = xgboost(dtrain;  watchlist=watchlist, params...)
model_fname = "../models/xgboost/xgboost_$(target)_$(replace(string(now()), ":" => "_"))_model.json"
XGBoost.save(bst, model_fname)

model_booster_2 = XGBoost.load!(XGBoost.Booster(DMatrix[]), model_fname)
XGBoost.load!(model_booster_2, model_fname)

julia> model_booster_2.feature_names
String[]
```

This works:

```julia
model_booster = XGBoost.Booster(dtest);

julia> model_booster.feature_names
108-element Vector{String}:
....
```

Unless I am missing something, this looks like a bug, or I am just using the functions the wrong way.

@ExpandingMan (Collaborator)

The feature names are expected not to be saved (this is something that should be documented but isn't). This is because the libxgboost model object doesn't support them, so they have to be saved in the Julia object, and we don't write this to disk because then we'd have to come up with our own (incompatible) file format. The python wrapper documents that it behaves the same way for the same reason. The only reliable way that I'm aware of to check if the model object is being properly loaded is to check the output of predict. This is unfortunate, but it's a constraint of the library we are wrapping.
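Concretely, a check along these lines (a sketch; `dtest` and the saved `model.json` are assumed from earlier in the thread) would distinguish the expected loss of metadata from a genuinely empty model:

```julia
using XGBoost

loaded = XGBoost.load!(Booster(DMatrix[]), "model.json")

# feature_names are Julia-side metadata not written by save,
# so an empty vector here is expected and is not a failure
isempty(loaded.feature_names)

# The model itself should still predict; all-zero output would
# suggest the booster did not actually load
any(!iszero, predict(loaded, dtest))
```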

@ExpandingMan (Collaborator)

@trivialfis , is there anything you can tell us about how we should be checking for this? Are we safe in assuming that if it loads without error then the library thinks it's a valid object? If that's the case there is likely a bug somewhere since @gyansinha couldn't load the object in windows, though I can't rule out that it's a windows-specific bug in the Julia wrapper.

@gyansinha (Author)

> The feature names are expected not to be saved (this is something that should be documented but isn't). [...] The only reliable way that I'm aware of to check if the model object is being properly loaded is to check the output of predict.

If I try to use model_booster_2 with the dtest matrix, it runs without error - but returns zeros. That tells me that somewhere the prediction vector is allocated with zeros but then never properly assigned to.

@gyansinha (Author)

> @trivialfis, is there anything you can tell us about how we should be checking for this? [...] I can't rule out that it's a windows-specific bug in the Julia wrapper.

@ExpandingMan note the last round of tests is on the linux runtimes for both save and load!.

gyansinha commented Aug 30, 2023 via email

@ExpandingMan (Collaborator)

I can't reproduce any problem with this on linux. Are you sure you are on latest? (XGBoost.jl 2.3.2 and XGBoost_jll 1.7.6)

@gyansinha (Author)

> I can't reproduce any problem with this on linux. Are you sure you are on latest? (XGBoost.jl 2.3.2 and XGBoost_jll 1.7.6)

That was it! I was on 2.3.1 on Linux; I just upgraded to 2.3.2, and the load with an empty-DMatrix Booster works fine. I also tested the model replication on Windows, and the predictions all work out correctly. Sorry for the confusion, but we can close this.
