Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add clearing of static_gdscript_cache to GDScriptCache #104281

Merged
merged 1 commit into from
Mar 18, 2025

Conversation

mihe
Copy link
Contributor

@mihe mihe commented Mar 17, 2025

As talked about on Rocket Chat, this relates to an issue that I've been trying to debug for a few days now, but have so far been unsuccessful in figuring out what the exact cause is, hence why I can't provide an issue/MRP for it.

Description

In short, in the relatively large project I'm working with there seems to be some issue where a significant amount of GDScript instances are somehow ending up in a strange state where they're being kept alive by GDScriptCache::static_gdscript_cache, but are somehow not in GDScriptLanguage::script_list, which (as far as I can tell) should rightfully hold every non-freed GDScript instance.

This then leads to these GDScript instances having GDScript::clear called on them too late in the program flow, resulting in this error being tripped in SelfList::List::remove, for every single function in these classes, due to redundant removals from GDScriptLanguage::function_list:

ERR_FAIL_COND(p_elem->_root != this);

The common denominator between these classes seems to be that they all have static variables in them.

I've so far only seen this problem manifest when deleting the .godot folder and opening the project in the editor and then closing it, meaning it's reproducible with --import, with or without --headless, which means we're getting a lot of these errors when doing --headless --import in a CI context.

As best I can tell, this is most likely a regression in 4.4, as this only started happening after we migrated this project from 4.3-stable to 4.4-stable, and latest master (4ef0cd6 as of writing this) does not resolve the issue.

Unfortunately, bisecting this particular issue would take a considerable amount of effort, due to the time it takes to do a clean import in this particular project, along with some other complications, so I can't tell you where the regression originated from at this time.

Sequence of events

As mentioned, the loop here in GDScriptLanguage::finish never seems to iterate over these particular GDScript instances, for reasons I don't quite understand:

// Clear dependencies between scripts, to ensure cyclic references are broken
// (to avoid leaks at exit).
SelfList<GDScript> *s = script_list.first();
while (s) {
// This ensures the current script is not released before we can check
// what's the next one in the list (we can't get the next upfront because we
// don't know if the reference breaking will cause it -or any other after
// it, for that matter- to be released so the next one is not the same as
// before).
Ref<GDScript> scr = s->self();
if (scr.is_valid()) {
for (KeyValue<StringName, GDScriptFunction *> &E : scr->member_functions) {
GDScriptFunction *func = E.value;
for (int i = 0; i < func->argument_types.size(); i++) {
func->argument_types.write[i].script_type_ref = Ref<Script>();
}
func->return_type.script_type_ref = Ref<Script>();
}
for (KeyValue<StringName, GDScript::MemberInfo> &E : scr->member_indices) {
E.value.data_type.script_type_ref = Ref<Script>();
}
// Clear backup for scripts that could slip out of the cyclic reference
// check
scr->clear();
}
s = s->next();
}

... which means that GDScript::clear isn't called for them here:

scr->clear();

... which means that their member functions are not freed as part of that either, as found here:

for (GDScriptFunction *E : clear_data->functions) {
memdelete(E);
}

... which means we'll end up clearing GDScriptLanguage::function_list before any GDScript::clear happens:

function_list.clear();

... and then GDScriptCache will end up being freed as part of uninitialize_gdscript_module, here:

if (gdscript_cache) {
memdelete(gdscript_cache);
}

... which will cause GDScriptCache::static_gdscript_cache to be destroyed as well, thereby freeing the Ref<GDScript> it seems to be holding, which in turn leads to GDScript::clear being called from its destructor, as seen here:

... which then actually frees the member functions (too late) and emits the error from SelfList::List::remove:

for (GDScriptFunction *E : clear_data->functions) {
memdelete(E);
}

Fix/workaround

This pull request fixes this issue by adding what seems to be a missing clearing of GDScriptCache::static_gdscript_cache in GDScriptCache::clear, which according to @vnen might have been left out by mistake.

This results in the affected GDScript instances being freed/cleared as part of GDScriptCache::clear before we go to clear GDScriptLanguage::function_list, thereby preventing this redundant removal from happening, and thus preventing the SelfList error from being reported.

While this change does resolve the problem, and might still be warranted on its own, it still seems like a bit of a workaround to what might be a bigger issue here.

@mihe mihe added bug topic:gdscript regression cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release labels Mar 17, 2025
@mihe mihe added this to the 4.5 milestone Mar 17, 2025
@mihe mihe requested a review from a team as a code owner March 17, 2025 17:21
@akien-mga akien-mga merged commit 500d005 into godotengine:master Mar 18, 2025
20 checks passed
@akien-mga
Copy link
Member

Thanks!

@akien-mga
Copy link
Member

Cherry-picked for 4.4.1.

@akien-mga akien-mga removed the cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release label Mar 18, 2025
@mihe mihe deleted the clear-static-gdscript-cache branch March 26, 2025 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants