!!!TASK: Reduce complexity of ReflectionService #3443

Merged (8 commits) on Feb 13, 2025

kitsunet
Member

@kitsunet kitsunet commented Feb 7, 2025

This change tackles some problems within the reflection service that stem from historically increasing complexity due to various caching mechanisms depending on application context and compile time status.

The aim was to cut down on this complexity, while ensuring that all existing use-cases continue working as intended.

This ultimately also fixes issue #3402 by providing the same reflection data across all possible contexts.

A few features and caches were deprecated with this change, which could be breaking in the rare case that you used the package freezing API in your code:

The entire concept of freezing a package is deprecated

What remains are the commands in the package controller, which are now all no-ops and deprecated to be removed with 9.0. This is to ensure deployment pipelines possibly calling freeze commands do not break with the 8.4 update.

Additionally, the single method PackageManager::isPackageFrozen remains, while the rest was removed. None of these methods were ever API, and it seems unlikely that anyone used them in user-land code. isPackageFrozen, however, is used at least in Framework and Neos code and therefore remains until 9.0, but will now return false for every package.

Caches deprecated and unused

With the simplification, two caches are no longer needed. Both are still declared so that any existing cache configuration in user projects doesn't error, but both

Flow_Reflection_Status

and

Flow_Reflection_CompiletimeData

will no longer be used and any content can be removed.

The only reflection cache is now Flow_Reflection_RuntimeData, which makes the name somewhat deceptive, as it is also used at compile time. To avoid backwards compatibility issues, however, it makes sense to keep the name for the foreseeable future.

Quick performance comparisons suggest that especially the initial compile from empty cache benefits from this change. Reflection updates in Development context afterwards seem to be on par with the existing code base.

Checklist

  • Code follows the PSR-2 coding style
  • Tests have been created, run and adjusted as needed
  • The PR is created against the lowest maintained branch
  • Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
  • Reviewer - The first section explains the change briefly for change-logs
  • Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions

@kitsunet
Member Author

kitsunet commented Feb 7, 2025

Future steps here IMHO could/should be to transfer the huge array structure of reflection data into DTOs and possibly split the code that creates the reflection data into a compile-time ReflectionService, as we shouldn't need to reflect anything during runtime, which would make the code easier to reason about. But I refrained from adding those changes here to avoid making this a needlessly large change for 8.4.

Member

@mhsdesign mhsdesign left a comment

Thank you sooo much for the change! I still have to actually test it and also fully understand the new reflection logic, but from reading it briefly I love it. I remember attempting to fix things in the reflection before, but the whole initialisation process is just so hard to wrap your head around, and every time I got a grasp of it I later forgot it again. Because of that, could you please briefly describe what the expected difference in cache behaviour/performance during Development/Testing and Production might be? Anyway, all this freezing functionality will finally be a thing of the past, and it definitely does not justify any of the complexity! So super glad to have it gone.

@kitsunet
Member Author

kitsunet commented Feb 10, 2025

So, what this thing did before:

buildReflectionData is the starting point for actually doing reflection and is triggered via the CompiletimeObjectManager (only). The whole system is a bit disconnected and might be another candidate for future refactoring.
buildReflectionData gets all known classes from the CompiletimeObjectManager, then goes ahead to call forgetChangedClasses, which checks if any of the classes no longer exist in the status cache (now removed). Those cache entries get removed from the cache directly in the CacheManager, based on possibly modified files found by the FileMonitor. This obviously has to happen beforehand. It would be nicer if these things were handled closer together (e.g. the FileMonitor provides the changes to the ReflectionService or so, TBD).
Afterwards reflectEmergedClasses will go through the classes and reflect those not yet reflected.

Then the second big part is the saveToCache method, called via signal.
This would do a bunch of different things. updateReflectionData would write one big cache entry to the reflection compiletime cache (now removed) if there was changed reflection data, BUT only if two other conditions didn't match. One was ($applicationContext = PRODUCTION && runtimecache frozen): no need to write anything because the caches were frozen. The other was ("loadFromClassSchemaRuntimeCache" true), which indirectly was the same condition as above, decided in initialize. IMHO that second condition should never happen, as otherwise we should've returned early at the first one already.

Additionally, in Production context we would write the runtime cache (the only one now) and freeze it afterwards, BUT this production cache did not contain all the data the compiletime cache contained, leading to bug #3402.
The runtime cache contains separate entries for all classes, unlike the compiletime cache, which is one big file. This is important later to understand what I did.
Additionally, it stores an array of all known classes as an entry (also important later).

And finally, in Development we would write an additional big cache file outside of any cache backend that would contain about the same as the compiletime cache. This was called freezePackageReflection even though it has nothing to do with packages, package freezing or cache freezing. It was simply a separate cache file, but this is gone now too.

initialize then decided which of these many caches should be used.

What happens WITH THIS CHANGE:

We will discuss this in a slightly different order to make more sense.
saveToCache checks if there is updated reflection data and if so runs updateCacheEntries (renamed from saveProductionData), which basically works as before BUT now stores all necessary data. This uses the reflection runtime and schemata caches. It will no longer freeze caches.

buildReflectionData is still the entry point for (re)generating caches and the flow is as before: we get all known classes from the CompiletimeObjectManager, then we forgetChangedClasses, which no longer uses the status cache but directly checks the runtime cache for the class entries. This is now also the cache where entries get removed via the CacheManager. This cuts out the status cache, which was only necessary because at compile time we had that big single-file cache entry that we couldn't remove individual entries from, so the status cache was a workaround for that.

Finally, reflectEmergedClasses pretty much works as before, with the added special case that we prefill the reflectionData with "empty" keys for all known classes, coming from the additional cache entry __classNames in our runtime cache. This list is then reduced in forgetChangedClasses if necessary, so that we have a complete list of currently reflected classes to compare the new list to.
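
The prefill-then-compare idea can be sketched roughly as follows. This is only an illustration under assumptions: diffClassNames, its parameters and the Acme class names are hypothetical and not part of Flow, and the real ReflectionService works against cache backends rather than plain arrays.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of Flow: given the class names stored in the
 * cache, the class names currently available and the classes whose files
 * changed, work out what to forget and what to reflect.
 *
 * @param list<string> $cachedClassNames
 * @param list<string> $availableClassNames
 * @param list<string> $changedClassNames
 * @return array{toReflect: list<string>, toForget: list<string>}
 */
function diffClassNames(array $cachedClassNames, array $availableClassNames, array $changedClassNames): array
{
    // Prefill with "empty" entries keyed by class name.
    $reflectionData = array_fill_keys($cachedClassNames, []);

    // Forget changed classes and classes that are no longer available.
    $toForget = [];
    foreach (array_keys($reflectionData) as $className) {
        $changed = in_array($className, $changedClassNames, true);
        $gone = !in_array($className, $availableClassNames, true);
        if ($changed || $gone) {
            unset($reflectionData[$className]);
            $toForget[] = $className;
        }
    }

    // Every available class without a remaining entry has to be reflected.
    $toReflect = array_values(array_filter(
        $availableClassNames,
        static fn (string $className): bool => !isset($reflectionData[$className])
    ));

    return ['toReflect' => $toReflect, 'toForget' => $toForget];
}

$result = diffClassNames(
    ['Acme\\Foo', 'Acme\\Bar', 'Acme\\Gone'],
    ['Acme\\Foo', 'Acme\\Bar', 'Acme\\NewOne'],
    ['Acme\\Bar']
);
```

In this sketch the unchanged class Acme\Foo is neither forgotten nor reflected again, which is the point of keeping the list of known class names in the cache.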

initialize now always loads the "compounded" cache entries from the runtime cache (so __classNames, __annotatedClasses, and __classesByMethodAnnotations), and class reflection is loaded as needed via loadOrReflectClassIfNecessary, which is pretty much used everywhere now to make sure we prefer the cache entries.
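
A heavily simplified sketch of the "prefer the cache entry, reflect only if necessary" pattern, assuming an in-memory array as a stand-in for the runtime cache. The class LazyReflectionSketch, its method names and the shape of the stored data are made up for illustration and do not mirror the actual ReflectionService.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the lazy loading idea; all names are illustrative.
 */
final class LazyReflectionSketch
{
    /** @var array<string, array<string, mixed>> data already loaded in this request */
    private array $loaded = [];

    /** How often real reflection was needed because no cache entry existed. */
    public int $reflectCount = 0;

    /** @param array<string, array<string, mixed>> $cache stand-in for the runtime cache */
    public function __construct(private array $cache = [])
    {
    }

    /** @return array<string, mixed> */
    public function loadOrReflect(string $className): array
    {
        if (isset($this->loaded[$className])) {
            return $this->loaded[$className];
        }
        // Prefer the cache entry; only reflect when there is none.
        if (!isset($this->cache[$className])) {
            $reflection = new \ReflectionClass($className);
            $this->reflectCount++;
            $this->cache[$className] = [
                'isInterface' => $reflection->isInterface(),
                'methods' => array_map(
                    static fn (\ReflectionMethod $method): string => $method->getName(),
                    $reflection->getMethods()
                ),
            ];
        }
        return $this->loaded[$className] = $this->cache[$className];
    }
}

$sketch = new LazyReflectionSketch();
$first = $sketch->loadOrReflect(\ArrayObject::class);
$second = $sketch->loadOrReflect(\ArrayObject::class);
```

The second call is served from the already loaded data, so reflection runs only once per class in this sketch.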

The class schemata weren't mentioned before, they are basically a sideline and contain data for doctrine ORM retrieved via reflection, so it's kinda in the right place but kinda not. Might also warrant another refactoring, but another day.

@kitsunet
Member Author

Note: I might have found two little things to improve / fix. They shouldn't stop you from reviewing, but should be done before merge.

The new caching loads lazily, so we need to make sure that the interfaces
are actually loaded when trying to remove implementation classes.
Otherwise the initialize might happen for no good reason if the
ReflectionService was never used.
Some docblock changes and minimal code improvements, might be nice to
have for future refactorings.
@kitsunet
Member Author

kitsunet commented Feb 11, 2025

Here we go, split in separate commits to make it easier to review. the last commit is a lot of foobar adaptions to docblocks and code style, nothing functional.

Member

@kdambekalns kdambekalns left a comment

Had a close look, left some comments. Generally fine, so 👍 already.

Neos.Flow/Classes/Package/PackageManager.php (resolved)
Neos.Flow/Classes/Package/PackageManager.php (resolved)
Neos.Flow/Configuration/Caches.yaml (resolved)
Neos.Flow/Classes/Reflection/ReflectionService.php (outdated, resolved)
  foreach ($classNames as $className) {
-     if (!$this->statusCache->has($this->produceCacheIdentifierFromClassName($className))) {
+     if (is_string($className) && !$this->reflectionDataRuntimeCache->has($this->produceCacheIdentifierFromClassName($className))) {
Member

What – if not a string – can $className be here?

Member Author

I had an example where, for some reason I couldn't figure out, that was a bool, so I just wanted to be sure here....

Member

i think that must have been in a previous iteration of the code as array_keys is used which can NEVER return a bool as that is not a valid array key.
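
A small standalone illustration of that point, using made-up array content: PHP normalises array keys to int or string, so array_keys() can yield integers (for example from a bool or numeric-string key) but never a bool.

```php
<?php
declare(strict_types=1);

// PHP normalises array keys: bools and numeric strings become ints,
// other strings stay strings.
$data = [
    true => 'bool key is cast to int 1',
    'Acme\\Foo' => 'class names stay strings',
    '42' => 'numeric string is cast to int 42',
];

// Collect the type of every key as reported by PHP itself.
$keyTypes = array_map('get_debug_type', array_keys($data));
```

Under that reading, an is_string() guard on values coming from array_keys() can only ever filter out integer keys.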

kitsunet and others added 3 commits February 13, 2025 15:30
The method was deprecated and the concept gutted,
this would return empty array anyways.
Co-authored-by: Karsten Dambekalns <[email protected]>
Comment on lines 37 to 41
use ReflectionIntersectionType;
use ReflectionNamedType;
use ReflectionType;
use ReflectionUnionType;
use RuntimeException;
Member

We agreed that we don't want to import first-level things and use \ instead - will fix that.
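
For readers unfamiliar with the convention, a minimal sketch of what referencing global classes with a leading backslash instead of importing them looks like. The namespace Acme\Sketch and the function describeType are hypothetical and not taken from the PR.

```php
<?php
declare(strict_types=1);

namespace Acme\Sketch;

// No "use RuntimeException;" style imports: classes from the global
// namespace are referenced with a leading backslash instead.
function describeType(\ReflectionType $type): string
{
    if ($type instanceof \ReflectionNamedType) {
        return $type->getName();
    }
    if ($type instanceof \ReflectionUnionType) {
        return implode('|', array_map(
            static fn (\ReflectionType $inner): string => describeType($inner),
            $type->getTypes()
        ));
    }
    throw new \RuntimeException('Unsupported type', 1);
}

$parameter = (new \ReflectionFunction(static fn (int $value): int => $value))->getParameters()[0];
$type = $parameter->getType();
$described = $type === null ? 'mixed' : describeType($type);
```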

Member

Fixed via 82452e8

mhsdesign added a commit to mhsdesign/flow-development-collection that referenced this pull request Feb 13, 2025
Also removes declarations for not needed cache configurations:

Flow_Reflection_Status
Flow_Reflection_CompiletimeData

if you have any custom declaration for these please remove them.
mhsdesign added a commit to mhsdesign/neos-development-collection that referenced this pull request Feb 13, 2025
Member

@mhsdesign mhsdesign left a comment

Okay, now I'm wondering what's with the FreezableBackendInterface - do we want to deprecate that as well?

Also, the word "frozen" still appears twice in the inline docs of the reflection service. Probably the whole doc block on how it works is outdated - but maybe we can put your nice outline from above in there?

Tested it by upmerging into 9.0 (my 8.4 setup is broken^^). Anyway, there are only a few merge conflicts, and besides an error at start (which is expected) and cache flushing, the Neos Demo runs fine.

Warning: unserialize(): Extra data starting at offset 283 of 307 bytes in Framework/Neos.Cache/Classes/Frontend/VariableFrontend.php line 94

I also prepared a followup for 9.0 to remove the dead apis #3446
and a neos change to remove the usages as well: neos/neos-development-collection#5467

@kitsunet
Member Author

FreezableBackendInterface yep it's also for shit like this, that said, can be deprecated but didn't really feel like the scope here.

@kitsunet
Member Author

Warning: unserialize(): Extra data starting at offset 283 of 307 bytes in Framework/Neos.Cache/Classes/Frontend/VariableFrontend.php line 94

Right, that's because of the change to SimpleFileBackend, which doesn't expect stuff after the payload, but it's also faster because of that.
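
The PHP side of this warning can be reproduced in isolation. The sketch below assumes nothing about the actual cache file layout; the trailing suffix is made up and simply stands in for whatever a previous backend may have written after the serialized payload.

```php
<?php
declare(strict_types=1);

// Collect warnings instead of printing them, so the effect is visible.
$warnings = [];
set_error_handler(static function (int $severity, string $message) use (&$warnings): bool {
    $warnings[] = $message;
    return true;
});

$payload = serialize(['className' => 'Acme\\Foo']);
// Simulate an entry with extra bytes after the payload (made-up suffix).
$entryWithTrailer = $payload . 'some-trailing-metadata';

$value = unserialize($entryWithTrailer);
restore_error_handler();

// $value is still the original array. On PHP 8.3+ $warnings holds an
// "Extra data starting at offset ..." message; older versions stay silent.
```

Since the payload itself still unserializes correctly, flushing the old entries should be enough to make the warning go away, which fits the "expected error at start" mentioned above.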

@kitsunet kitsunet merged commit c3204e5 into neos:8.4 Feb 13, 2025
8 checks passed
Member

@mhsdesign mhsdesign left a comment

just some other things i found when looking at the code - id still also like the inline docs to be fixed as i wrote above ^^ -> will make a followup pr to tackle my code findings ;)

Edit: #3447

Comment on lines +1076 to +1080
$availableClassnames = [];
foreach ($this->availableClassNames as $classNamesInPackage) {
    $availableClassnames[] = $classNamesInPackage;
}
$classNamesToReflect = array_merge([], ...$availableClassnames);
Member

okay that part got me first confused ^^ clever though :D

It first looked like Object.assign({}, copyFrom) in JavaScript, but it's way better: we unpack all arrays and then merge them together into a flat list.

we can make that a little more simple even:

Suggested change
- $availableClassnames = [];
- foreach ($this->availableClassNames as $classNamesInPackage) {
-     $availableClassnames[] = $classNamesInPackage;
- }
- $classNamesToReflect = array_merge([], ...$availableClassnames);
+ // flatten nested array structure to a list of classes
+ $availableClassnames = array_merge(...array_values($this->availableClassNames));

We don't need the [] because, since PHP 7.4.0, array_merge() can be called without any parameters, in which case it returns an empty array.

And we still need the array_values to prevent: Uncaught ArgumentCountError: array_merge() does not accept unknown named parameters
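
A standalone version of the flattening, with made-up package keys and class names, to show both points: array_values() drops the string keys before spreading, and the empty case needs no extra [] argument.

```php
<?php
declare(strict_types=1);

// Class names grouped by package key (package and class names are made up).
$availableClassNames = [
    'Acme.Foo' => ['Acme\\Foo\\A', 'Acme\\Foo\\B'],
    'Acme.Bar' => ['Acme\\Bar\\C'],
];

// array_values() drops the string keys, so the spread passes positional
// arguments; array_merge() then concatenates the lists.
$classNamesToReflect = array_merge(...array_values($availableClassNames));

// With no packages at all, array_merge() is called without arguments
// and returns an empty array (PHP 7.4+).
$none = array_merge(...array_values([]));
```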

    return;
}

if (!$this->initialized) {
    $this->initialize();
Member

Initialising here when shutting down doesn't make sense and can never be the case, I think - but it was like that beforehand as well ...

Comment on lines +1158 to +1160
if (!isset($this->classReflectionData[$className][self::DATA_INTERFACE_IMPLEMENTATIONS]) && $class->isInterface()) {
    $this->classReflectionData[$className][self::DATA_INTERFACE_IMPLEMENTATIONS] = [];
}
Member

We don't need that explicitly here, but it doesn't hurt either, right? It should be done automatically in addImplementedInterface.

mhsdesign added a commit to mhsdesign/flow-development-collection that referenced this pull request Feb 13, 2025
@mhsdesign
Member

mhsdesign commented Feb 13, 2025

Ha funny i found the original refactoring that introduced this production vs development distinction :D

ba1f5fb

And I did the bare minimum of performance testing in dev mode: I ran time ./flow flow:core:compile from cold caches and once after a file change, and I could not see any immediate performance difference between before and after.
