Improve validate performance #1043
Merged
This is a new PR, opened because of an invalid merge in PR #1005.
I have also tried to address the review feedback. I removed the cache property, which I had forgotten to remove earlier. I have a proof of concept for caching, but it gets pretty messy because I need to exclude a few rules that depend on the data from being cached. I will open a separate PR with that later for feedback.
Original PR text below:
Optimize Validation for Large Nested Collections
Problem:
Validating large arrays (e.g., 5,000+ items) of nested Data objects using `Data::validate()` can be significantly slower than native Laravel validation when the nested Data objects do not define a dynamic `rules()` method. This is due to the overhead of `Illuminate\Validation\Rule::forEach` being used internally even for static nested rules.
Benchmark Setup:
To demonstrate the issue and the improvement, the following self-contained benchmark test code was used (within the package's test environment):
Benchmark Results (Avg of 3 runs):
Before Optimization:
After Optimization:
Performance Summary:
Bonus: Impact of Upstream Laravel `Arr::dot` Optimization
It's worth noting that recent versions of Laravel (the changes from PR #55495) significantly optimized the `Arr::dot` method, which is used internally by the validator when handling wildcard rules.
Running the same benchmark test against a Laravel version incorporating this `Arr::dot` optimization shows improved performance for the Native Laravel Validator baseline, while our `laravel-data` optimization continues to provide a substantial benefit:
Benchmark Results (With Optimized `Arr::dot`):
This demonstrates that even with the upstream improvements to Laravel's validator internals, directly bypassing `Rule::forEach` for static nested collections in `laravel-data` offers further significant performance gains for this specific use case.
Proposed Solution:
This PR modifies `DataValidationRulesResolver::resolveDataCollectionSpecificRules`:
- It checks whether the nested Data class (e.g., `BenchmarkIdData`) has a `rules()` method using `method_exists()`.
- If no `rules()` method exists, it iterates the collection payload and directly calls `$this->execute()` for each item, adding the generated rules to the main ruleset. This bypasses `Rule::forEach`.
- If a `rules()` method does exist, it falls back to the original `Rule::forEach` logic to ensure dynamic rules are handled correctly.
Testing:
All existing package tests pass with this change.
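For readers who want to see the shape of the branching described under Proposed Solution, here is a minimal, self-contained sketch in plain PHP. It is not the code from this PR and does not use the package's classes: `StaticItemData`, `DynamicItemData`, `staticRules()` and `resolveCollectionRules()` are hypothetical names invented for illustration, and the closure stored under the wildcard key only stands in for the `Rule::forEach` fallback.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a nested Data class WITHOUT a rules() method.
// Its rules are static (imagine them derived from typehints and attributes).
final class StaticItemData
{
    public static function staticRules(): array
    {
        return ['id' => ['required', 'integer']];
    }
}

// Hypothetical stand-in for a nested Data class WITH a rules() method,
// whose rules depend on the payload of each item.
final class DynamicItemData
{
    public static function rules(array $payload): array
    {
        return ['id' => ($payload['strict'] ?? false) ? ['required', 'integer'] : ['nullable']];
    }
}

/**
 * Builds the rule set for a collection stored under $path (illustrative only).
 *
 * @param class-string $dataClass
 * @param array<int, array<string, mixed>> $items
 * @return array<string, mixed>
 */
function resolveCollectionRules(string $dataClass, string $path, array $items): array
{
    if (method_exists($dataClass, 'rules')) {
        // Dynamic rules: keep a per-item callback under a wildcard key.
        // This plays the role of the Rule::forEach fallback in the description.
        return ["{$path}.*" => fn (array $item): array => $dataClass::rules($item)];
    }

    // Static rules: resolve them once, then expand under concrete item keys.
    $fieldRules = $dataClass::staticRules();
    $rules = [];
    foreach (array_keys($items) as $index) {
        foreach ($fieldRules as $field => $ruleList) {
            $rules["{$path}.{$index}.{$field}"] = $ruleList;
        }
    }

    return $rules;
}

$items = [['id' => 1], ['id' => 2]];
$static = resolveCollectionRules(StaticItemData::class, 'items', $items);
$dynamic = resolveCollectionRules(DynamicItemData::class, 'items', $items);
```

In this sketch the static branch yields concrete keys such as `items.0.id`, while the dynamic branch keeps a single `items.*` entry that is resolved per item later. The real resolver works with the package's own rule and path objects, so treat this only as an outline of the control flow.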
Impact:
This provides a significant performance improvement (~95x faster in the benchmark case) for validating large collections of nested data objects with static rules (defined via typehints and attributes), without affecting validation for objects using dynamic `rules()` methods.
Feedback:
Let me know if you have any feedback on this. I am no expert on all of this package's internals, but I am running this PR in my project and it seems to work fine.