⚡️ Speed up method FsmWithContext._check_transitions by 371%
#78
📄 371% (3.71x) speedup for `FsmWithContext._check_transitions` in `wandb/sdk/lib/fsm.py`
⏱️ Runtime: 321 microseconds → 68.2 microseconds (best of 67 runs)
📝 Explanation and details
The optimized code achieves a 371% speedup by replacing expensive `isinstance()` calls with fast `type() in tuple` lookups and by reducing attribute access overhead.

Key Optimizations:
- Precomputed Type Tuples: During initialization, the code creates tuples of state types for each protocol (`_fsm_state_exit_types`, `_fsm_state_stay_types`, etc.). This converts runtime `isinstance(obj, Protocol)` checks into `type(obj) in precomputed_tuple` lookups, which are significantly faster.
- Attribute Access Reduction: Local variables (`state`, `state_type`) cache frequently accessed attributes, eliminating repeated `self._state` and `type(self._state)` calls within tight loops.
- Table Lookup Optimization: In `_check_transitions`, the table lookup `self._table[type(self._state)]` is computed once and stored in `entries`, avoiding repeated dictionary lookups.

Why This Works:
- `isinstance()` calls dominated the original runtime (59.5% + 13.3% + 22.1% = ~95% of `_transition` time).
- Type lookups (`type(x) in tuple`) are effectively constant-time for small tuples and much faster than `isinstance()` with protocol classes.
- Attribute access (`self._state`) requires an attribute lookup on the instance each time, while local variables are read directly from the function's frame.

Performance by Test Case:
The optimization excels particularly with high-transition workloads.
This optimization maintains identical behavior while dramatically improving performance for FSM-heavy workloads.
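The precomputed type tuple technique described above can be sketched roughly as follows. This is a minimal illustration, not the code from `wandb/sdk/lib/fsm.py`: the names `SupportsExit`, `Idle`, `Running`, `TinyFsm`, `_exit_types`, `can_exit_slow`, and `can_exit_fast` are all hypothetical and chosen only for this example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsExit(Protocol):
    """Hypothetical protocol standing in for one of the FSM state protocols."""

    def on_exit(self) -> None: ...


class Idle:
    """Hypothetical state that implements the protocol."""

    def on_exit(self) -> None:
        pass


class Running:
    """Hypothetical state that does not implement the protocol."""


class TinyFsm:
    """Illustrative stand-in for an FSM; not the real FsmWithContext."""

    def __init__(self, states):
        self._state = states[0]
        # Pay the isinstance() cost once per known state at construction time
        # and remember which concrete types satisfy the protocol.
        self._exit_types = tuple(
            type(s) for s in states if isinstance(s, SupportsExit)
        )

    def can_exit_slow(self) -> bool:
        # Runtime protocol check on every call.
        return isinstance(self._state, SupportsExit)

    def can_exit_fast(self) -> bool:
        # Cache the attribute in a local, then do an exact-type membership test.
        state = self._state
        return type(state) in self._exit_types


fsm = TinyFsm([Idle(), Running()])
same_answer_for_idle = fsm.can_exit_slow() == fsm.can_exit_fast()
```

One caveat worth keeping in mind with this approach: `type(x) in tuple` matches exact types only, so an instance of a subclass that was not registered at construction time would be treated differently than it would by `isinstance()`. The two checks agree only when every state type the machine can hold is known up front, which this sketch assumes.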
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-FsmWithContext._check_transitions-mhdq0pkk` and push.