Here is a concise summary of the major changes in this LLVM IR diff:
Vectorization of Small Struct Allocations and Loads/Stores:
In many places, stack allocations of small structs such as { float, float } or { i64, i64 } are replaced by values of vector type (<2 x float>, <2 x i64>, <4 x i32>, etc.), and the data is moved with plain load/store instructions instead of alloca + memcpy. This is consistent with improved SROA (Scalar Replacement of Aggregates) and vectorization, especially for 2-field structs representing geometric data (e.g., colors, vectors, points).
Elimination of Temporary Alloca + memcpy Patterns:
Code patterns using a temporary stack allocation (alloca) followed by memcpy to swap or copy small aggregates (e.g., b2Vec2, FBX::Light::Color, MetadataBlockInfo) are replaced with direct vector loads/stores and phi nodes. This removes unnecessary memory traffic and lifetime management (llvm.lifetime.start/end calls are removed).
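As a source-level illustration of the pattern described above, here is a minimal C++ sketch. `Vec2` and `swap_via_temp` are hypothetical names chosen for this example (the struct merely stands in for a small two-float aggregate such as b2Vec2); they are not taken from the diff. A swap written this way is the kind of code that can lower to a temporary alloca plus memcpy calls when the temporary is not promoted, and to plain register-sized loads and stores when it is.

```cpp
#include <cstring>

// Hypothetical stand-in for a small two-float aggregate such as b2Vec2.
struct Vec2 {
    float x;
    float y;
};

// Swap two small aggregates through an explicit temporary.
// The three fixed-size copies are candidates for being turned into
// direct loads and stores, with the temporary living in a register.
inline void swap_via_temp(Vec2& a, Vec2& b) {
    Vec2 tmp;
    std::memcpy(&tmp, &a, sizeof(Vec2));  // tmp = a
    std::memcpy(&a, &b, sizeof(Vec2));    // a = b
    std::memcpy(&b, &tmp, sizeof(Vec2));  // b = tmp
}
```

Whether a given compiler version actually emits a single <2 x float> load/store pair for this function depends on the target and optimization level; the sketch only shows the shape of the source pattern.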
Improved Handling of std::function Move/Assignment:
In several LLVM and C++ standard library modules (e.g., SmallVectorImpl<std::function>), move operations now use vector loads/stores (<2 x i64>) for the std::function’s internal storage instead of byte-wise memcpy. This includes aligning allocas to 16 bytes and updating pointer arithmetic accordingly.
Refinement of Sorting and Swapping Logic:
Sorting routines (e.g., __introsort_loop, __insertion_sort, __unguarded_linear_insert) across multiple benchmarks (hermes, duckdb, opencv, open3d) eliminate temporary alloca-based swap buffers in favor of direct vector loads/stores into loop-carried values, reducing stack usage and improving locality.
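The insertion step of such sorting routines can be sketched as follows. This is an illustrative C++ version, not the libstdc++ implementation: `Key`, `less_than`, and `insertion_sort` are hypothetical names, and the two-i64 struct is only an assumed example of a small aggregate. The held element `val` is the value that, per the summary above, would be kept as a loop-carried value rather than in a stack buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte element, comparable to a { i64, i64 } struct.
struct Key {
    std::int64_t hi;
    std::int64_t lo;
};

// Lexicographic comparison on (hi, lo).
inline bool less_than(const Key& a, const Key& b) {
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
}

// Plain insertion sort. `val` holds the element being inserted while
// larger elements are shifted one slot to the right.
inline void insertion_sort(Key* data, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        Key val = data[i];
        std::size_t j = i;
        while (j > 0 && less_than(val, data[j - 1])) {
            data[j] = data[j - 1];  // shift the larger element right
            --j;
        }
        data[j] = val;  // drop the held element into its slot
    }
}
```

Each `Key` copy here is a fixed 16-byte move, which is the size at which a single <2 x i64> load or store becomes a natural lowering on targets with 128-bit vector registers.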
Cleanup of Redundant Struct Definitions and Phi Node Updates:
Minor but consistent cleanups include removing unused struct type declarations (e.g., PyCompilerFlags in cpython), replacing i64-based loads/stores of struct fields with vector equivalents, and updating phi node incoming values to match the updated basic block predecessors, which keeps the IR valid after control-flow restructuring.
These changes collectively reflect aggressive SROA, better vector type inference, and more precise memory access modeling, leading to reduced stack allocations, eliminated redundant copies, and improved code generation for small aggregate data.
Link: llvm/llvm-project#165159
Requested by: @yxsamliu