pre-commit: PR165159 by zyw-bot · Pull Request #3520 · dtcxzyw/llvm-opt-benchmark

zyw-bot · 2026-02-27T19:13:59Z

Link: llvm/llvm-project#165159
Requested by: @yxsamliu

zyw-bot · 2026-02-27T19:48:05Z

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@c3b3f41
patch: llvm/llvm-project#165159
sha256: 830c900cbc232ae3eca1e3170ef8c9aa6e86661b1f8973e31288ab07ba30a540
commit: 6087fae

112 files changed, 126321 insertions(+), 127205 deletions(-)

Improvements:
  sroa.NumLoadsPredicated 14459 -> 14489 +0.21%
  sroa.NumStoresPredicated 3634 -> 3640 +0.17%
  instcount.NumExtractElementInst 55343 -> 55388 +0.08%
  sroa.NumLoadsSpeculated 316530 -> 316600 +0.02%
  loop-idiom.NumMemSet 38911 -> 38917 +0.02%
  instcount.NumInsertElementInst 90566 -> 90576 +0.01%
  memory-builtins.ObjectVisitorLoad 23200 -> 23202 +0.01%
  attributor.NumAAs 3940676 -> 3940956 +0.01%
  loop-delete.NumDeleted 112114 -> 112121 +0.01%
  memdep.NumCacheCompleteNonLocalPtr 5302316 -> 5302550 +0.00%
Regressions:
  memcpyopt.NumCpyToSet 11953 -> 11944 -0.08%
  instcombine.NumDeadStore 25573 -> 25564 -0.04%
  correlated-value-propagation.NumNonNull 10849100 -> 10847560 -0.01%
  memdep.NumCacheDirtyNonLocalPtr 23019 -> 23017 -0.01%
  instcount.NumAllocaInst 5811514 -> 5811232 -0.00%
  capture-tracking.NumNotCapturedBefore 19317429 -> 19316773 -0.00%
  instcount.NumCallInst 38921359 -> 38920222 -0.00%
  memcpyopt.NumCallSlot 1011065 -> 1011044 -0.00%
  sroa.NumAllocaPartitionUses 266600647 -> 266595399 -0.00%
  instcount.NumSelectInst 1779718 -> 1779683 -0.00%
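
The percentages in the two lists above are consistent with a plain relative change, `(after - before) / before`, printed with a sign and two decimals. The sketch below reproduces three rows under that assumption. The helper name `percent_change` is hypothetical and is not taken from the benchmark tooling.

```python
def percent_change(before: int, after: int) -> str:
    """Format the relative change from `before` to `after` as a signed percentage.

    Assumption: the report rounds to two decimals and keeps the sign,
    so very small changes print as +0.00% or -0.00%.
    """
    delta = (after - before) / before * 100.0
    return f"{delta:+.2f}%"


# Rows copied from the report above: (statistic, before, after).
rows = [
    ("sroa.NumLoadsPredicated", 14459, 14489),
    ("memcpyopt.NumCpyToSet", 11953, 11944),
    ("instcount.NumAllocaInst", 5811514, 5811232),
]

for name, before, after in rows:
    print(f"{name} {before} -> {after} {percent_change(before, after)}")
```

A row such as `instcount.NumAllocaInst` shows why some entries read `-0.00%`: the change is real (282 fewer allocas) but falls below the two-decimal resolution.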

+3 cpython/compile.ll
+3 xgboost/updater_refresh.ll
+1 ffmpeg/avformat.ll
+0 assimp/FBXConverter.ll
+0 box2d/sample_collision.ll
+0 ceres/gradient_problem_solver.ll
+0 ceres/line_search.ll
+0 ffmpeg/ffmpeg_dec.ll
+0 gromacs/colvarparse.ll
+0 opencv/benchmark.ll
+0 opencv/binarizer.ll
+0 opencv/graphsegmentation.ll
+0 opencv/seam_finders.ll
+0 openusd/blendShapeQuery.ll
+0 z3/euf_proof_checker.ll
-1 delta-rs/11f8x98axanecwnw.ll
-2 image-rs/1clnprdgqfw2q9lq.ll
-2 z3/seq_axioms.ll
-3 bullet3/b3DynamicBvhBroadphase.ll
-3 bullet3/btConvexHullComputer.ll
-3 cmake/session.ll
-3 gromacs/lincs.ll
-3 wireshark/sparkline_delegate.ll
-4 hyperscan/rose_build_bytecode.ll
-4 llvm/AArch64O0PreLegalizerCombiner.ll
-4 llvm/AttributorAttributes.ll
-4 llvm/OMPIRBuilder.ll
-4 php/dirstream.ll
-6 duckdb/ub_duckdb_storage_metadata.ll
-7 freetype/ftbase.ll
-8 llvm/AArch64InstructionSelector.ll
-9 open3d/EstimateNormals.ll
-9 opencv/erfilter.ll
-9 opencv/gapi_core_perf_tests.ll
-9 opencv/gnnparsers.ll
-9 velox/GreatestLeast.ll
-12 hermes/Exceptions.ll
-12 openusd/collectionCache.ll
-12 wasmtime-rs/16qf4j2oevjc61uc.ll
-14 llvm/FunctionAttrs.ll
-15 xgboost/updater_approx.ll
-24 velox/ArraySort.ll
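
To see which projects account for most of the change, the per-file list above can be grouped by its leading path component. This is a minimal sketch, assuming each line is a signed integer followed by a `project/file.ll` path. The report does not say what the integer measures, so the code treats it as an opaque delta. The function name `sum_by_project` is hypothetical.

```python
from collections import defaultdict


def sum_by_project(lines):
    """Sum signed per-file deltas, grouped by the path component before '/'."""
    totals = defaultdict(int)
    for line in lines:
        # Each line looks like "-24 velox/ArraySort.ll".
        delta_text, path = line.split(maxsplit=1)
        project = path.split("/", 1)[0]
        totals[project] += int(delta_text)  # int() accepts "+3" and "-15"
    return dict(totals)


# A few lines copied from the list above.
sample = [
    "+3 cpython/compile.ll",
    "+3 xgboost/updater_refresh.ll",
    "-15 xgboost/updater_approx.ll",
    "-9 velox/GreatestLeast.ll",
    "-24 velox/ArraySort.ll",
]

totals = sum_by_project(sample)
for project, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{total:+d} {project}")
```

Sorting by the summed delta puts the projects with the largest negative totals first, matching the direction of the original list read bottom-up.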

github-actions · 2026-02-27T19:49:19Z

Here is a concise summary of the major changes in this LLVM IR diff:

Vectorization of Small Struct Allocations and Loads/Stores:
Multiple instances replace { float, float } or { i64, i64 } struct allocations with vector types (<2 x float>, <2 x i64>, <4 x i32>, etc.), accompanied by corresponding load/store instructions instead of alloca + memcpy. This reflects improved SROA (Scalar Replacement of Aggregates) and vectorization, especially for 2-field structs representing geometric data (e.g., colors, vectors, points).

Elimination of Temporary Alloca + memcpy Patterns:
Code patterns that use a temporary stack allocation (alloca) followed by memcpy to swap or copy small aggregates (e.g., b2Vec2, FBX::Light::Color, MetadataBlockInfo) are replaced with direct vector loads/stores and phi nodes. This removes unnecessary memory traffic and lifetime management (the llvm.lifetime.start/end calls are removed).

Improved Handling of std::function Move/Assignment:
In several LLVM and C++ standard library modules (e.g., SmallVectorImpl<std::function>), move operations now use vector loads/stores (<2 x i64>) for the std::function’s internal storage instead of byte-wise memcpy. This includes aligning allocas to 16 bytes and updating pointer arithmetic accordingly.

Refinement of Sorting and Swapping Logic:
Sorting routines (e.g., __introsort_loop, __insertion_sort, __unguarded_linear_insert) across multiple benchmarks (hermes, duckdb, opencv, open3d) eliminate temporary alloca-based swap buffers in favor of direct vector loads/stores into loop-carried values, reducing stack usage and improving locality.

Cleanup of Redundant Struct Definitions and Phi Node Updates:
Minor but consistent cleanups include removing unused struct type declarations (e.g., PyCompilerFlags in cpython), replacing i64-based loads/stores of struct fields with vector equivalents, and reordering phi node operands to match the updated basic block predecessors after control-flow restructuring.

Taken together, these changes reflect more aggressive SROA, better vector type inference, and more precise memory access modeling, which lead to fewer stack allocations, fewer redundant copies, and improved code generation for small aggregate data.

model: qwen-plus-latest
CompletionUsage(completion_tokens=532, prompt_tokens=109022, total_tokens=109554, completion_tokens_details=None, prompt_tokens_details=None)

pre-commit: PR165159

d984228

github-actions bot mentioned this pull request Feb 27, 2026

Task submission #1312

Open

github-actions bot added 2 commits February 27, 2026 19:47

pre-commit: Update

7e4dd7e

pre-commit: Remap

6087fae

zyw-bot wants to merge 3 commits into main from test-run22500182292
