pre-commit: PRParkHanbum/llvm-project/commit/b37f35d5fb9716bb690410cde4d829cb6917fece #3518

Open
zyw-bot wants to merge 3 commits into main from test-run22495749330

@zyw-bot
Collaborator

@zyw-bot zyw-bot commented Feb 27, 2026

@github-actions github-actions bot mentioned this pull request Feb 27, 2026
@zyw-bot
Collaborator Author

zyw-bot commented Feb 27, 2026

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@4eab75e
patch: ParkHanbum/llvm-project@b37f35d
sha256: e7100534a3a6268505d714c909541a1e29b145d3a586901c58777c21b2fee53a
commit: d2f4ee8

3325 files changed, 2091137 insertions(+), 2063681 deletions(-)

Improvements:
  instsimplify.NumExpand 291246 -> 298875 +2.62%
  correlated-value-propagation.NumNNeg 95915 -> 97504 +1.66%
  correlated-value-propagation.NumShlNSW 116435 -> 118128 +1.45%
  bdce.NumRemoved 378277 -> 382479 +1.11%
  correlated-value-propagation.NumShlNW 268748 -> 270441 +0.63%
  gvn.NumGVNLoad 1317637 -> 1325147 +0.57%
  instsimplify.NumReassoc 813651 -> 817847 +0.52%
  instcount.NumOrInst 1142721 -> 1148431 +0.50%
  instcount.NumTruncInst 2936213 -> 2949937 +0.47%
  instcombine.NumExpand 2699 -> 2708 +0.33%
Regressions:
  aggressive-instcombine.NumAnyOrAllBitsSet 70 -> 20 -71.43%
  aggressive-instcombine.NumInstrsReduced 45386 -> 43406 -4.36%
  tailcallelim.NumAccumAdded 151 -> 149 -1.32%
  aggressive-instcombine.NumExprsReduced 15622 -> 15523 -0.63%
  correlated-value-propagation.NumSaturating 2577 -> 2562 -0.58%
  correlated-value-propagation.NumAnd 44240 -> 43988 -0.57%
  indvars.NumElimIdentity 1833 -> 1824 -0.49%
  correlated-value-propagation.NumOverflows 4349 -> 4330 -0.44%
  correlated-value-propagation.NumCmps 270335 -> 269714 -0.23%
  correlated-value-propagation.NumNUW 569422 -> 568612 -0.14%
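The percentage column above appears to be the relative change of each counter against the baseline. A minimal sketch of that arithmetic, assuming the usual (after - before) / before definition; `pct_change` is a hypothetical helper name, not part of the benchmark tooling:

```shell
# Hypothetical helper: relative change between two counter values.
# Assumes the baseline value is non-zero (awk aborts on division by zero).
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%+.2f%%\n", (after - before) / before * 100 }'
}

pct_change 70 20            # prints -71.43%
pct_change 291246 298875    # prints +2.62%
```

The two calls reproduce the NumAnyOrAllBitsSet and instsimplify.NumExpand rows. The first is a large relative drop computed from a small absolute count (70 to 20).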

+26 grpc/ssl_transport_security.ll
+24 ocio/MatrixOp.ll
+16 rocksdb/clock_cache.ll
+12 sdl/SDL_gamepad.ll
+8 nori/widget.ll
+8 rocksdb/lru_cache.ll
+6 z3/dl_mk_interp_tail_simplifier.ll
+5 ffmpeg/vc1_block.ll
+5 grpc/tls_security_connector.ll
+5 hermes/DCE.ll
+5 hyperscan/ng_literal_component.ll
+5 nuttx/sched_removereadytorun.ll
+5 proxygen/HTTP1xCodec.ll
+5 wireshark/qcustomplot.ll
+4 hermes/BigIntSupport.ll
+4 openssl/cmp_genm.ll
+4 rust-analyzer-rs/2334ao9w0k9d7973.ll
+4 rust-analyzer-rs/rilullg9p294yp1.ll
+4 wireshark/packet-wtp.ll
+3 openjdk/cfgnode.ll
+3 postgres/ruleutils.ll
+3 rustfmt-rs/llbxf4pclolbp5s.ll
+2 cmake/zstd_compress_superblock.ll
+2 duckdb/serialize_parquet.ll
+2 duckdb/ub_duckdb_common.ll
+2 libquic/t1_lib.ll
+2 llvm/Scalarizer.ll
+2 lvgl/lv_demo_render.ll
+2 pbrt-v4/integrator.ll
+2 slurm/hostlist.ll
+2 uv-rs/13uh81w4oy46nmpmfbcefagqn.ll
+2 verilator/V3Expand.ll
+2 wireshark/disabled_protos.ll
+1 coreutils-rs/2i3dvgzkmy2gn6v1.ll
+1 cpython/complexobject.ll
+1 cpython/pylifecycle.ll
+1 cpython/textio.ll
+1 delta-rs/43y2svfstmvqcl15.ll
+1 gromacs/grid.ll
+1 jemalloc/extent_dss.ll
+1 llvm/CodeMetrics.ll
+1 luau/Quantify.ll
+1 meshlab/layerDialog.ll
+1 mitsuba3/path.ll
+1 php/session.ll
+1 pola-rs/akny94jrhz4eylr1elklgkf62.ll
+1 portaudio/pa_linux_pulseaudio_cb.ll
+1 protobuf/tokenizer.ll
+1 qemu/job.ll
+1 redis/extent_dss.ll
+1 tokenizers-rs/4hn9gefsll13qr1r.ll
+1 tree-sitter-rs/3akexam875pc2p1h.ll
+1 wasmtime-rs/35xpok2vrm65hidj.ll
+0 abseil-cpp/gtest-all.ll
+0 brotli/decode.ll
+0 bullet3/btConvexHull.ll
-1 abc/bmcCexTools.ll
-1 cmake/archive_read_support_format_mtree.ll
-1 coreutils-rs/2pqvixtdp9wizsl2.ll
-1 cpython/preconfig.ll
-1 cvc5/SimpSolver.ll
-1 darktable/thumbtable.ll
-1 delta-rs/11w0at10aiwuq3yr.ll
-1 delta-rs/47qjbhol909h8zu7.ll
-1 ffmpeg/mss12.ll
-1 ffmpeg/remove_extradata.ll
-1 folly/AsyncSocket.ll
-1 git/config.ll
-1 git/remote-curl.ll
-1 grpc/alts_handshaker_client.ll
-1 just-rs/2sblcsgax6v4zfcc.ll
-1 linux/buffer.ll
-1 linux/hda_auto_parser.ll
-1 luajit/lj_carith.ll
-1 minetest/CSkinnedMesh.ll
-1 minetest/nodedef.ll
-1 mitsuba3/scene.ll
-1 openspiel/go_board.ll
-1 openusd/changes.ll
-1 postgres/brin_minmax_multi.ll
-1 quiche-rs/5lgkl05cn1nxa92c04lryr5y4.ll
-1 recastnavigation/catch_amalgamated.ll
-1 ruby/re.ll
-1 rust-analyzer-rs/leba1wmgxgrzxkl.ll
-1 rustfmt-rs/2vbyym84o66crvo9.ll
-1 stb/stb_vorbis.ll
-1 wireshark/packet-bacapp.ll
-2 abc/abcBalance.ll
-2 boost/algorithm.ll
-2 coreutils-rs/4gs2z359bfnc1tys.ll
-2 freetype/ftbase.ll
-2 libwebp/vp8l_dec.ll
-2 lief/ASN1Reader.ll
-2 oiio/argparse.ll
-2 openssl/threadstest.ll
-2 quest/QuEST_validation.ll
-2 sdl/SDL_hidapi.ll
-2 verilator/V3Split.ll
-2 wasmtime-rs/4bsmuvpz9r22ks1w.ll
-3 abc/sbdCut2.ll
-3 hermes/RegAlloc.ll
-3 icu/xmlparser.ll
-3 openjdk/modRefBarrierSetC1.ll
-3 proj/geodsigntest.ll
-3 regex-rs/32jw1oy2yofrhudk.ll
-3 vcpkg/commands.set-installed.ll
-3 yosys/SimpSolver.ll
-4 abc/giaSupp.ll
-4 arrow/api_vector.ll
-4 cvc5/theory_arith_private.ll
-4 ffmpeg/format.ll
-4 git/notes-utils.ll
-4 image-rs/1clnprdgqfw2q9lq.ll
-4 linux/main.ll
-4 linux/trace_dynevent.ll
-4 linux/waitid.ll
-4 llvm/EvaluationResult.ll
-4 node/pipe.ll
-4 openusd/loopPatchBuilder.ll
-4 php/array.ll
-4 sundials/sunmatrix_sparse.ll
-5 darktable/FujiDecompressor.ll
-5 graphviz/strmatch.ll
-5 libwebp/webpdec.ll
-5 llvm/DWARFDie.ll
-6 g2o/hyper_graph.ll
-8 ockam-rs/on09w5afel7x9qz.ll
-8 rust-analyzer-rs/2mbx5ptcpq6fo7sc.ll
-9 icu/udata.ll
-10 opencv/t1.ll
-12 luau/Unifier2.ll
-20 ruby/range.ll
-29 openssl/x_pubkey.ll

@github-actions
Contributor

Here's a concise summary of the major changes in this LLVM IR diff:

  1. Boolean Optimization via Truncation: Replaced many icmp ne X, 0 comparisons with trunc nuw X to i1. This is a common optimization that converts integer zero-checks into direct boolean values, improving efficiency and enabling further boolean logic optimizations.

  2. Simplification of Boolean Logic: Converted chains of icmp, and, or, and select instructions (e.g., icmp ne A,0; icmp ne B,0; select ...) into more efficient patterns like trunc A to i1; select i1 ..., i32 B, i32 0. This reduces instruction count and improves clarity of control flow.

  3. Phi Node Type Consistency: Updated several phi instructions to use i1 instead of i32 or i8 for boolean values (e.g., phi i1 [...]), aligning types with the actual logical intent and enabling better downstream optimization.

  4. Control Flow Edge Updates: Adjusted branch targets and phi node predecessor lists to reflect new basic block names and control flow restructuring (e.g., changing br label %93 to br label %91, updating phi preds from %95 to %97). This maintains correctness after reordering or renaming.

  5. Cleanup of Redundant Instructions: Removed unnecessary zext/sext pairs, redundant icmp/or sequences, and dead code (e.g., unreachable blocks, unused phi operands), streamlining the IR and reducing register pressure.

These changes collectively improve code generation quality by promoting type-safe boolean operations, simplifying logic, and removing redundancy — all hallmarks of mature IR-level optimizations.

model: qwen-plus-latest
CompletionUsage(completion_tokens=381, prompt_tokens=108437, total_tokens=108818, completion_tokens_details=None, prompt_tokens_details=None)

@ParkHanbum

@dtcxzyw is this BAD??

@dtcxzyw
Owner

dtcxzyw commented Feb 27, 2026

I think we are heading in the right direction. Look at the first few items of the list to find missing optimizations exposed by this patch (e.g., bench/grpc/optimized/tls_security_connector.ll).

@ParkHanbum
Copy link

ParkHanbum commented Feb 27, 2026

@dtcxzyw I downloaded bench/grpc/optimized/tls_security_connector.ll, counted the trunc instructions, searched for icmp ne, and looked for IR that might have been affected.

I presume the following IR was changed from

  %9 = icmp ne ptr %8, null
  %10 = zext i1 %9 to i64
  %11 = call i64 @llvm.expect.i64(i64 %10, i64 1)
  %12 = icmp ne i64 %11, 0

to

  %234 = load i8, ptr %233, align 8, !tbaa !101, !range !113, !noundef !114
  %235 = trunc nuw i8 %234 to i1

Is this the correct way to find changes?

@dtcxzyw
Copy link
Owner

dtcxzyw commented Mar 1, 2026

> Is this the correct way to find changes?

No. I mean you need to find a minimized pattern that demonstrates the regression. Download bench/grpc/original/tls_security_connector.ll, then run opt -O3 with and without your patch (you can also use a flag to control whether your patch is active). Once you have the two versions of the optimized IR, count the number of instructions (or of a specific pattern), construct an interestingness test, and use llvm-reduce to get a reduced reproducer.

Here is the script I use:

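#!/usr/bin/env bash
# Interestingness test for llvm-reduce: exit 0 means the candidate is interesting.
# <flag-to-enable-your-patch> is a placeholder; substitute the real option before running.
# a: IR optimized with the patch enabled; b: IR optimized without it.
a=$(bin/opt -O3 -disable-loop-unrolling -vectorize-loops=false -vectorize-slp=false <flag-to-enable-your-patch> "$1" -S 2>/dev/null)
b=$(bin/opt -O3 -disable-loop-unrolling -vectorize-loops=false -vectorize-slp=false "$1" -S 2>/dev/null)
# Not interesting if the patch makes no difference to the output.
if [[ "$a" == "$b" ]]; then
    exit 1
fi
# Count the pattern of interest in each version.
la=$(echo "$a" | grep -c "getelementptr inbounds")
lb=$(echo "$b" | grep -c "getelementptr inbounds")
echo "$la $lb"
# Interesting only when the patched output has fewer matches than the baseline.
if [[ $la -lt $lb ]]; then
    exit 0
fi
exit 1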
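The counting and comparison steps of the script can be tried without an LLVM build. The sketch below is only an illustration: `count_matches` and `fewer_in_patched` are hypothetical helper names, and the two IR strings are made-up samples rather than output from this benchmark.

```shell
# Hypothetical helpers; the names are illustrative and not part of LLVM.

# Count the lines of IR text on stdin that contain a fixed string.
# grep exits with status 1 when nothing matches, but still prints "0".
count_matches() {
    grep -c -F -- "$1" || true
}

# Succeed when the patched count is strictly below the baseline count,
# mirroring the final comparison in the script above.
fewer_in_patched() {
    [ "$1" -lt "$2" ]
}

# Made-up sample IR standing in for the two opt outputs.
patched='  %1 = trunc nuw i8 %0 to i1'
baseline='  %1 = icmp ne i8 %0, 0
  %2 = icmp ne i8 %3, 0'

la=$(printf '%s\n' "$patched" | count_matches 'icmp ne')
lb=$(printf '%s\n' "$baseline" | count_matches 'icmp ne')
echo "$la $lb"    # prints "0 2"
if fewer_in_patched "$la" "$lb"; then
    echo "interesting"
fi
```

Swapping the string passed to `count_matches` adapts the test to whichever pattern shows the regression. llvm-reduce takes the finished script through its `--test` option, followed by the input .ll file.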
