Commit 9bbda7c

rsc authored and gopherbot committed
cmd/compile: make prove understand div, mod better
This CL introduces new divisible and divmod passes that rewrite divisibility checks and div, mod, and mul. These happen after prove, so that prove can make better sense of the code for deriving bounds, and they must run before decompose, so that 64-bit ops can be lowered to 32-bit ops on 32-bit systems. And then they need another generic pass as well, to optimize the generated code before decomposing.

The three opt passes are "opt", "middle opt", and "late opt". (Perhaps instead they should be "generic", "opt", and "late opt"?)

The "late opt" pass repeats the "middle opt" work on any new code that has been generated in the interim. There will not be new divs or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can match before divs have been rewritten. This has the effect of applying them more consistently and making the rewrite rules independent of the exact div rewrites.

Prove is also now charged with marking signed div/mod as unsigned when the arguments call for it, allowing simpler code to be emitted in various cases. For example, t.Seconds()/2 and len(x)/2 are now recognized as unsigned, meaning they compile to a simple shift (unsigned division), avoiding the more complex fixup we need for signed values.

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32 shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std' output before and after. "Proved Rsh64x64 shifts to zero" is replaced by the higher-level "Proved Div64 is unsigned" (the shift was in the signed expansion of div by constant), but otherwise prove is only finding more things to prove.

One short example, in code that does x[i%len(x)]:

    < runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
    ---
    > runtime/mfinal.go:131:34: Proved Div64 is unsigned
    > runtime/mfinal.go:131:38: Proved IsInBounds

A longer example:

    < crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
    ---
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better and taking work away from prove:

    < image/jpeg/dct.go:307:5: Proved IsInBounds
    < image/jpeg/dct.go:308:5: Proved IsInBounds
    ...
    < image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early. This CL delays that rule to help prove recognize mods, but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, computing the length, opt can now do:

    (Sub64 (Add (Mul 8 i) 8) (Mul 8 i)) ->
    (Add 8 (Sub (Mul 8 i) (Mul 8 i))) ->
    (Add 8 (Mul 8 (Sub i i))) ->
    (Add 8 (Mul 8 0)) ->
    (Add 8 0) ->
    8

The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)). Leaving the multiply as Mul enables using that step; the old rewrite to Lsh blocked it, leaving prove to figure out the length and then remove the bounds checks. But now opt can evaluate the length down to a constant 8 and then constant-fold away the bounds checks 0 < 8, 1 < 8, and so on. After that, the compiler has nothing left to prove.

Benchmarks are noisy in general; I checked the assembly for the many large increases below, and the vast majority are unchanged and presumably hitting the caches differently in some way.

The divisibility optimizations were not reliably triggering before. This leads to a very large improvement in some cases, like DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems and DivisibleconstU64 on 32-bit systems.

Another way the divisibility optimizations were unreliable before was incorrectly triggering for x/3, x%3 even though they are written not to do that. There is a real but small slowdown in the DivisibleWDivconst benchmarks on Mac because in the cases used in the benchmark, it is still faster (on Mac) to do the divisibility check than to remultiply. This may be worth further study. Perhaps when there is no rotate (meaning the divisor is odd), the divisibility optimization should be enabled always. In any event, this CL makes it possible to study that.

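As a rough illustration of the slice-length folding described above, the following Go sketch evaluates the same expression at run time. It is not compiler code: `block` and `foldedLen` are hypothetical names introduced here, and the sketch only shows that (8*i+8) - 8*i equals 8 for every i, including values where 8*i wraps around, which is why folding the length to a constant is safe.

```go
package main

import "fmt"

// block returns the 8-element window starting at index 8*i, written with
// the same three-index slice expression quoted in the commit message.
// (Hypothetical helper, for illustration only.)
func block(x []int32, i int) []int32 {
	return x[8*i : 8*i+8 : 8*i+8]
}

// foldedLen computes high - low for that slice expression using Go's
// wrapping integer arithmetic. The Mul terms cancel, leaving 8.
func foldedLen(i int64) int64 {
	return (8*i + 8) - 8*i
}

func main() {
	buf := make([]int32, 64)
	b := block(buf, 3)
	fmt.Println(len(b), cap(b)) // 8 8

	// Even when 8*i overflows int64, the difference is still 8.
	fmt.Println(foldedLen(1 << 62)) // 8
}
```

Because the length is the constant 8, index expressions such as b[0] through b[7] need no run-time bounds check once the constant is known.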
benchmark \ host                  s7             linux-amd64    mac            linux-arm64    linux-ppc64le  linux-386      s7:GOARCH=386  linux-arm
                                  vs base        vs base        vs base        vs base        vs base        vs base        vs base        vs base
LoadAdd                           ~              ~              ~              ~              ~              -1.59%         ~              ~
ExtShift                          ~              ~              -42.14%        +0.10%         ~              +1.44%         +5.66%         +8.50%
Modify                            ~              ~              ~              ~              ~              ~              ~              -1.53%
MullImm                           ~              ~              ~              ~              ~              +37.90%        -21.87%        +3.05%
ConstModify                       ~              ~              ~              ~              -49.14%        ~              ~              ~
BitSet                            ~              ~              ~              ~              -15.86%        -14.57%        +6.44%         +0.06%
BitClear                          ~              ~              ~              ~              ~              +1.78%         +3.50%         +0.06%
BitToggle                         ~              ~              ~              ~              ~              -16.09%        +2.91%         ~
BitSetConst                       ~              ~              ~              ~              ~              ~              ~              -0.49%
BitClearConst                     ~              ~              ~              ~              -28.29%        ~              ~              -0.40%
BitToggleConst                    ~              ~              ~              +8.89%         -31.19%        ~              ~              -0.77%
MulNeg                            ~              ~              ~              ~              ~              ~              ~              ~
Mul2Neg                           ~              ~              -4.83%         ~              ~              -13.75%        -5.92%         ~
DivconstI64                       ~              ~              ~              ~              ~              -30.12%        ~              +0.50%
ModconstI64                       ~              ~              -9.94%         -4.63%         ~              +3.15%         ~              +5.32%
DivisiblePow2constI64             -34.49%        -12.58%        ~              ~              -12.25%        ~              ~              ~
DivisibleconstI64                 -24.69%        -25.06%        -0.40%         -2.27%         -42.61%        -3.31%         ~              +1.63%
DivisibleWDivconstI64             ~              ~              ~              ~              ~              -17.55%        ~              -0.60%
DivconstU64/3                     ~              ~              ~              ~              ~              +1.51%         ~              ~
DivconstU64/5                     ~              ~              ~              ~              ~              ~              ~              ~
DivconstU64/37                    ~              ~              -0.18%         ~              ~              +2.70%         ~              ~
DivconstU64/1234567               ~              ~              ~              ~              ~              ~              ~              +0.12%
ModconstU64                       ~              ~              ~              -0.24%         ~              -5.10%         -1.07%         -1.56%
DivisibleconstU64                 ~              ~              ~              ~              ~              -29.01%        -59.13%        -50.72%
DivisibleWDivconstU64             ~              ~              -12.18%        -18.88%        ~              -5.50%         -3.91%         +5.17%
DivconstI32                       ~              ~              -0.48%         ~              -34.69%        +89.01%        -6.01%         -16.67%
ModconstI32                       ~              +2.95%         -0.33%         ~              ~              -2.98%         -5.40%         -8.30%
DivisiblePow2constI32             ~              ~              ~              ~              ~              ~              ~              -16.22%
DivisibleconstI32                 ~              ~              ~              ~              ~              -37.27%        -47.75%        -25.03%
DivisibleWDivconstI32             -11.59%        +5.22%         -12.99%        -23.83%        ~              +45.95%        -7.03%         -10.01%
DivconstU32                       ~              ~              ~              ~              ~              +74.71%        +4.81%         ~
ModconstU32                       ~              ~              +0.53%         +0.18%         ~              +51.16%        ~              ~
DivisibleconstU32                 ~              ~              ~              -0.62%         ~              -4.25%         ~              ~
DivisibleWDivconstU32             -2.77%         +5.56%         +11.12%        -5.15%         ~              +48.70%        +25.11%        -4.07%
DivconstI16                       -6.06%         ~              -0.33%         +0.22%         ~              ~              -9.68%         +5.47%
ModconstI16                       ~              ~              +4.44%         +2.82%         ~              ~              ~              +5.06%
DivisiblePow2constI16             ~              ~              ~              ~              ~              ~              ~              -0.17%
DivisibleconstI16                 ~              ~              -0.23%         ~              ~              ~              +4.60%         +6.64%
DivisibleWDivconstI16             -1.44%         -0.43%         +13.48%        -5.76%         ~              +1.62%         -23.15%        -9.06%
DivconstU16                       +1.61%         ~              -0.35%         -0.47%         ~              ~              +15.59%        ~
ModconstU16                       ~              ~              ~              ~              ~              -0.72%         ~              +14.23%
DivisibleconstU16                 ~              ~              -0.05%         +3.00%         ~              ~              ~              +5.06%
DivisibleWDivconstU16             +52.10%        +0.75%         +17.28%        +4.79%         ~              -37.39%        +5.28%         -9.06%
DivconstI8                        ~              ~              -0.34%         -0.96%         ~              ~              -9.20%         ~
ModconstI8                        +2.29%         ~              +4.38%         +2.96%         ~              ~              ~              ~
DivisiblePow2constI8              ~              ~              ~              ~              ~              ~              ~              ~
DivisibleconstI8                  ~              ~              ~              ~              ~              ~              +6.04%         ~
DivisibleWDivconstI8              -26.44%        +1.69%         +17.03%        +4.05%         ~              +32.48%        -24.90%        ~
DivconstU8                        -4.50%         +14.06%        -0.28%         ~              ~              ~              +4.16%         +0.88%
ModconstU8                        ~              ~              +25.84%        -0.64%         ~              ~              ~              ~
DivisibleconstU8                  ~              ~              -5.70%         ~              ~              ~              ~              ~
DivisibleWDivconstU8              +49.55%        +9.07%         ~              +4.03%         +53.87%        -40.03%        +39.72%        -3.01%
Mul2                              ~              ~              ~              ~              ~              ~              ~              ~
MulNeg2                           ~              ~              ~              ~              -11.73%        ~              ~              -0.02%
EfaceInteger                      ~              ~              ~              ~              ~              +18.11%        ~              +2.53%
TypeAssert                        +33.90%        +2.86%         ~              ~              ~              -1.07%         -5.29%         -1.04%
Div64UnsignedSmall                ~              ~              ~              ~              ~              ~              ~              ~
Div64Small                        ~              ~              ~              ~              ~              -0.88%         ~              +2.39%
Div64SmallNegDivisor              ~              ~              ~              ~              ~              ~              ~              +0.35%
Div64SmallNegDividend             ~              ~              ~              ~              ~              -0.84%         ~              +3.57%
Div64SmallNegBoth                 ~              ~              ~              ~              ~              -0.86%         ~              +3.55%
Div64Unsigned                     ~              ~              ~              ~              ~              ~              ~              -0.11%
Div64                             ~              ~              ~              ~              ~              ~              ~              +0.11%
Div64NegDivisor                   ~              ~              ~              ~              ~              -1.29%         ~              ~
Div64NegDividend                  ~              ~              ~              ~              ~              -1.44%         ~              ~
Div64NegBoth                      ~              ~              ~              ~              ~              ~              ~              +0.28%
Mod64UnsignedSmall                ~              ~              ~              ~              ~              +0.48%         ~              +0.93%
Mod64Small                        ~              ~              ~              ~              ~              ~              ~              ~
Mod64SmallNegDivisor              ~              ~              ~              ~              ~              ~              ~              +1.44%
Mod64SmallNegDividend             ~              ~              ~              ~              ~              +0.22%         ~              +1.37%
Mod64SmallNegBoth                 ~              ~              ~              ~              ~              ~              ~              -2.22%
Mod64Unsigned                     ~              ~              ~              ~              ~              -0.95%         ~              +0.11%
Mod64                             ~              ~              ~              ~              ~              ~              ~              ~
Mod64NegDivisor                   ~              ~              ~              ~              ~              ~              ~              -0.02%
Mod64NegDividend                  ~              ~              ~              ~              ~              ~              ~              ~
Mod64NegBoth                      ~              ~              ~              ~              ~              ~              ~              -0.02%
MulconstI32/3                     ~              ~              ~              -25.00%        ~              ~              ~              +47.37%
MulconstI32/5                     ~              ~              ~              +33.28%        ~              ~              ~              +32.21%
MulconstI32/12                    ~              ~              ~              -2.13%         ~              ~              ~              -0.02%
MulconstI32/120                   ~              ~              ~              +2.93%         ~              ~              ~              -0.03%
MulconstI32/-120                  ~              ~              ~              -2.17%         ~              ~              ~              -0.03%
MulconstI32/65537                 ~              ~              ~              ~              ~              ~              ~              +0.03%
MulconstI32/65538                 ~              ~              ~              ~              ~              -33.38%        ~              +0.04%
MulconstI64/3                     ~              ~              ~              +33.35%        ~              -0.37%         ~              -0.13%
MulconstI64/5                     ~              ~              ~              -25.00%        ~              -0.34%         ~              ~
MulconstI64/12                    ~              ~              ~              +2.13%         ~              +11.62%        ~              +2.30%
MulconstI64/120                   ~              ~              ~              -1.98%         ~              ~              ~              ~
MulconstI64/-120                  ~              ~              ~              +0.75%         ~              ~              ~              ~
MulconstI64/65537                 ~              ~              ~              ~              ~              +5.61%         ~              ~
MulconstI64/65538                 ~              ~              ~              ~              ~              +5.25%         ~              ~
MulconstU32/3                     ~              +0.81%         ~              +33.39%        ~              +77.92%        ~              -32.31%
MulconstU32/5                     ~              ~              ~              -24.97%        ~              +77.92%        ~              -24.47%
MulconstU32/12                    ~              ~              ~              +2.06%         ~              ~              ~              +0.03%
MulconstU32/120                   ~              ~              ~              -2.74%         ~              ~              ~              +0.03%
MulconstU32/65537                 ~              ~              ~              ~              ~              ~              ~              +0.03%
MulconstU32/65538                 ~              ~              ~              ~              ~              -33.42%        ~              -0.03%
MulconstU64/3                     ~              ~              ~              +33.33%        ~              -0.28%         ~              +1.22%
MulconstU64/5                     ~              ~              ~              -25.00%        ~              ~              ~              -0.64%
MulconstU64/12                    ~              ~              ~              +2.30%         ~              +11.59%        ~              +0.14%
MulconstU64/120                   ~              ~              ~              -2.82%         ~              ~              ~              +0.04%
MulconstU64/65537                 ~              +0.37%         ~              ~              ~              +5.58%         ~              ~
MulconstU64/65538                 ~              ~              ~              ~              ~              +5.16%         ~              ~
ShiftArithmeticRight              ~              ~              ~              ~              ~              -10.81%        ~              +0.31%
Switch8Predictable                +14.69%        ~              ~              ~              ~              -24.85%        ~              ~
Switch8Unpredictable              ~              -0.58%         -3.80%         ~              ~              -11.78%        ~              -0.79%
Switch32Predictable               -10.33%        +17.89%        ~              ~              ~              +5.76%         ~              ~
Switch32Unpredictable             -3.15%         +1.19%         +9.42%         ~              ~              -10.30%        -5.09%         +0.44%
SwitchStringPredictable           +70.88%        +20.48%        ~              ~              ~              +2.39%         ~              +0.31%
SwitchStringUnpredictable         ~              +3.91%         -5.06%         -0.98%         ~              +0.61%         +2.03%         ~
SwitchTypePredictable             +146.58%       -1.10%         ~              -12.45%        ~              -0.46%         -3.81%         ~
SwitchTypeUnpredictable           +0.46%         -0.83%         ~              +4.18%         ~              +0.43%         ~              +0.62%
SwitchInterfaceTypePredictable    -13.41%        -10.13%        +11.03%        ~              ~              -4.38%         ~              +0.75%
SwitchInterfaceTypeUnpredictable  -6.37%         -2.14%         ~              -3.21%         ~              -4.20%         ~              +1.08%

Fixes #63110.
Fixes #75954.

Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
Auto-Submit: Russ Cox <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>

1 parent 915c183 commit 9bbda7c

25 files changed: +6350 -4365 lines changed

src/cmd/compile/internal/ssa/_gen/dec.rules

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 
 // This file contains rules to decompose builtin compound types
 // (complex,string,slice,interface) into their constituent
-// types. These rules work together with the decomposeBuiltIn
+// types. These rules work together with the decomposeBuiltin
 // pass which handles phis of these types.
 
 (Store {t} _ _ mem) && t.Size() == 0 => mem

src/cmd/compile/internal/ssa/_gen/dec64.rules

Lines changed: 22 additions & 1 deletion
@@ -3,7 +3,7 @@
 // license that can be found in the LICENSE file.
 
 // This file contains rules to decompose [u]int64 types on 32-bit
-// architectures. These rules work together with the decomposeBuiltIn
+// architectures. These rules work together with the decomposeBuiltin
 // pass which handles phis of these typ.
 
 (Int64Hi (Int64Make hi _)) => hi
@@ -217,11 +217,32 @@
 (Rsh8x64 x y) => (Rsh8x32 x (Or32 <typ.UInt32> (Zeromask (Int64Hi y)) (Int64Lo y)))
 (Rsh8Ux64 x y) => (Rsh8Ux32 x (Or32 <typ.UInt32> (Zeromask (Int64Hi y)) (Int64Lo y)))
 
+
 (RotateLeft64 x (Int64Make hi lo)) => (RotateLeft64 x lo)
 (RotateLeft32 x (Int64Make hi lo)) => (RotateLeft32 x lo)
 (RotateLeft16 x (Int64Make hi lo)) => (RotateLeft16 x lo)
 (RotateLeft8 x (Int64Make hi lo)) => (RotateLeft8 x lo)
 
+// RotateLeft64 by constant, for use in divmod.
+(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && c&63 == 0 => x
+(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && c&63 == 32 => (Int64Make <t> (Int64Lo x) (Int64Hi x))
+(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && 0 < c&63 && c&63 < 32 =>
+  (Int64Make <t>
+    (Or32 <typ.UInt32>
+      (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)]))
+      (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)])))
+    (Or32 <typ.UInt32>
+      (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)]))
+      (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
+(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && 32 < c&63 && c&63 < 64 =>
+  (Int64Make <t>
+    (Or32 <typ.UInt32>
+      (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)]))
+      (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)])))
+    (Or32 <typ.UInt32>
+      (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)]))
+      (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
+
 // Clean up constants a little
 (Or32 <typ.UInt32> (Zeromask (Const32 [c])) y) && c == 0 => y
 (Or32 <typ.UInt32> (Zeromask (Const32 [c])) y) && c != 0 => (Const32 <typ.UInt32> [-1])
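The four RotateLeft64-by-constant cases added to dec64.rules can be mirrored in ordinary Go to see why each one is correct. The sketch below is an illustration under stated assumptions, not the compiler's implementation: `rotl64From32` and `check` are hypothetical names, and the result is compared against `bits.RotateLeft64` from the standard library.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotl64From32 rotates the 64-bit value hi:lo left by c&63 using only
// 32-bit operations, following the four cases in the rules above.
// It returns the new high and low halves. (Hypothetical helper.)
func rotl64From32(hi, lo uint32, c int) (uint32, uint32) {
	s := uint(c & 63)
	switch {
	case s == 0: // rotation by a multiple of 64 is the identity
		return hi, lo
	case s == 32: // the halves simply swap
		return lo, hi
	case s < 32: // each half takes its top bits from the other half
		return hi<<s | lo>>(32-s), lo<<s | hi>>(32-s)
	default: // 32 < s < 64: swap the halves, then rotate by s&31
		r := s & 31
		return lo<<r | hi>>(32-r), hi<<r | lo>>(32-r)
	}
}

// check reports whether the 32-bit decomposition agrees with the
// standard library's 64-bit rotate for one input.
func check(x uint64, c int) bool {
	hi, lo := rotl64From32(uint32(x>>32), uint32(x), c)
	return uint64(hi)<<32|uint64(lo) == bits.RotateLeft64(x, c)
}

func main() {
	ok := true
	for c := -64; c <= 128; c++ {
		ok = ok && check(0x0123456789abcdef, c)
	}
	fmt.Println(ok) // true
}
```

Note that `32-c&31` in the rules parses as `32-(c&31)`, since `&` binds tighter than `-` in Go; the sketch computes the same shift amounts.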
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
+// Copyright 2025 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// Divisibility checks (x%c == 0 or x%c != 0) convert to multiply, rotate, compare.
+// The opt pass rewrote x%c to x-(x/c)*c
+// and then also rewrote x-(x/c)*c == 0 to x == (x/c)*c.
+// If x/c is being used for a division already (div.Uses != 1)
+// then we leave the expression alone.
+//
+// See ../magic.go for a detailed description of these algorithms.
+// See test/codegen/divmod.go for tests.
+// See divmod.rules for other division rules that run after these.
+
+// Divisiblity by unsigned or signed power of two.
+(Eq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
+  && x.Op != OpConst64 && isPowerOfTwo(c) =>
+  (Eq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
+(Eq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
+  && x.Op != OpConst64 && isPowerOfTwo(c) =>
+  (Eq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
+(Neq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
+  && x.Op != OpConst64 && isPowerOfTwo(c) =>
+  (Neq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
+(Neq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
+  && x.Op != OpConst64 && isPowerOfTwo(c) =>
+  (Neq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
+
+// Divisiblity by unsigned.
+(Eq8 x (Mul8 <t> div:(Div8u x (Const8 [c])) (Const8 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst8 && udivisibleOK8(c) =>
+  (Leq8U
+    (RotateLeft8 <t>
+      (Mul8 <t> x (Const8 <t> [int8(udivisible8(c).m)]))
+      (Const8 <t> [int8(8 - udivisible8(c).k)]))
+    (Const8 <t> [int8(udivisible8(c).max)]))
+(Neq8 x (Mul8 <t> div:(Div8u x (Const8 [c])) (Const8 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst8 && udivisibleOK8(c) =>
+  (Less8U
+    (Const8 <t> [int8(udivisible8(c).max)])
+    (RotateLeft8 <t>
+      (Mul8 <t> x (Const8 <t> [int8(udivisible8(c).m)]))
+      (Const8 <t> [int8(8 - udivisible8(c).k)])))
+(Eq16 x (Mul16 <t> div:(Div16u x (Const16 [c])) (Const16 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst16 && udivisibleOK16(c) =>
+  (Leq16U
+    (RotateLeft16 <t>
+      (Mul16 <t> x (Const16 <t> [int16(udivisible16(c).m)]))
+      (Const16 <t> [int16(16 - udivisible16(c).k)]))
+    (Const16 <t> [int16(udivisible16(c).max)]))
+(Neq16 x (Mul16 <t> div:(Div16u x (Const16 [c])) (Const16 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst16 && udivisibleOK16(c) =>
+  (Less16U
+    (Const16 <t> [int16(udivisible16(c).max)])
+    (RotateLeft16 <t>
+      (Mul16 <t> x (Const16 <t> [int16(udivisible16(c).m)]))
+      (Const16 <t> [int16(16 - udivisible16(c).k)])))
+(Eq32 x (Mul32 <t> div:(Div32u x (Const32 [c])) (Const32 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst32 && udivisibleOK32(c) =>
+  (Leq32U
+    (RotateLeft32 <t>
+      (Mul32 <t> x (Const32 <t> [int32(udivisible32(c).m)]))
+      (Const32 <t> [int32(32 - udivisible32(c).k)]))
+    (Const32 <t> [int32(udivisible32(c).max)]))
+(Neq32 x (Mul32 <t> div:(Div32u x (Const32 [c])) (Const32 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst32 && udivisibleOK32(c) =>
+  (Less32U
+    (Const32 <t> [int32(udivisible32(c).max)])
+    (RotateLeft32 <t>
+      (Mul32 <t> x (Const32 <t> [int32(udivisible32(c).m)]))
+      (Const32 <t> [int32(32 - udivisible32(c).k)])))
+(Eq64 x (Mul64 <t> div:(Div64u x (Const64 [c])) (Const64 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst64 && udivisibleOK64(c) =>
+  (Leq64U
+    (RotateLeft64 <t>
+      (Mul64 <t> x (Const64 <t> [int64(udivisible64(c).m)]))
+      (Const64 <t> [int64(64 - udivisible64(c).k)]))
+    (Const64 <t> [int64(udivisible64(c).max)]))
+(Neq64 x (Mul64 <t> div:(Div64u x (Const64 [c])) (Const64 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst64 && udivisibleOK64(c) =>
+  (Less64U
+    (Const64 <t> [int64(udivisible64(c).max)])
+    (RotateLeft64 <t>
+      (Mul64 <t> x (Const64 <t> [int64(udivisible64(c).m)]))
+      (Const64 <t> [int64(64 - udivisible64(c).k)])))
+
+// Divisiblity by signed.
+(Eq8 x (Mul8 <t> div:(Div8 x (Const8 [c])) (Const8 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst8 && sdivisibleOK8(c) =>
+  (Leq8U
+    (RotateLeft8 <t>
+      (Add8 <t> (Mul8 <t> x (Const8 <t> [int8(sdivisible8(c).m)]))
+        (Const8 <t> [int8(sdivisible8(c).a)]))
+      (Const8 <t> [int8(8 - sdivisible8(c).k)]))
+    (Const8 <t> [int8(sdivisible8(c).max)]))
+(Neq8 x (Mul8 <t> div:(Div8 x (Const8 [c])) (Const8 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst8 && sdivisibleOK8(c) =>
+  (Less8U
+    (Const8 <t> [int8(sdivisible8(c).max)])
+    (RotateLeft8 <t>
+      (Add8 <t> (Mul8 <t> x (Const8 <t> [int8(sdivisible8(c).m)]))
+        (Const8 <t> [int8(sdivisible8(c).a)]))
+      (Const8 <t> [int8(8 - sdivisible8(c).k)])))
+(Eq16 x (Mul16 <t> div:(Div16 x (Const16 [c])) (Const16 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst16 && sdivisibleOK16(c) =>
+  (Leq16U
+    (RotateLeft16 <t>
+      (Add16 <t> (Mul16 <t> x (Const16 <t> [int16(sdivisible16(c).m)]))
+        (Const16 <t> [int16(sdivisible16(c).a)]))
+      (Const16 <t> [int16(16 - sdivisible16(c).k)]))
+    (Const16 <t> [int16(sdivisible16(c).max)]))
+(Neq16 x (Mul16 <t> div:(Div16 x (Const16 [c])) (Const16 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst16 && sdivisibleOK16(c) =>
+  (Less16U
+    (Const16 <t> [int16(sdivisible16(c).max)])
+    (RotateLeft16 <t>
+      (Add16 <t> (Mul16 <t> x (Const16 <t> [int16(sdivisible16(c).m)]))
+        (Const16 <t> [int16(sdivisible16(c).a)]))
+      (Const16 <t> [int16(16 - sdivisible16(c).k)])))
+(Eq32 x (Mul32 <t> div:(Div32 x (Const32 [c])) (Const32 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst32 && sdivisibleOK32(c) =>
+  (Leq32U
+    (RotateLeft32 <t>
+      (Add32 <t> (Mul32 <t> x (Const32 <t> [int32(sdivisible32(c).m)]))
+        (Const32 <t> [int32(sdivisible32(c).a)]))
+      (Const32 <t> [int32(32 - sdivisible32(c).k)]))
+    (Const32 <t> [int32(sdivisible32(c).max)]))
+(Neq32 x (Mul32 <t> div:(Div32 x (Const32 [c])) (Const32 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst32 && sdivisibleOK32(c) =>
+  (Less32U
+    (Const32 <t> [int32(sdivisible32(c).max)])
+    (RotateLeft32 <t>
+      (Add32 <t> (Mul32 <t> x (Const32 <t> [int32(sdivisible32(c).m)]))
+        (Const32 <t> [int32(sdivisible32(c).a)]))
+      (Const32 <t> [int32(32 - sdivisible32(c).k)])))
+(Eq64 x (Mul64 <t> div:(Div64 x (Const64 [c])) (Const64 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst64 && sdivisibleOK64(c) =>
+  (Leq64U
+    (RotateLeft64 <t>
+      (Add64 <t> (Mul64 <t> x (Const64 <t> [int64(sdivisible64(c).m)]))
+        (Const64 <t> [int64(sdivisible64(c).a)]))
+      (Const64 <t> [int64(64 - sdivisible64(c).k)]))
+    (Const64 <t> [int64(sdivisible64(c).max)]))
+(Neq64 x (Mul64 <t> div:(Div64 x (Const64 [c])) (Const64 [c])))
+  && div.Uses == 1
+  && x.Op != OpConst64 && sdivisibleOK64(c) =>
+  (Less64U
+    (Const64 <t> [int64(sdivisible64(c).max)])
+    (RotateLeft64 <t>
+      (Add64 <t> (Mul64 <t> x (Const64 <t> [int64(sdivisible64(c).m)]))
+        (Const64 <t> [int64(sdivisible64(c).a)]))
+      (Const64 <t> [int64(64 - sdivisible64(c).k)])))
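The unsigned rules above replace x%c == 0 with a multiply, a rotate, and an unsigned compare. The Go sketch below shows one way those three constants can be derived and checked for 16-bit values. It is an assumption-laden illustration, not the code in magic.go: the type and function names (`udivisible16`, `magic16`, `divisible16`) are hypothetical, and the Newton iteration used for the modular inverse is a choice made here rather than something the commit states. The signed rules additionally add a constant `a` before rotating, which this sketch does not cover.

```go
package main

import (
	"fmt"
	"math/bits"
)

// udivisible16 holds constants for testing x%c == 0 on uint16 values.
// (Hypothetical type; field names follow the .m, .k, .max used in the rules.)
type udivisible16 struct {
	m   uint16 // multiplicative inverse of the odd part of c, mod 2^16
	k   int    // number of trailing zero bits in c
	max uint16 // largest possible quotient, (2^16-1)/c
}

// magic16 computes the constants for a divisor c >= 1.
func magic16(c uint16) udivisible16 {
	k := bits.TrailingZeros16(c)
	c0 := c >> k // odd part of c
	// Newton iteration for the inverse of c0 mod 2^16. An odd c0 is its
	// own inverse mod 8, and each step doubles the number of correct bits.
	m := c0
	for i := 0; i < 3; i++ {
		m *= 2 - c0*m
	}
	return udivisible16{m: m, k: k, max: 0xffff / c}
}

// divisible16 reports whether x is a multiple of the divisor described by d,
// using multiply, rotate, compare in the same shape as the Leq16U rule.
func divisible16(x uint16, d udivisible16) bool {
	return bits.RotateLeft16(x*d.m, 16-d.k) <= d.max
}

func main() {
	d := magic16(12)
	fmt.Println(d.m, d.k, d.max)                        // 43691 2 5461
	fmt.Println(divisible16(36, d), divisible16(37, d)) // true false
}
```

Rotating left by 16-k is a rotate right by k: if any of the low k bits of x*m are set, they land in the top bits and push the result above max, so a single compare covers both the power-of-two part and the odd part of the divisor.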
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+// Copyright 2025 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package main
+
+var divisibleOps = []opData{}
+
+var divisibleBlocks = []blockData{}
+
+func init() {
+	archs = append(archs, arch{
+		name:    "divisible",
+		ops:     divisibleOps,
+		blocks:  divisibleBlocks,
+		generic: true,
+	})
+}
