Skip to content

Conversation

pull[bot]
Copy link

@pull pull bot commented Oct 2, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot locked and limited conversation to collaborators Oct 2, 2025
@pull pull bot added the ⤵️ pull label Oct 2, 2025
erichkeane and others added 28 commits October 15, 2025 13:59
When creating a 'yield', we have to make sure that the lexical scope we
created gets cleaned up.  This isn't really testable until a followup
patch, but I got this wrong in quite a few places.
…nd alike as trivial and returns a retained value (#161135)

Treat NSStringFromSelector, NSSelectorFromString, NSStringFromClass,
NSClassFromString, NSStringFromProtocol, and NSProtocolFromString as
trivial, and treat their return values as a safe pointer origin since
the return value of these functions don't need to be retained.
#161089)

This patch extends stdio redirection support to integrated and external
terminals. Currently, these cases are not covered by the standard logic
because `attach` is used instead of `launch`. To be honest,
`runInTerminal` in
[VSCode](https://github.com/microsoft/vscode/blob/main/src/vs/workbench/contrib/debug/node/terminals.ts#L188)
request supports `>` and `<` for redirection, but not `2>`. We could use
the `argsCanBeInterpretedByShell` option to use full power of shell,
however it requieres proper escaping of arguments on lldb-dap side. So,
I think it will be better to have the only one option to control stdio
redirection that works consistently across all console modes.

Also, fixed a small typo in a comparison that was leading to
out-of-bound access, and added `null` as a possible value for `stdio`
array in package.json.
When [1] landed, gdbremote server tests had to be updated to understand
the new packet field.

[1]: #163249
The BridgeOS SDK is capitalized, but previously failed to parse because
we were looking for bridgeOS. This PR updates the enum value and the
canonical spelling.

rdar://162641896
…g a parameter (#160994)

This PR updates the forward declaration checker so that unary operator &
and * will be ignored for the purpose of determining if a given function
argument is also a function argument of the caller / call-site.
With #161460 merged, we should
also enforce it in CI (although nothing changed for current files)
Instead of just deferring to ptrtoint, we should truncate to the index
width and then perform the ZextOrTrunc.
This is effectively NFC since ptrtoint ends up doing the same thing, but
handling it explicitly is cleaner and will make it easier to eventually
upstream the changes needed for CHERI support.

Reviewed By: nikic, arsenm

Pull Request: #139423
This can improve performance on 32-bit baremetal targets.
…3612)

This patch enhances the performance of `std::distance` and
`std::ranges::distance` for non-random-access segmented iterators, e.g.,
`std::join_view` iterators. The original implementation operates in
linear time, `O(n)`, where `n` is the total number of elements. The
optimized version reduces this to approximately `O(n / segment_size)` by
leveraging segmented structure, where `segment_size` is the average size
of each segment.

The table below summarizes the peak performance improvements observed
across different segment sizes, with the total element count `n` ranging
up to `1 << 20` (1,048,576 elements), based on benchmark results.

```
----------------------------------------------------------------------------------------
Container/n/segment_size                          std::distance     std::ranges::distance
----------------------------------------------------------------------------------------
join_view(vector<vector<int>>)/1048576/256	         401.6x	              422.9x
join_view(deque<deque<int>>)/1048576/256 	         112.1x	              132.6x
join_view(vector<vector<int>>)/1048576/1024         1669.2x              1559.1x
join_view(deque<deque<int>>)/1048576/1024            487.7x               497.4x
```

## Benchmarks


#### Segment size = 1024

```
-----------------------------------------------------------------------------------------
Benchmark                                                    Before      After    Speedup
-----------------------------------------------------------------------------------------
std::distance(join_view(vector<vector<int>>))/50            38.8 ns     1.01 ns     38.4x
std::distance(join_view(vector<vector<int>>))/1024           660 ns     1.02 ns    647.1x
std::distance(join_view(vector<vector<int>>))/4096          2934 ns     1.98 ns   1481.8x
std::distance(join_view(vector<vector<int>>))/8192          5751 ns     3.92 ns   1466.8x
std::distance(join_view(vector<vector<int>>))/16384        11520 ns     7.06 ns   1631.7x
std::distance(join_view(vector<vector<int>>))/65536        46367 ns     32.2 ns   1440.6x
std::distance(join_view(vector<vector<int>>))/262144      182611 ns      114 ns   1601.9x
std::distance(join_view(vector<vector<int>>))/1048576     737785 ns      442 ns   1669.2x
std::distance(join_view(deque<deque<int>>))/50              53.1 ns     6.13 ns      8.7x
std::distance(join_view(deque<deque<int>>))/1024             854 ns     7.53 ns    113.4x
std::distance(join_view(deque<deque<int>>))/4096            3507 ns     14.7 ns    238.6x
std::distance(join_view(deque<deque<int>>))/8192            7114 ns     17.6 ns    404.2x
std::distance(join_view(deque<deque<int>>))/16384          13997 ns     30.7 ns    455.9x
std::distance(join_view(deque<deque<int>>))/65536          55598 ns      114 ns    487.7x
std::distance(join_view(deque<deque<int>>))/262144        214293 ns      480 ns    446.4x
std::distance(join_view(deque<deque<int>>))/1048576       833000 ns     2183 ns    381.6x
rng::distance(join_view(vector<vector<int>>))/50            39.1 ns     1.10 ns     35.5x
rng::distance(join_view(vector<vector<int>>))/1024           689 ns     1.14 ns    604.4x
rng::distance(join_view(vector<vector<int>>))/4096          2753 ns     2.15 ns   1280.5x
rng::distance(join_view(vector<vector<int>>))/8192          5530 ns     4.61 ns   1199.6x
rng::distance(join_view(vector<vector<int>>))/16384        10968 ns     7.97 ns   1376.2x
rng::distance(join_view(vector<vector<int>>))/65536        46009 ns     35.3 ns   1303.4x
rng::distance(join_view(vector<vector<int>>))/262144      190569 ns      124 ns   1536.9x
rng::distance(join_view(vector<vector<int>>))/1048576     746724 ns      479 ns   1559.1x
rng::distance(join_view(deque<deque<int>>))/50              51.6 ns     6.57 ns      7.9x
rng::distance(join_view(deque<deque<int>>))/1024             826 ns     6.50 ns    127.1x
rng::distance(join_view(deque<deque<int>>))/4096            3323 ns     12.5 ns    265.8x
rng::distance(join_view(deque<deque<int>>))/8192            6619 ns     19.1 ns    346.5x
rng::distance(join_view(deque<deque<int>>))/16384          13495 ns     33.2 ns    406.5x
rng::distance(join_view(deque<deque<int>>))/65536          53668 ns      114 ns    470.8x
rng::distance(join_view(deque<deque<int>>))/262144        236277 ns      475 ns    497.4x
rng::distance(join_view(deque<deque<int>>))/1048576       914177 ns     2157 ns    423.8x
-----------------------------------------------------------------------------------------
```



#### Segment size = 256

```
-----------------------------------------------------------------------------------------
Benchmark                                                    Before      After    Speedup
-----------------------------------------------------------------------------------------
std::distance(join_view(vector<vector<int>>))/50            38.1 ns     1.02 ns     37.4x
std::distance(join_view(vector<vector<int>>))/1024           689 ns     2.06 ns    334.5x
std::distance(join_view(vector<vector<int>>))/4096          2815 ns     7.01 ns    401.6x
std::distance(join_view(vector<vector<int>>))/8192          5507 ns     14.3 ns    385.1x
std::distance(join_view(vector<vector<int>>))/16384        11050 ns     33.7 ns    327.9x
std::distance(join_view(vector<vector<int>>))/65536        44197 ns      118 ns    374.6x
std::distance(join_view(vector<vector<int>>))/262144      175793 ns      449 ns    391.5x
std::distance(join_view(vector<vector<int>>))/1048576     703242 ns     2140 ns    328.7x
std::distance(join_view(deque<deque<int>>))/50              50.2 ns     6.12 ns      8.2x
std::distance(join_view(deque<deque<int>>))/1024             835 ns     11.4 ns     73.2x
std::distance(join_view(deque<deque<int>>))/4096            3353 ns     32.9 ns    101.9x
std::distance(join_view(deque<deque<int>>))/8192            6711 ns     64.2 ns    104.5x
std::distance(join_view(deque<deque<int>>))/16384          13231 ns      118 ns    112.1x
std::distance(join_view(deque<deque<int>>))/65536          53523 ns      556 ns     96.3x
std::distance(join_view(deque<deque<int>>))/262144        219101 ns     2166 ns    101.2x
std::distance(join_view(deque<deque<int>>))/1048576       880277 ns    15852 ns     55.5x
rng::distance(join_view(vector<vector<int>>))/50            37.7 ns     1.13 ns     33.4x
rng::distance(join_view(vector<vector<int>>))/1024           697 ns     2.14 ns    325.7x
rng::distance(join_view(vector<vector<int>>))/4096          2804 ns     7.52 ns    373.0x
rng::distance(join_view(vector<vector<int>>))/8192          5749 ns     15.2 ns    378.2x
rng::distance(join_view(vector<vector<int>>))/16384        11742 ns     34.8 ns    337.4x
rng::distance(join_view(vector<vector<int>>))/65536        47274 ns      116 ns    407.7x
rng::distance(join_view(vector<vector<int>>))/262144      187774 ns      444 ns    422.9x
rng::distance(join_view(vector<vector<int>>))/1048576     749724 ns     2109 ns    355.5x
rng::distance(join_view(deque<deque<int>>))/50              53.0 ns     6.09 ns      8.7x
rng::distance(join_view(deque<deque<int>>))/1024             895 ns     11.0 ns     81.4x
rng::distance(join_view(deque<deque<int>>))/4096            3825 ns     30.6 ns    125.0x
rng::distance(join_view(deque<deque<int>>))/8192            7550 ns     60.5 ns    124.8x
rng::distance(join_view(deque<deque<int>>))/16384          14847 ns      112 ns    132.6x
rng::distance(join_view(deque<deque<int>>))/65536          56888 ns      453 ns    125.6x
rng::distance(join_view(deque<deque<int>>))/262144        231395 ns     2034 ns    113.8x
rng::distance(join_view(deque<deque<int>>))/1048576       933093 ns    15012 ns     62.2x
-----------------------------------------------------------------------------------------
```

Addresses a subtask of #102817.

---------

Co-authored-by: Louis Dionne <[email protected]>
Co-authored-by: A. Jiang <[email protected]>
…2780)

This PR adds lowering of xegpu.load_matrix/store_matrix to
xevm.blockload/blockstore or and llvm.load/store, depending on wi level
attributes.
It includes a few components: 
   1. adds wi-level attributes: subgroup_block_io.   
2. expand load_matrix/store_matrix op definition to support scalar data
(besides vector data).
2. adds a member function to mem_desc to compute the linearized address
for a nd offsets.
   3. add lowering depending on wi-level attributes: 
a) if subgroup_block_io attribute presents, lower to
xevm.blockload/blockstore
c) else lower to llvm.load/store. If result is a vector, lower to
llvm.load/store with vector operand.
Various DAP tests are specifying their own timeouts, with values ranging
from "1" to "20". Most of them seem arbitrary, but some come with a
comment.

The performance characters of running these tests in CI are
unpredictable (they generally run much slower than developers expect)
and really not something we can make assumptions about. I suspect these
timeouts are a contributing factor to the flakiness of the DAP tests.

This PR unifies the timeouts around a central value in the DAP server.

Fixes #162523
…163466)

Hexagon packets can visually straddle labels when data (e.g. jump
tables) in the text section does not carry end-of-packet bits. In such
cases the next instruction, even at a new symbol, appears to continue
the previous packet.

This patch resets packet state when encountering a new symbol so that
packets at symbol starts are guaranteed to start in their own packet.
…tions (#162120)

Op constraints are emitted as static standalone functions and need not
be surrounded by the Dialect's C++ namespace. Currently they are, and
this change stops emitting a namespace around these static functions.
…ng to DXIL (#161753)

Introduces LLVM intrinsic `llvm.dx.resource.getdimensions.x` and its lowering to DXIL op `op.dx.getDimensions`.
The intrinsic will be used to implement `GetDimension` for buffers. The lowering is using `undef` value since it is required by the DXIL format which is based on LLVM 3.7.

Proposal update: llvm/wg-hlsl#350

Closes #112982
Reviewers: thurstond, fmayer

Reviewed By: fmayer

Pull Request: #163667
Reviewers: fmayer, thurstond

Reviewed By: fmayer

Pull Request: #163669
…s. (#163658)

* Add a missing dependency to install struct_itimerval.h
* Add fseeko/ftello declarations to stdio.h
Happened with 'undef,' -- comma not being recognized as a word boundary
Mutation `changeElementCountTo` now uses `ElementCount`
… dxc. (#163474)

Implement the frontend change to support maximal reconvergence feature.

The next work is to generate the corresponding SPIR-V instructions
(`OpExtension` and `OpExecutionMode`) based on the llvm ir added in this
CL `"enable-maximal-reconvergence"="true"`.


Addresses #136930
RKSimon and others added 30 commits October 17, 2025 12:52
…ndling (#163567)

Leave this to shuffle folding instead.
a1ef81d added overloads for `llvm.matrix.column.major.store` and
`llvm.matrix.column.major.load` that allow strides to occupy an
arbitrary bitwidth. This change wasn't reflected in the verifier,
causing an assertion to trip when given strides overflowing 64-bit. This
patch explicitly caps the bitwidth at 64, repairing the crash and
avoiding future complexity dealing with strides that overflow 64 bits.

PR: #163729
Added Pattern for lowering `Math::ClampFOp` to `ROCDL::FMED3`.
Also added `chipet` option to `MathToRocdl` pass to check for arch
support ISA instructions

Solves [#15072](#157052)

Reapplies #160100

Un-reverts the merged #163259,
and fixes the error.

---------

Signed-off-by: Keshav Vinayak Jha <[email protected]>
Seems like the function returns a dict, not a list of tuples like the
other.
Not sure of the schema or the design for this, so fixing the code
without using the test-suite name. But this might have to be addressed
once the owner can take a look.
…ucturization (#163942)

This fixes the bug where deserializer would fail, with as assert, during
the control flow structurization when a block to be removed still had
uses. This indicates that the block that was sunk still is being
referenced outside the region as the control flow is not structured.

closes #163099
This PR seeks to improve the clarity of the Shard dialect documentation,
in particular the descriptions of the communication operations.
…tInt based operands. (#159358)

Simplifcation of vector.reduce intrinsics are prevented by an early
bailout for ConstantInt base operands. This PR removes the bailout and
updates the tests to show matching output when
-use-constant-int-for-*-splat is used.
… outside the block

If the copyable entry has the last instruction, used only outside the
block, tha insert ion point for the vector code should be the last
instruction itself, not the following one. It prevents wrong def-use
sequences, which might be generated for the buildvector nodes.

Fixes #163404
HLSL extends C++'s requirement that unary `!` apply to boolean arguments
and produces boolean results to apply to vectors. This change implements
implicit conversion for non-boolean vector operands to boolean vectors.

I've noted this behavior in the issue tracking writing the HLSL
specification section on unary operators
(microsoft/hlsl-specs#686).

Fixes #162913

---------

Co-authored-by: Farzon Lotfi <[email protected]>
Co-authored-by: Erich Keane <[email protected]>
Summary:
The changes in #163011 caused
all ELF platforms to default to ELF mangling. We want to auto upgrade
this for linking in new programs to old ones.
The following errors are seen in our builds using MSVC 2022:

BitstreamRemarkParser.h(115): error C2990: 'llvm::remarks::BitstreamBlockParserHelper': non-class template has already been declared as a class template
BitstreamRemarkParser.h(66): note: see declaration of 'llvm::remarks::BitstreamBlockParserHelper'

This change fixes the build issue by adding an explicit template
argument to the `friend class` statements.

This issue is not seen if using `-std:c++20`, but we still support
building as C++17.
…63914)

This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]],
introduced as part of C++17.
…]] (NFC) (#163915)

This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]],
introduced as part of C++17.
…63916)

This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]],
introduced as part of C++17.
…3917)

This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]],
introduced as part of C++17.
…r mul opcode (#163963)

These now all use the same asm printout code
…163660)

Implements genAllocate, genFree, and genCopy for FIR pointer types
(fir.ref, fir.ptr, fir.heap, fir.llvm_ptr) in the OpenACC
PointerLikeType interface.

- genAllocate: Uses fir.alloca for stack types, fir.allocmem for heap
types. Returns null for dynamic/unknown types (unlimited polymorphic,
dynamic arrays, dynamic character lengths, box types).
- genFree: Generates fir.freemem for heap allocations. Returns false if
original allocation cannot be found.
- genCopy: Uses fir.load+fir.store for trivial types (scalars),
hlfir.assign for non-trivial types (arrays, derived types, characters).
Returns false for unsupported dynamic types and box types.

Adds comprehensive MLIR tests covering various FIR types and edge cases.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.