Description
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request?
Medium
Please provide a clear description of the problem this feature solves
Some algorithms may access the same tile more than once within a CUDA block without knowing it is the same tile. The same tile should not be loaded twice, nor stored twice. An optional mechanism with a bounded, dedicated smem cache size could help with these redundancy issues.
For example, if I'm developing an open-world video game where a player looks around and sees the world, it needs the tiles around the player (assuming a 2D world map). When computing things for the player, tile accesses could be optimized either by the developer caching actively or automatically by cuTile. Because, why not? If it's multiplayer, then 8 players could be in the same cluster and use multicasting too (assuming cloud gaming on a B200 GPU).
Feature Description
Read caching, write caching, and maybe automatic cluster-based multicasting.
Describe your ideal solution
LRU, LFU, direct-mapped, or even multiple layers (block-level L1 -> cluster-level L2 -> TMA); anything with an eviction policy works.
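To illustrate the requested semantics, here is a minimal host-side Python sketch of an LRU tile cache with a hit/miss counter. The `TileCache` class, its `load_tile` callback, and the tile keys are all hypothetical names for illustration only; this is not cuTile API, and a real implementation would hold tiles in shared memory under a fixed byte budget rather than in an `OrderedDict`.

```python
from collections import OrderedDict

class TileCache:
    """Illustrative LRU tile cache (hypothetical; not cuTile API)."""

    def __init__(self, capacity, load_tile):
        self.capacity = capacity      # max number of cached tiles
        self.load_tile = load_tile    # fallback loader (e.g. the actual TMA copy)
        self.entries = OrderedDict()  # tile_index -> tile, ordered by recency
        self.hits = 0
        self.misses = 0

    def get(self, tile_index):
        if tile_index in self.entries:
            # Hit: mark as most recently used, skip the redundant load.
            self.entries.move_to_end(tile_index)
            self.hits += 1
            return self.entries[tile_index]
        # Miss: load once, then evict the least recently used entry if full.
        self.misses += 1
        tile = self.load_tile(tile_index)
        self.entries[tile_index] = tile
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return tile

# Repeated accesses to tile 0 trigger only one load; tile 1 is evicted
# once capacity (2) is exceeded by tile 2.
cache = TileCache(2, lambda i: ("tile", i))
cache.get(0); cache.get(1); cache.get(0); cache.get(2)
```

The same counter-based structure would make it easy to compare eviction policies (LRU vs. LFU vs. direct-mapped) on a given kernel's tile-access trace before committing smem to any one of them.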
Describe any alternatives you have considered
I searched Google for "cuda TMA cache" but found no relevant results.
Additional context
Maybe the Blackwell architecture's tensor memory could be used as a scratchpad for this instead of shared memory?
Contributing Guidelines
- I agree to follow cuTile Python's contributing guidelines
- I have searched the open feature requests and have found no duplicates for this feature request