@sandilya-xilinx
Contributor

…ghput

Processing 1 folder(s)...
Updating xrt_smi_npu3.a from 19 files...
✓ Updated: npu3\xrt_smi_npu3.a
  ➕ New files added (4):
    + profile_cmd_chain_latency.json
    + profile_cmd_chain_throughput.json
    + recipe_cmd_chain_latency.json
    + recipe_cmd_chain_throughput.json

============================================================
SUMMARY:
Archives processed: 1/1
Total new files: 4
Total updated files: 0

Archive changes:
  xrt_smi_npu3.a: 4 new

@sandilya-xilinx
Contributor Author

@sandilya-xilinx Please make use of the depth and mode fields in profile.json instead of copying each kernel multiple times https://github.com/Xilinx/XRT/blob/master/src/runtime_src/core/common/runner/profile.md#:~:text=%22mode%22%3A%20mode%20%20%20%20%20%20%20%20%20%20%20//%20latency%2C%20throughput%2C%20or%20validate%0A%20%20%20%20%22depth%22%3A%20depth%20%20%20%20%20%20%20%20%20//%20clone%20the%20recipe%20runlist

Yes, sure. I am testing this and will update the PR accordingly. Until then, this is provided only as a reference for anyone who wants to debug.
My tests show that if the depth is less than 6 (<= 5), the behavior does not change; the commands are sent individually instead. I will test further and then update. Until then, the PR is labelled Do Not Merge.
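For anyone reproducing the depth behaviour, here is a minimal sketch of generating one profile per depth value instead of duplicating kernels in the recipe. Only the `mode` and `depth` field names and the three mode values come from the linked profile.md. The enclosing `execution` object, the `iterations` and `verbose` keys, and the helper names `make_profile` and `depth_sweep` are assumptions for illustration and should be checked against the actual schema.

```python
import json

# Mode values as listed in the linked profile.md.
VALID_MODES = ("latency", "throughput", "validate")


def make_profile(mode: str, depth: int, iterations: int = 10) -> dict:
    """Build a profile dict that sets mode and depth.

    ASSUMPTION: the fields live under an "execution" object next to
    "iterations" and "verbose"; verify against profile.md before use.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return {
        "execution": {
            "iterations": iterations,
            "verbose": False,
            "mode": mode,
            "depth": depth,
        }
    }


def depth_sweep(mode: str, depths) -> dict:
    """Return {depth: JSON text} so each depth can be run separately."""
    return {d: json.dumps(make_profile(mode, d), indent=2) for d in depths}


if __name__ == "__main__":
    # Sweep across the reported boundary (depth 5 vs. depth 6).
    for depth, text in depth_sweep("latency", range(4, 8)).items():
        print(f"--- depth {depth} ---")
        print(text)
```

Running the same recipe with each generated profile on either side of depth 6 should make it easier to confirm whether the commands are chained or sent individually.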

Copilot AI left a comment

Pull Request Overview

This PR adds command-chain profiling recipes for measuring latency and throughput on NPU3 devices. The changes introduce configuration files that define execution parameters for benchmarking command-chain operations.

  • Adds two new recipe directories for cmd_chain_latency and cmd_chain_throughput
  • Configures both recipes to run 10 iterations in non-verbose mode

Reviewed Changes

Copilot reviewed 2 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| archive/npu3/cmd_chain_throughput/profile_cmd_chain_throughput.json | Configuration for throughput profiling with 10 iterations |
| archive/npu3/cmd_chain_latency/profile_cmd_chain_latency.json | Configuration for latency profiling with 10 iterations |
