Conversation

@ahoppen (Member) commented Apr 10, 2025

This adds the infrastructure to take performance measurements for PRs: the project is measured before the PR's changes and again with the changes applied, and the two results are compared.

This will become really useful once we can switch this to a macOS runner and are then able to measure instructions executed by an executable (like swift-format supports via `swift-format --measure-instructions`). Instruction counting isn't available on Linux, as far as I could find out.

But even now, we can use this to track other metrics like binary size.

Example run here (ignore the actual measurement; it's bogus): ahoppen/swift-format#8 (comment)
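For illustration, the before/after idea for a metric like binary size boils down to building and measuring the product at both revisions in one job. A minimal sketch, assuming a Linux runner; `BASE_SHA` and `HEAD_SHA` are placeholders, not the actual workflow code from this PR:

```bash
# Hypothetical sketch of the before/after comparison, not the PR's actual workflow.
# BASE_SHA and HEAD_SHA stand in for the merge base and the PR head revision.
git checkout "$BASE_SHA"
swift build -c release
base_size=$(stat -c %s .build/release/swift-format)

git checkout "$HEAD_SHA"
swift build -c release
head_size=$(stat -c %s .build/release/swift-format)

echo "binary size: $base_size -> $head_size bytes"
```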

@ahoppen requested a review from a team as a code owner on April 10, 2025.
@FranzBusch (Member) left a comment

I'm curious to understand the direction here more. In the server ecosystem we are using the package-benchmark package to write performance tests for our packages. Packages like Foundation have also started adopting it. Could we use that instead of the custom scripts here?

@ahoppen (Member, Author) commented Apr 11, 2025

Do you have an example of a package using the package-benchmark package for performance measurements?

@FranzBusch (Member):
Yes, you can see it here: https://github.com/apple/swift-certificates/tree/main/Benchmarks.

This package only uses allocation metrics, but another good metric is CPU instructions, since those are stable on CI. I would really encourage us to adopt this approach instead of hand-rolling performance scripts.
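For reference, a benchmark package laid out like the swift-certificates one is driven through package-benchmark's SwiftPM plugin; a rough sketch of the commands (as documented by the package, so details may differ by version):

```bash
cd Benchmarks                 # the separate benchmark package, as in swift-certificates
swift package benchmark list  # list the available benchmark targets
swift package benchmark       # run all benchmarks and print the configured metrics
```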

@ahoppen (Member, Author) commented Apr 25, 2025

Great suggestion to use the benchmarks package. I have updated the workflow to be based on it. An example run can be found here: ahoppen/swift-format#10 (comment)

@FranzBusch (Member) left a comment

Great stuff. I would like @rnro to take a look at this as well. We had some plans to upstream the existing benchmark workflow from NIO (https://github.com/apple/swift-nio/blob/main/.github/workflows/benchmarks.yml) here. That workflow runs benchmarks against multiple Swift versions, which we found very helpful since it allowed us to catch compiler regressions and improvements quickly.

@ktoso (Contributor) commented Apr 25, 2025

Very nice! Great to spread more use of the benchmark package <3

FYI @ser-0xff & @hassila :-)

@ahoppen (Member, Author) commented Apr 25, 2025

One thing worth mentioning here: this workflow does not use a fixed baseline, which makes it very easy to operate. We don't need to ensure that performance testing always runs on exactly the same kind of machine; we don't even need to guarantee that the Swift version stays constant.

The flip side is that we can't use it to test for compiler regressions, because runs with different compilers would necessarily happen in different containers or on different nodes. That sort of testing has different trade-offs, so if we want it, I think we should add a separate kind of workflow for it.
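A sketch of what the no-fixed-baseline, same-job comparison described above can look like with package-benchmark's baseline subcommands (the baseline names and SHA variables are placeholders; the actual workflow steps may differ):

```bash
# Measure the base revision and record it as a named baseline.
git checkout "$BASE_SHA"
swift package --allow-writing-to-package-directory benchmark baseline update base

# Measure the PR revision in the same job, on the same machine and toolchain.
git checkout "$PR_SHA"
swift package --allow-writing-to-package-directory benchmark baseline update pr

# Compare the two baselines, which were produced under identical conditions.
swift package benchmark baseline compare base pr
```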

@hassila commented Apr 25, 2025

@ahoppen, for the record, CPU instruction counting is supported on Linux, but you must have permission to make the perf_event_open syscall. If you modify your local package-benchmark copy (or make a fork) and uncomment this line in the package:

//            fprintf(stderr, "Can't enable performance counters for instructions metric, error in perf_event_open syscall, failed with [%d], error: %s\n", errorCode, strerror(errorCode));

you will probably get a good hint. Most likely it's missing permissions.
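If it is a permissions problem, these general Linux checks (not specific to package-benchmark) usually confirm it quickly:

```bash
cat /proc/sys/kernel/perf_event_paranoid      # values >= 2 block unprivileged counter access
sudo sysctl -w kernel.perf_event_paranoid=1   # relax it, where the environment allows
```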

@hassila commented Apr 25, 2025

Or try running `perf` directly and see if that works, for example:

perf stat -e instructions /bin/ls

@ahoppen (Member, Author) commented Apr 29, 2025

Thanks for chiming in, @hassila. I think the problem is that we will be running the performance tests inside a Docker container in GitHub Actions, and my understanding is that you don't have access to perf_event_open in these setups. If you know better, I would love to hear about it.

@hassila commented Apr 29, 2025

> Thanks for chiming in, @hassila. I think the problem is that we will be running the performance tests inside a Docker container in GitHub Actions, and my understanding is that you don't have access to perf_event_open in these setups. If you know better, I would love to hear about it.

Oh, I misread the description as saying that instruction counts were not available from the benchmark package, but you probably meant in Docker. A quick Google search turned up this workaround:
https://medium.com/@geekidea_81313/running-perf-in-docker-kubernetes-7eb878afcd42

Also ping @freef4ll - maybe you know for sure.
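For what it's worth, workarounds like the one in the linked article generally come down to granting the container extra privileges (standard Docker flags; `my-image` is a placeholder):

```bash
# Bluntest option: run the container fully privileged.
docker run --privileged my-image perf stat -e instructions /bin/ls

# Narrower: add CAP_SYS_ADMIN and disable the default seccomp profile,
# which otherwise blocks the perf_event_open syscall.
docker run --cap-add SYS_ADMIN --security-opt seccomp=unconfined \
    my-image perf stat -e instructions /bin/ls
```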

@ahoppen (Member, Author) commented Apr 29, 2025

I tried something like this before without success; I think GitHub Actions doesn't allow these extended privileges. But maybe I'll give it another try when I find the time.

@freef4ll commented Apr 30, 2025

Unfortunately it does not look like GitHub public runners expose instruction PMU counters (I got a shell into a running Actions job via https://github.com/mxschmitt/action-tmate):

root@fv-az1283-31:~# perf stat sleep 1

 Performance counter stats for 'sleep 1':

              0.70 msec task-clock                       #    0.001 CPUs utilized
                 1      context-switches                 #    1.435 K/sec
                 0      cpu-migrations                   #    0.000 /sec
                79      page-faults                      #  113.371 K/sec
   <not supported>      cycles
   <not supported>      instructions
   <not supported>      branches
   <not supported>      branch-misses

       1.001178359 seconds time elapsed

       0.001080000 seconds user
       0.000000000 seconds sys

The VMs run in Azure under Hyper-V, and while it appears that PMU events can be supported there, they are not enabled.

@ahoppen (Member, Author) commented May 5, 2025

It's unfortunate that GitHub Actions runners don't support instruction counters by default, but thanks a lot for verifying my analysis, @freef4ll!

@rnro (Contributor) commented May 7, 2025

This looks great! I think it serves a different purpose to the use case we have in the swift-server repos, where we want to compare against a fixed baseline so that we can check not only PRs for regressions but also scheduled runs. I think having both is a good thing.

@FranzBusch (Member) left a comment

LGTM. Just two minor comments.

          else
            echo "has_significant_changes=false" >> "$GITHUB_OUTPUT"
          fi
      - name: Install gh
Review comment (Member):

Wondering if we would be better off just moving this into a separate job, since gh is installed on the runners by default.

Reply (Member Author):

I tried this before, but if you run it as a separate job, it shows up as a separate item in the pull request status, which is confusing.

          apt update
          apt install gh -y
      - name: Post comment
        if: ${{ steps.compare_performance.outputs.has_significant_changes == 'true' }}
Review comment (Member):

Should we unconditionally post this? Otherwise it might seem like nothing happened.

Reply (Member Author):

I don't think so. Most PRs don't change a package's performance, and since we run this unconditionally, a comment saying that performance stayed constant would just create noise; users would get used to such comments and might not notice when one highlights a real performance change. If you want to double-check that the workflow ran, you can open the GitHub Actions log.
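For reference, posting such a comment from a workflow step can be done with the gh CLI's built-in command; a sketch with placeholder names (the PR's actual step may differ):

```bash
# Assumes GH_TOKEN is set in the environment, as is typical in GitHub Actions.
gh pr comment "$PR_NUMBER" --body-file benchmark-results.md
```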

@ahoppen merged commit b258470 into swiftlang:main on May 12, 2025; 25 of 26 checks passed.
@ahoppen deleted the performance-test branch on May 12, 2025.