Verify SHA256 of jemalloc-$version.tar.bz2 to protect against supply chain attacks #27

Closed. rofreg wanted to merge 2 commits.

@rofreg commented Oct 15, 2021

Hey there! Thanks so much for maintaining this repo – it's been a super useful buildpack for us at Splitwise, and seriously improved the memory usage of our Rails app.

This PR fixes a small-but-meaningful supply chain vulnerability for users of this buildpack. Right now this is only a proof-of-concept (SHAs for all the other stack + jemalloc version combos still need to be generated), but I wanted to open the PR at this point to ask for feedback.

The vulnerable scenario

  • Company A adds the heroku-buildpack-jemalloc buildpack to their Heroku app.
    • Company A includes a commit SHA at the end of the buildpack URL, which lets them specify the exact commit being included in the buildpack. Even if an attacker takes over this repo and pushes a new malicious update to the buildpack code, the inclusion of the commit SHA helps prevent that attacker's buildpack code from making it into Company A's Heroku app.
  • However, this buildpack ALSO downloads the file located at https://github.com/gaffneyc/heroku-buildpack-jemalloc/releases/download/$STACK/jemalloc-$version.tar.bz2
    • Unfortunately, this download is NOT verified – if an attacker gained access to this repo, they could delete an existing release and replace it with malicious code.

In practice, the caching check in the buildpack helps to mitigate this vulnerability somewhat, but in any scenario where the buildpack actually downloads the file (when first deploying a new app; after clearing the Heroku build cache; after changing the Heroku stack version or jemalloc version), the Heroku app is vulnerable. Thus, if this GitHub repo were ever to be compromised in some way, users of this buildpack would not have a way to guard themselves against downloading and running untrusted code.

Remediation

A straightforward way to fix this is to verify the SHA256 of jemalloc-$version.tar.bz2, similar to how including a SHA in the buildpack URL allows an app to verify the code of the buildpack itself. This would mean that when new releases of jemalloc-$version.tar.bz2 are added, the buildpack repo would also need to be updated to include SHA256 files for those releases. Thus, if a user of the buildpack chooses to lock the buildpack to a certain Git SHA, they have effectively protected themselves from the supply chain attack described above.
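The verification step described above can be sketched as follows. This is a minimal illustration in Python, not the buildpack's actual shell code; the function names `sha256_of_file` and `verify_download` are hypothetical, and the expected digest is assumed to come from a file committed to the buildpack repo.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> None:
    """Raise ValueError if the file's digest differs from the pinned one.

    expected_hex is assumed to be stored in the buildpack repo, so it is
    pinned by the same commit SHA as the buildpack code itself.
    """
    actual = sha256_of_file(path)
    expected = expected_hex.strip().lower()
    if not hmac.compare_digest(actual, expected):
        raise ValueError(
            f"SHA256 mismatch for {path.name}: expected {expected}, got {actual}"
        )
```

The build would call `verify_download` right after fetching the tarball and abort on `ValueError`, before anything is unpacked or executed.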

Again, thanks so much for your work on this buildpack! As I mentioned, currently this PR only includes a SHA256 for a single Heroku stack + jemalloc version combo, but if this fix makes sense to the maintainer, then it should be straightforward to extend this solution to all existing version combos.

@gaffneyc (Owner) commented

Thanks! I'm really glad you've found it useful.

Overall I like your proof of concept. It has some good decisions with pinning to a specific commit and storing the SHAs in the repo so any updates to the binaries would need an associated commit and rationale. I've noticed the most common reason to fork the repo is to control where the binaries are downloaded from to avoid these kinds of supply chain attacks.

I've had an alternative approach in my head for a while that I'd like to get your take on. Instead of being supplied with binaries, the buildpack would download the jemalloc source, then compile and cache it. This has the benefit of not needing to trust that the builds I've uploaded haven't been tampered with in some way. It also means that we don't need to provide new binaries with each jemalloc release or Heroku platform update. I took this approach with a spiped buildpack a while back but haven't ported it to jemalloc yet.

Thoughts?

@rofreg (Author) commented Oct 19, 2021

Mm, that makes a ton of sense! As you say, it'd eliminate the need for you to upload so many binaries, and to upload new binaries with each Heroku platform update. As you did for the spiped buildpack, I imagine the checksum for each jemalloc version would be included in this buildpack, and then bin/compile would download the source from the official GitHub releases?
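One way to picture "the checksum for each jemalloc version would be included in this buildpack" is a small manifest lookup that fails closed when a version has no pinned checksum. The sketch below is in Python for illustration only; the manifest format (lines in `sha256sum` output style), the tarball naming, and the function names are assumptions rather than anything confirmed in this thread.

```python
from typing import Dict


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse 'sha256sum'-style lines ('<hex digest>  <filename>') into a dict."""
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"malformed manifest line: {raw!r}")
        digest, name = parts[0].lower(), parts[1].lstrip("*")
        if len(digest) != 64 or not set(digest) <= set("0123456789abcdef"):
            raise ValueError(f"not a SHA256 hex digest: {parts[0]!r}")
        entries[name] = digest
    return entries


def expected_checksum(manifest: Dict[str, str], version: str) -> str:
    """Look up the pinned digest for a version; refuse unknown versions."""
    name = f"jemalloc-{version}.tar.bz2"  # assumed tarball naming
    if name not in manifest:
        raise LookupError(f"no pinned checksum for {name}; refusing to build")
    return manifest[name]
```

Failing closed matters here: a requested version with no committed checksum should stop the build rather than fall back to an unverified download.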

I'll see if I can update this PR to go in that direction sometime in the coming days. Adding the full list of checksums seems straightforward enough. I'm less familiar with how to build a binary within bin/compile, but your link to the spiped buildpack seems like a great example to adapt.

gaffneyc added a commit that referenced this pull request Oct 20, 2021
As brought up in #27 there are possible supply chain attacks against
anyone using this buildpack. The main concern is that a compromised
binary could be uploaded and there is no way to check that the hosted
binaries are trusted. Going a level deeper there is no way to verify
that the compiled binaries I've uploaded can be trusted (I promise
they're fine!).

This goes a step farther and removes one more random dude on the
Internet (me!) from the chain of trust. By compiling from source and
posting known checksums of those sources it is possible to verify
exactly what is being built and to be notified if the source changes.
This has the added benefit of not needing to provide binaries for each
new release of Jemalloc or launch of a new Heroku stack. Builds are now
cached per version and stack so that changing either will cause a
rebuild on the first deploy.
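The "cached per version and stack" behaviour in the commit message above can be sketched as a cache path keyed on both values. This is an illustrative Python sketch, not the code from #28; the directory layout, the `lib/libjemalloc.so` marker file, and the function names are assumptions.

```python
from pathlib import Path


def cache_path(cache_dir: Path, stack: str, version: str) -> Path:
    """Build a cache location keyed on both the Heroku stack and jemalloc version."""
    return cache_dir / "jemalloc" / stack / version


def needs_build(cache_dir: Path, stack: str, version: str) -> bool:
    """Return True when no cached build exists for this stack/version pair.

    The presence of lib/libjemalloc.so is used as the marker of a finished
    build; that marker is an assumption made for this sketch.
    """
    marker = cache_path(cache_dir, stack, version) / "lib" / "libjemalloc.so"
    return not marker.is_file()
```

Because the stack and the version are both part of the path, changing either one misses the cache and triggers a fresh, checksum-verified compile on the next deploy.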
@gaffneyc (Owner) commented

@rofreg Had some time today and took a stab at implementing it in #28. I haven't tested it on Heroku yet but will do that when I get a chance.

@rofreg rofreg closed this Oct 20, 2021
@rofreg (Author) commented Oct 20, 2021

Closing in favor of #28!

gaffneyc added a commit that referenced this pull request May 7, 2022
nenadfilipovic added a commit to nenadfilipovic/heroku-buildpack-jemalloc that referenced this pull request Aug 11, 2022
* Change to compile jemalloc from source

As brought up in gaffneyc#27 there are possible supply chain attacks against
anyone using this buildpack. The main concern is that a compromised
binary could be uploaded and there is no way to check that the hosted
binaries are trusted. Going a level deeper there is no way to verify
that the compiled binaries I've uploaded can be trusted (I promise
they're fine!).

This goes a step farther and removes one more random dude on the
Internet (me!) from the chain of trust. By compiling from source and
posting known checksums of those sources it is possible to verify
exactly what is being built and to be notified if the source changes.
This has the added benefit of not needing to provide binaries for each
new release of Jemalloc or launch of a new Heroku stack. Builds are now
cached per version and stack so that changing either will cause a
rebuild on the first deploy.

* Add checksum for 5.3.0

https://github.com/jemalloc/jemalloc/releases/tag/5.3.0

Co-authored-by: Chris Gaffney <[email protected]>