[Gradle] Added deep-scanning of gradle modules #1380
Signed-off-by: Roland Asmann <[email protected]>
… to achieve correct IDs of those modules Signed-off-by: Roland Asmann <[email protected]>
What is the time penalty for fineract and elasticsearch?
@malice00 can we wrap this behind a feature flag? See safe-pip-install for example. There is a utils method available. Alternatively, we can bump the minor version and include this as a breaking change.
Custom-diff results for this PR:
@aryan-rajoria I assume build-conventions and build-tools:reaper only exist once in the project? I'll take another look at what's going on, maybe I've missed something there...
@malice00 build-conventions etc. have different group names, so probably ok. We can make it the default in 10.10.x, since the trees are looking better. Thank you so much for your help!
My test cases work fine with this change, but they don't include any deep Gradle projects.
Thanks for merging, but I'm not quite happy with this yet... Good to have elasticsearch as a test project, it might help fine-tune this just a little more...
@malice00 in non-multithreaded mode, there are too many duplicate components for the elasticsearch repo. Shall we work towards making the multi-threaded mode the default?
@prabhu The current solution has too many duplicates in either mode, but I think I found the issue. Still running tests though... Multi-threaded as default would be awesome, the performance gains are unbelievable!
So after some changes and testing, I had a solution which imo was correct, but there was still a difference in the number of modules in elasticsearch. After some manual comparisons I had a hunch about what might be going on, so I added some extra output in cdxgen and my suspicions were indeed correct: several modules in elasticsearch have the same
In the previous code this was not an issue, since the module name was used in the purl (which imo would actually generate an invalid purl, since the name contained a ':') and there would be no duplicates. For the 'qa' modules in the above list, I don't think it makes much of a difference (except maybe for the correctness of the tree), since they (or at least those I checked manually) don't actually have any dependencies. The 'test-fixtures' modules, however, do have their own dependencies, and those would now get bunched together as if they belonged to a single module. That might make finding a (vulnerable) dependency a bit harder... I'm still working on some fine-tuning, but I would like some input on how best to handle this (output a warning / add a switch to disable deep-scanning / add a switch to use the module name for duplicates / ...) before committing and actually 'destroying' newer SBOMs for some projects...
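To make the collision concrete, here is a minimal sketch in Python (not cdxgen's actual code) of the "output a warning" option. It assumes the collision arises because modules with the same leaf name, group, and version end up with the same purl. The module paths, the `org.example` group, and all function names are hypothetical and were chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical Gradle project paths; NOT taken from the elasticsearch build.
MODULE_PATHS = [
    ":plugins:alpha:test-fixtures",
    ":plugins:beta:test-fixtures",
    ":plugins:alpha:qa",
    ":server",
]


def leaf_name(project_path):
    """Return the last segment of a Gradle project path (':a:b:c' -> 'c')."""
    return project_path.rsplit(":", 1)[-1]


def module_purl(group, name, version):
    """Build a Maven-type purl from group, name, and version."""
    return f"pkg:maven/{group}/{name}@{version}"


def find_collisions(paths, group="org.example", version="1.0.0"):
    """Map each purl shared by more than one module to the module paths using it.

    Assumption: every module shares the same group and version, so only the
    leaf name distinguishes the purls.
    """
    by_purl = defaultdict(list)
    for path in paths:
        by_purl[module_purl(group, leaf_name(path), version)].append(path)
    return {purl: ps for purl, ps in by_purl.items() if len(ps) > 1}


collisions = find_collisions(MODULE_PATHS)
for purl, paths in collisions.items():
    print(f"WARNING: {len(paths)} modules share {purl}: {', '.join(paths)}")
```

With these sample paths, only the two `test-fixtures` modules are reported; their dependencies are the ones that would be merged under a single component.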
@malice00 This is a good observation. Any findings related to multiple third-party dependencies appearing in the SBOM? For example, multiple commons-compress entries in non-multithreaded mode.
@prabhu I still have those, but they appear in both single- and multi-threaded mode, so I guess this is actually what Gradle is reporting. There are still differences between the two versions, but I already explained my findings on that, although I'm not going to check that manually again in elastic; it's just too big for that! Here are 2 SBOMs I just generated. Single-threaded took around 150 minutes, multi-threaded only 60 seconds! It can be even faster, but that's a change that isn't stable on my side yet. Anyway,
For performance reasons, scanning of Gradle modules was disabled for modules deeper than the first level. This does, however, mean that your module information might be wrong. The performance cost obviously depends on the number of modules below the first level, but using the multi-threaded version of the Gradle scanner makes it negligible.
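As a rough illustration of the difference between first-level-only and deep scanning, here is a small Python sketch. It is not how cdxgen discovers modules; the nested-dict module tree and the `collect_modules` helper are hypothetical stand-ins for whatever Gradle reports.

```python
# Hypothetical module tree: each key is a module name, each value its submodules.
TREE = {
    "server": {},
    "plugins": {
        "alpha": {"test-fixtures": {}},
        "beta": {},
    },
}


def collect_modules(tree, parent="", max_depth=None, depth=1):
    """Return Gradle-style project paths, optionally stopping at max_depth.

    max_depth=1 mimics the old first-level-only behaviour;
    max_depth=None walks the whole tree (deep scanning).
    """
    paths = []
    for name, children in tree.items():
        path = f"{parent}:{name}"
        paths.append(path)
        if max_depth is None or depth < max_depth:
            paths.extend(collect_modules(children, path, max_depth, depth + 1))
    return paths


first_level = collect_modules(TREE, max_depth=1)
deep = collect_modules(TREE)
print(f"first level: {len(first_level)} modules, deep scan: {len(deep)} modules")
```

In this toy tree the first-level scan sees 2 modules and the deep scan sees 5, so `:plugins:alpha:test-fixtures` and its dependencies only show up with deep scanning.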
Currently the optimization is simply removed. I could, however, also add a switch (e.g. an environment variable) to (de-)activate it, although personally I think the scan should be complete and not rely on any (sometimes weird) defaults.
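A switch like that could look roughly like the following Python sketch. The variable name `GRADLE_DEEP_SCAN` and the helper are hypothetical (no such variable is defined in this PR), and deep scanning is assumed to be on by default so that the scan stays complete unless someone opts out.

```python
import os

# Hypothetical variable name, used only for this illustration.
FLAG_NAME = "GRADLE_DEEP_SCAN"
TRUTHY = {"1", "true", "yes", "on"}


def deep_scan_enabled(env=None, default=True):
    """Return True if deep scanning should run.

    An unset or empty variable falls back to `default`; any other value is
    compared case-insensitively against a small set of truthy strings.
    """
    env = os.environ if env is None else env
    raw = env.get(FLAG_NAME)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUTHY


print("deep scan enabled:", deep_scan_enabled())
```

Passing the environment in as a parameter keeps the helper easy to test without touching the real process environment.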
Also, for testing, I could maybe add fineract to the repotests...
Note: this fix builds on #1379!