fix: ensure %2F in purl names is handled properly#2156
fix: ensure %2F in purl names is handled properly#2156jcrossley3 merged 2 commits intoguacsec:mainfrom
Conversation
Reviewer's GuideFixes handling of percent-encoded slashes in purl names by tightening PackageUrl construction/formatting and extending SBOM and Purl tests, while updating csaf and packageurl dependencies to patched git branches that include the encoding fix. Sequence diagram for updated PackageUrl construction in OSV translate and split_namesequenceDiagram
participant Caller
participant Ecosystem
participant Translate as translate
participant PackageUrl
Caller->>Translate: translate(ecosystem, name)
Translate->>Ecosystem: ecosystem.ty_for(name)
Ecosystem-->>Translate: ty
alt Maven-style name with group and artifact
Translate->>Translate: split_name(name, "maven", ":")
alt name contains separator
Translate->>Translate: join all but last segment as namespace
Translate->>Translate: take last segment as simple_name
Translate->>PackageUrl: new("maven", simple_name)
PackageUrl-->>Translate: Result<PackageUrl, Error>
alt Ok(package_url)
Translate->>PackageUrl: with_namespace(namespace)
PackageUrl-->>Translate: Result<&mut PackageUrl, Error>
alt Ok(mut_ref)
Translate->>PackageUrl: add_qualifier("repository_url", repo)
PackageUrl-->>Translate: Result<&mut PackageUrl, Error>
Translate-->>Caller: Some(PackageUrl)
else Err
Translate-->>Caller: None (namespace error propagated)
end
else Err
Translate-->>Caller: None (creation error)
end
else no separator
Translate-->>Caller: None
end
else other ecosystems or simple names
Translate->>PackageUrl: new(ty, name)
PackageUrl-->>Translate: Result<PackageUrl, Error>
alt Ok(package_url)
Translate-->>Caller: Some(PackageUrl)
else Err
Translate-->>Caller: None
end
end
Class diagram for updated Purl formatting and PackageUrl constructionclassDiagram
class Purl {
+ty: String
+name: String
+namespace: Option~String~
+version: Option~String~
+qualifiers: HashMap~String, String~
+from_str(input: &str) Result~Purl, Error~
+to_string() String
}
class PurlDisplayImpl {
+fmt(f: &mut Formatter) Result~(), fmt::Error~
}
class PurlDebugImpl {
+fmt(f: &mut Formatter) Result~(), fmt::Error~
}
class PackageUrl~'a~ {
+new(ty: &str, name: &str) Result~PackageUrl~'a~, Error~
+with_namespace(namespace: &str) Result~&mut PackageUrl~'a~, Error~
+with_version(version: &str) Result~&mut PackageUrl~'a~, Error~
+add_qualifier(key: &str, value: &str) Result~&mut PackageUrl~'a~, Error~
+to_string() String
}
Purl ..> PurlDisplayImpl : implements Display
Purl ..> PurlDebugImpl : implements Debug
PurlDisplayImpl ..> PackageUrl~'a~ : constructs and configures
%% Updated behavior:
%% - Errors from PackageUrl::new, with_namespace, with_version, add_qualifier
%% are now propagated as fmt::Error in PurlDisplayImpl.fmt
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
split_name, the newwith_namespace(namespace).unwrap()reintroduces a potential panic path; consider propagating the error (e.g., viaand_thenor?in aResult-returning helper) instead of unwrapping. - Cargo.toml now declares
packageurltwice (once with a version and once as a git dependency); it would be clearer to consolidate this into a single dependency specification or use[patch]so it’s obvious which source is authoritative.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `split_name`, the new `with_namespace(namespace).unwrap()` reintroduces a potential panic path; consider propagating the error (e.g., via `and_then` or `?` in a `Result`-returning helper) instead of unwrapping.
- Cargo.toml now declares `packageurl` twice (once with a version and once as a git dependency); it would be clearer to consolidate this into a single dependency specification or use `[patch]` so it’s obvious which source is authoritative.
## Individual Comments
### Comment 1
<location> `common/src/purl.rs:241` </location>
<code_context>
let mut purl = PackageUrl::new(&self.ty, &self.name).map_err(|_| fmt::Error)?;
if let Some(ns) = &self.namespace {
- purl.with_namespace(ns);
+ purl.with_namespace(ns).map_err(|_| std::fmt::Error {})?;
}
if let Some(version) = &self.version {
</code_context>
<issue_to_address>
**issue (bug_risk):** Use `fmt::Error` directly instead of constructing `std::fmt::Error {}`
`fmt::Error` is a unit struct, so `std::fmt::Error {}` is invalid and won’t compile. Instead, either `use std::fmt::Error;` and write `map_err(|_| Error)?`, or follow the existing pattern and use `map_err(|_| fmt::Error)?`. Apply the same fix to the `with_version` call as well.
</issue_to_address>
### Comment 2
<location> `common/src/purl.rs:244` </location>
<code_context>
}
if let Some(version) = &self.version {
- purl.with_version(version);
+ purl.with_version(version).map_err(|_| std::fmt::Error {})?;
}
for (key, value) in &self.qualifiers {
</code_context>
<issue_to_address>
**issue (bug_risk):** Fix construction of `fmt::Error` on `with_version` error mapping
As with `with_namespace`, `std::fmt::Error {}` isn’t a valid constructor here and will not compile. Use `map_err(|_| fmt::Error)?` (or an equivalent mapping) so the error conversion is valid and consistent.
</issue_to_address>
### Comment 3
<location> `modules/ingestor/src/service/advisory/osv/translate.rs:63` </location>
<code_context>
PackageUrl::new(ty, name)
.map(|mut purl| {
- purl.with_namespace(namespace);
+ purl.with_namespace(namespace).unwrap();
purl
})
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Avoid `unwrap()` on `with_namespace` to prevent panics and keep error handling consistent
In `translate` you now propagate `with_namespace` errors with `?`, but here `unwrap()` reintroduces a panic on malformed namespaces. Please handle the error instead (e.g., propagate it, or map the failure to `None` for this helper) to stay consistent and avoid panics.
</issue_to_address>
### Comment 4
<location> `modules/fundamental/src/sbom/endpoints/test.rs:1727-1739` </location>
<code_context>
+
+ // Assert that both packages (generic & huggingface) return the same SBOM
+ // First huggingface...
+ let uri = format!(
+ "/api/v2/sbom/by-package?purl={}",
+ encode("pkg:huggingface/ibm-granite/[email protected]")
+ );
+ let req = TestRequest::get().uri(&uri).to_request();
+ let response: Value = app.call_and_read_body_json(req).await;
+ assert_eq!(
</code_context>
<issue_to_address>
**suggestion (testing):** Add a direct assertion that the huggingface request can be made with an already-encoded slash as well
This block covers the case where a literal `/` in the huggingface purl is translated to `%2F` in the generic purl. Another real-world edge case is when clients pre-encode the `/` in the huggingface purl itself (e.g. `pkg:huggingface/ibm-granite%[email protected]`).
Please add a third request that uses a huggingface purl with `%2F` in the name (still wrapped in `encode(...)` for the query parameter) and asserts it resolves to the same generic SBOM id. That will verify the endpoint behaves correctly whether or not the client pre-encodes the slash in the name segment.
```suggestion
// Assert that both packages (generic & huggingface) return the same SBOM
// First huggingface with a literal '/' in the name...
let uri = format!(
"/api/v2/sbom/by-package?purl={}",
encode("pkg:huggingface/ibm-granite/[email protected]")
);
let req = TestRequest::get().uri(&uri).to_request();
let response: Value = app.call_and_read_body_json(req).await;
assert_eq!(
response["items"][0]["described_by"][0]["id"],
"pkg:generic/ibm-granite%[email protected]"
);
// Then huggingface with a pre-encoded '/' in the name...
let uri = format!(
"/api/v2/sbom/by-package?purl={}",
encode("pkg:huggingface/ibm-granite%[email protected]")
);
let req = TestRequest::get().uri(&uri).to_request();
let response: Value = app.call_and_read_body_json(req).await;
assert_eq!(
response["items"][0]["described_by"][0]["id"],
"pkg:generic/ibm-granite%[email protected]"
);
// And then generic...
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2156 +/- ##
==========================================
- Coverage 68.74% 68.72% -0.03%
==========================================
Files 397 397
Lines 22305 22305
Branches 22305 22305
==========================================
- Hits 15334 15329 -5
+ Misses 6070 6066 -4
- Partials 901 910 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
66039c2 to
01b1d43
Compare
|
@sourcery-ai review |
|
I'll release the |
I wouldn't think so. Only those with %2F in their purl names, and it's only through silently failing actions within the UI that anyone would even notice, at which point one could simply re-upload the SBOM in question. |
Can the formulate a SQL query finding such SBOMs?
Assuming that the user still has the original document. |
I'm not sure that's worth the effort. It would take me days to come up with that. :) And I've only seen it in the AIBOM's -- and I honestly think the %2F's in them is a mistake.
They can always download the original and re-ingest, right? |
e9c234d to
f520b0c
Compare
Cargo.toml
Outdated
| #cpe = { git = "https://github.com/ctron/cpe-rs", rev = "c3c05e637f6eff7dd4933c2f56d070ee2ddfb44b" } | ||
| # required due to https://github.com/voteblake/csaf-rs/pull/29 | ||
| csaf = { git = "https://github.com/trustification/csaf-rs", rev = "17620a225744b4a18845d4f7bf63354e01109b91" } | ||
| csaf = { git = "https://github.com/trustification/csaf-rs", branch = "purl-name-encoding-fix" } |
There was a problem hiding this comment.
If we keep patching it, we should move it over to the scm-rs organization.
There was a problem hiding this comment.
Agree, but I'd prefer to adopt https://github.com/csaf-rs/csaf instead.
There was a problem hiding this comment.
I think doing that is out of scope for this PR, though. I'd like to merge this to close TC-3090
fd524b4 to
7981ee5
Compare
a02a4df to
3dd330a
Compare
This fixes guacsec#2146 and therefore indirectly resolves the downstream issue, https://issues.redhat.com/browse/TC-3090 Ultimately, it'd be swell if we could instead depend on something less dead and more supported like https://github.com/csaf-rs/csaf
|
To my understanding the main thing the PR does is to use a newer version of If that's the case, I still thing that we should handle the patching of |
Here you go: scm-rs/csaf-rs#3
Because we already depend on the repo containing the above PR, moving that dep elsewhere is out of scope for this PR. |
dejanb
left a comment
There was a problem hiding this comment.
This looks OK to me ... Let's look next how to properly maintain csaf library or use the alternative
|
/backport |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin release/0.4.z
git worktree add -d .worktree/backport-2156-to-release/0.4.z origin/release/0.4.z
cd .worktree/backport-2156-to-release/0.4.z
git switch --create backport-2156-to-release/0.4.z
git cherry-pick -x 665f1154ad9fbdeea596e16083a4d2f32307058f bb4926ea7ec5863a8e273cdbdfaa6abc517b1d1f |
This fixes #2146 and therefore indirectly resolves the downstream issue, https://issues.redhat.com/browse/TC-3090
It relies on unpublished forks of two interdependent crates: csaf-rs and packageurl.rs
Ultimately, it'd be swell if we could instead depend on something less dead and more supported like https://github.com/csaf-rs/csaf
Summary by Sourcery
Handle percent-encoded slashes in package URLs and align SBOM and advisory translation logic with updated packageurl behavior.
Bug Fixes:
Enhancements:
Build:
Tests: