Add package.json extractor for lockfile-less JS projects#126
Add package.json extractor for lockfile-less JS projects#126jbcibois-ddhq wants to merge 11 commits into
Conversation
|
🎯 Code Coverage (details) 🔗 Commit SHA: 549617d | Docs | Datadog PR Page | Give us feedback! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a53af776fb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 01d126b26d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d01ffd4c0
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b68e24bb79
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
f5807e2 to
61b9465
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 61b94650d1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if _, exists := packages[dedupKey]; exists { | ||
| continue | ||
| } |
There was a problem hiding this comment.
Merge dep groups when deduplicating same package version
The dedup path drops later occurrences of the same resolvedName@version with continue, so dependencies declared in multiple sections (for example both dependencies and devDependencies) retain only one DepGroups value. This loses section metadata and can misclassify direct dependency context, whereas other JS matching paths in this repo append/merge groups instead of discarding them.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
First occurrence wins. The extractor iterates sections in order: dependencies (prod), devDependencies (dev), optionalDependencies (optional) and deduplicates by name@version. So if lodash@4.17.21 appears in both dependencies and devDependencies, it's reported as prod only. npm itself treats a package listed in both dependencies and devDependencies as a prod dependency
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4c22f9bdaf
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if len(lockfiles) > 0 && hasLockfileInAncestors(f, context.RootDir, lockfiles) { | ||
| return []lockfile.PackageDetails{}, nil |
There was a problem hiding this comment.
Respect scan exclusions when suppressing package.json
Extract short-circuits to an empty package list whenever hasLockfileInAncestors finds a lockfile on disk, but that presence check is independent of scanner filtering. In scanDir, lockfiles can be skipped by .gitignore/--exclude before parsing, so this early return can suppress package.json fallback even when no lockfile parser will actually run, yielding missing JavaScript dependencies in those filtered scans.
Useful? React with 👍 / 👎.
d13b1bd to
68c111a
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d4dfe717f4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
634d9c9 to
5849bec
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 667ad6b478
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
667ad6b to
540c53f
Compare
💡 Codex ReviewDependencies pinned as ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e6a5bce75e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| func (e PackageJSONExtractor) ShouldExtract(path string) bool { | ||
| normalized := filepath.ToSlash(path) | ||
| if strings.Contains(normalized, "/"+nodeModulesPath) || strings.HasPrefix(normalized, nodeModulesPath) { |
There was a problem hiding this comment.
Match node_modules as a full path segment
ShouldExtract still filters by substring/prefix ("/"+nodeModulesPath and HasPrefix) rather than segment boundaries, so directories like node_modules2 or node_modules_backup are incorrectly treated as node_modules and their package.json files are skipped. This causes false negatives in repos that legitimately use similarly named directories. Fresh evidence after the earlier thread: the updated guard now uses strings.Contains(normalized, "/"+nodeModulesPath), which still matches paths such as path/to/node_modules2/foo/package.json.
Useful? React with 👍 / 👎.
rjcoulter22
left a comment
There was a problem hiding this comment.
I think we should gate this under the --manifest-parsers flag like we do for pyproject.toml parsing as well? We would probably want to be consistent IMO 🙂
|
|
||
| const ( | ||
| packageJSONPackageManager = models.NPM | ||
| packageJSONOfficiallySupported = true |
There was a problem hiding this comment.
If this is opt in shgould this be false? https://github.com/DataDog/datadog-sbom-generator/blob/main/pkg/lockfile/python/types.go#L176
| return active | ||
| } | ||
|
|
||
| func hasLockfileInAncestors(f lockfile.DepFile, rootDir string, lockfiles []string) bool { |
There was a problem hiding this comment.
Hm idt we do this for pyproject - should we make it consistent?
There was a problem hiding this comment.
yes, it is important that lockfile parsers take precedence if we find one
| } | ||
|
|
||
| func (e PackageJSONExtractor) Extract(f lockfile.DepFile, context lockfile.ScanContext) ([]lockfile.PackageDetails, error) { | ||
| lockfiles := activeLockfiles(context.EnabledParsers) |
There was a problem hiding this comment.
what is activeLockfiles checking for?
There was a problem hiding this comment.
It filters siblingLockfiles (package-lock.json, yarn.lock, pnpm-lock.yaml) down to only the ones whose parsers are enabled. If all lockfile parsers are disabled (e.g. --enable-parsers package.json only), it returns an empty slice, which tells Extract to skip the lockfile guard entirely and always extract from package.json.
| versionRange = effectiveSpec | ||
| } | ||
|
|
||
| dedupKey := resolvedName + "@" + version + versionRange |
There was a problem hiding this comment.
would we ever have a version and versionRange?
@rjcoulter22 done |
anderruiz
left a comment
There was a problem hiding this comment.
pkg/lockfile/javascript/parse-package-json.go — missing LocationRole
The PackageDetails literal built in Extract does not set LocationRole. Since this extractor reads directly from package.json (a manifest, not a lockfile), every package it emits should carry LocationRole: models.LocationRoleManifest — same as parse-pyproject-toml.go (the other manifest-only extractor introduced in PR #138).
Without it, p.LocationRole is the zero value in vulnerability_result.go, so the role field in the emitted PackageLocation is blank even though the location points to a manifest file.
packages[dedupKey] = lockfile.PackageDetails{
// ...
BlockLocation: blockLocation,
NameLocation: nameLocation,
VersionLocation: versionLocation,
+ LocationRole: models.LocationRoleManifest,
}| "name": "alias-collision", | ||
| "dependencies": { | ||
| "react": "17.0.2", | ||
| "react18": "npm:react@18.3.1" |
There was a problem hiding this comment.
We need to complete it at least with the cases included here
🚀 Motivation
JavaScript projects that only ship a
package.json(no lockfile) currently produce no SBOM output. This adds a dedicated extractor so these projects get dependency coverage.📚 Documentation
📝 Summary
New
PackageJSONExtractorthat parsesdependencies,devDependencies, andoptionalDependenciesfrompackage.jsonfiles.Version handling:
"lodash": "4.17.21") →version = "4.17.21""lodash": "^4.17.21") →version = ""+ newdatadog:version-rangeSBOM property holding the raw specifier (e.g."^4.17.21")file:,latest, URLs) → both fields emptynpm alias resolution:
"react18": "npm:react@18.3.1"emits a package namedreactat version18.3.1, consistent with the npm-lock parser.Deduplication: by
resolvedName@version+versionRange, so aliased variants pointing to the same package at different versions are both preserved.Lockfile suppression (
hasLockfileInAncestors): extraction is skipped when a lockfile (package-lock.json,yarn.lock,pnpm-lock.yaml) exists in the same directory or any ancestor up tocontext.RootDir. This handles both standalone projects and monorepo workspaces. The check is bypassed when lockfile parsers are disabled via--enable-parsers.node_modules/filtering:ShouldExtractrejects paths containingnode_modules/to avoid extracting per-package manifests inside installed dependencies.🧪 Testing
🚧 Staging validation
🆘 Recovery
Notes for on-call - select only one: