Add package.json extractor for lockfile-less JS projects #126

Open
jbcibois-ddhq wants to merge 11 commits into main from jb.cibois/add-package-json-support

@jbcibois-ddhq jbcibois-ddhq commented Apr 14, 2026

🚀 Motivation

JavaScript projects that only ship a package.json (no lockfile) currently produce no SBOM output. This adds a dedicated extractor so these projects get dependency coverage.

📚 Documentation

Document | Link or Detail
RFC | N/A
Incident | N/A
Jira Ticket | K9VULN-13485

📝 Summary

New PackageJSONExtractor that parses dependencies, devDependencies, and optionalDependencies from package.json files.

Version handling:

  • Exact pins ("lodash": "4.17.21") → version = "4.17.21"
  • Range specifiers ("lodash": "^4.17.21") → version = "", plus a new datadog:version-range SBOM property holding the raw specifier (e.g. "^4.17.21")
  • Non-version values (file:, latest, URLs) → both fields empty
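
The three cases above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: classifySpec and its range heuristic (first character is a comparator, wildcard, or digit) are assumptions, and the pattern is the exactVersionRegex quoted later in this thread, compiled with the standard regexp package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// exactVersion copies the exactVersionRegex pattern quoted in this thread,
// compiled with the standard regexp package rather than cachedregexp.
var exactVersion = regexp.MustCompile(`^v?\d+(\.\d+)*(-[\w.]+)?(\+[\w.]+)?$`)

// classifySpec is a hypothetical helper returning (version, versionRange).
func classifySpec(spec string) (version, versionRange string) {
	spec = strings.TrimSpace(spec)
	switch {
	case spec == "":
		return "", ""
	case exactVersion.MatchString(spec):
		return spec, "" // exact pin
	case strings.ContainsAny(spec[:1], "^~<>=*0123456789"):
		return "", spec // assumed heuristic: comparator, wildcard, or digit first
	default:
		return "", "" // file:, latest, URLs, and other non-version values
	}
}

func main() {
	for _, spec := range []string{"4.17.21", "^4.17.21", "latest", "file:../local"} {
		version, versionRange := classifySpec(spec)
		fmt.Printf("%q: version=%q range=%q\n", spec, version, versionRange)
	}
}
```

The real extractor may classify edge cases (empty strings, dist-tags that start with a digit) differently.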

npm alias resolution: "react18": "npm:react@18.3.1" emits a package named react at version 18.3.1, consistent with the npm-lock parser.
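
As an illustration of the alias rule, a sketch assuming a hypothetical resolveAlias helper (the names resolvedName and effectiveSpec appear in the diff excerpts further down, but this function body is not taken from the PR). It splits at the last "@" so that scoped targets such as @types/node keep their leading "@".

```go
package main

import (
	"fmt"
	"strings"
)

// resolveAlias is a hypothetical helper. For "npm:<target>@<spec>" it returns
// the aliased package name and its spec; otherwise it returns the inputs as-is.
func resolveAlias(name, spec string) (resolvedName, effectiveSpec string) {
	if !strings.HasPrefix(spec, "npm:") {
		return name, spec
	}
	rest := strings.TrimPrefix(spec, "npm:")
	// Use the last "@" so a scope prefix ("@types/node") is not mistaken
	// for the name/version separator.
	at := strings.LastIndex(rest, "@")
	if at <= 0 {
		return rest, "" // alias without a version part
	}
	return rest[:at], rest[at+1:]
}

func main() {
	name, spec := resolveAlias("react18", "npm:react@18.3.1")
	fmt.Println(name, spec)
}
```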

Deduplication: by resolvedName@version+versionRange, so aliased variants pointing to the same package at different versions are both preserved.

Lockfile suppression (hasLockfileInAncestors): extraction is skipped when a lockfile (package-lock.json, yarn.lock, pnpm-lock.yaml) exists in the same directory or any ancestor up to context.RootDir. This handles both standalone projects and monorepo workspaces. The check is bypassed when lockfile parsers are disabled via --enable-parsers.

node_modules/ filtering: ShouldExtract rejects paths containing node_modules/ to avoid extracting per-package manifests inside installed dependencies.

🧪 Testing

  • New tests were added for new logic.
  • Existing tests were updated for new logic, and not only so that they pass!
  • Benchmark results prove that performance is the same or better.

🚧 Staging validation

  • Deployed and monitored using Datadog dashboards.
  • Proof that it works as expected, including profiling or UX screenshots.

🆘 Recovery

Notes for on-call - select only one:

  • The change can be rolled back.
  • Do not roll back. Why?:

@jbcibois-ddhq jbcibois-ddhq requested a review from a team as a code owner April 14, 2026 14:43
@jbcibois-ddhq jbcibois-ddhq requested a review from anderruiz April 14, 2026 14:43

datadog-prod-us1-5 Bot commented Apr 14, 2026

🎯 Code Coverage (details)
Patch Coverage: 93.24%
Overall Coverage: 85.19% (+0.40%)

🔗 Commit SHA: 549617d


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a53af776fb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated
Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 01d126b26d


Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d01ffd4c0


Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b68e24bb79


Comment thread pkg/lockfile/javascript/parse-package-json.go
@jbcibois-ddhq jbcibois-ddhq force-pushed the jb.cibois/add-package-json-support branch 3 times, most recently from f5807e2 to 61b9465 Compare April 15, 2026 08:25

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61b94650d1


Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated
Comment on lines +211 to +213
if _, exists := packages[dedupKey]; exists {
    continue
}

P2 Badge Merge dep groups when deduplicating same package version

The dedup path drops later occurrences of the same resolvedName@version with continue, so dependencies declared in multiple sections (for example both dependencies and devDependencies) retain only one DepGroups value. This loses section metadata and can misclassify direct dependency context, whereas other JS matching paths in this repo append/merge groups instead of discarding them.


Contributor Author

First occurrence wins. The extractor iterates sections in order (dependencies as prod, devDependencies as dev, optionalDependencies as optional) and deduplicates by name@version. So if lodash@4.17.21 appears in both dependencies and devDependencies, it's reported as prod only. npm itself treats a package listed in both dependencies and devDependencies as a prod dependency.
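
To make the first-occurrence-wins behaviour concrete, here is a small sketch. The dep type and dedupe function are hypothetical; only the key shape (resolvedName + "@" + version + versionRange) is taken from the diff excerpt quoted later in this thread, and the input is assumed to be already ordered prod, dev, optional.

```go
package main

import "fmt"

// dep is a hypothetical, simplified stand-in for lockfile.PackageDetails.
type dep struct {
	Name, Version, VersionRange, Group string
}

// dedupe keeps the first entry for each name@version+versionRange key,
// so the group of the earliest section is the one that survives.
func dedupe(deps []dep) []dep {
	seen := make(map[string]struct{}, len(deps))
	out := make([]dep, 0, len(deps))
	for _, d := range deps {
		key := d.Name + "@" + d.Version + d.VersionRange
		if _, exists := seen[key]; exists {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, d)
	}
	return out
}

func main() {
	deps := []dep{
		{Name: "lodash", Version: "4.17.21", Group: "prod"},
		{Name: "react", Version: "17.0.2", Group: "prod"},
		{Name: "react", Version: "18.3.1", Group: "prod"}, // from alias react18
		{Name: "lodash", Version: "4.17.21", Group: "dev"},
	}
	for _, d := range dedupe(deps) {
		fmt.Printf("%s@%s (%s)\n", d.Name, d.Version, d.Group)
	}
}
```

The two react entries differ in version, so both survive, while the dev copy of lodash is dropped in favour of the prod one.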


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c22f9bdaf


Comment on lines +184 to +185
if len(lockfiles) > 0 && hasLockfileInAncestors(f, context.RootDir, lockfiles) {
    return []lockfile.PackageDetails{}, nil

P2 Badge Respect scan exclusions when suppressing package.json

Extract short-circuits to an empty package list whenever hasLockfileInAncestors finds a lockfile on disk, but that presence check is independent of scanner filtering. In scanDir, lockfiles can be skipped by .gitignore/--exclude before parsing, so this early return can suppress package.json fallback even when no lockfile parser will actually run, yielding missing JavaScript dependencies in those filtered scans.


@jbcibois-ddhq jbcibois-ddhq marked this pull request as draft April 15, 2026 13:03
@jbcibois-ddhq jbcibois-ddhq force-pushed the jb.cibois/add-package-json-support branch 2 times, most recently from d13b1bd to 68c111a Compare May 4, 2026 13:45
@jbcibois-ddhq jbcibois-ddhq marked this pull request as ready for review May 4, 2026 14:09

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d4dfe717f4


Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated
@jbcibois-ddhq jbcibois-ddhq force-pushed the jb.cibois/add-package-json-support branch from 634d9c9 to 5849bec Compare May 4, 2026 14:53

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 667ad6b478


Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated
@jbcibois-ddhq jbcibois-ddhq force-pushed the jb.cibois/add-package-json-support branch from 667ad6b to 540c53f Compare May 4, 2026 15:24
@chatgpt-codex-connector

💡 Codex Review

var exactVersionRegex = cachedregexp.MustCompile(`^v?\d+(\.\d+)*(-[\w.]+)?(\+[\w.]+)?$`)

P2 Badge Treat '=' semver pins as exact versions

Dependencies pinned as "=1.2.3" are currently classified as ranged because exactVersionRegex does not accept an optional leading =. In npm this syntax is an exact pin, so these entries end up with Version="" and only VersionRange populated, which produces versionless PURLs and reduces vulnerability matching accuracy for otherwise fully pinned dependencies.
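
One possible shape for this suggestion, sketched under assumptions: exactPin and exactVersionOf are hypothetical names, the pattern is the quoted exactVersionRegex with an optional leading "=" added, and the standard regexp package stands in for cachedregexp. Whether the PR adopted this approach is not stated in the thread.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// exactPin extends the quoted exactVersionRegex with an optional leading "="
// and captures the version without it.
var exactPin = regexp.MustCompile(`^=?\s*(v?\d+(\.\d+)*(-[\w.]+)?(\+[\w.]+)?)$`)

// exactVersionOf is a hypothetical helper: it returns the pinned version and
// true for exact pins ("1.2.3", "=1.2.3"), or "" and false otherwise.
func exactVersionOf(spec string) (string, bool) {
	m := exactPin.FindStringSubmatch(strings.TrimSpace(spec))
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, spec := range []string{"1.2.3", "=1.2.3", ">=1.2.3", "^1.2.3"} {
		version, ok := exactVersionOf(spec)
		fmt.Printf("%q: version=%q exact=%v\n", spec, version, ok)
	}
}
```

Stripping the "=" before storing the version would keep the resulting PURL in the same form as a plain "1.2.3" pin.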



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6a5bce75e



func (e PackageJSONExtractor) ShouldExtract(path string) bool {
    normalized := filepath.ToSlash(path)
    if strings.Contains(normalized, "/"+nodeModulesPath) || strings.HasPrefix(normalized, nodeModulesPath) {

P2 Badge Match node_modules as a full path segment

ShouldExtract still filters by substring/prefix ("/"+nodeModulesPath and HasPrefix) rather than segment boundaries, so directories like node_modules2 or node_modules_backup are incorrectly treated as node_modules and their package.json files are skipped. This causes false negatives in repos that legitimately use similarly named directories. Fresh evidence after the earlier thread: the updated guard now uses strings.Contains(normalized, "/"+nodeModulesPath), which still matches paths such as path/to/node_modules2/foo/package.json.
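
A segment-based check along the lines this comment asks for could look like the sketch below. inNodeModules is a hypothetical helper, not the PR's ShouldExtract; it compares whole path segments instead of substrings.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// inNodeModules reports whether any full path segment equals "node_modules".
// Directories such as node_modules2 or node_modules_backup do not match.
func inNodeModules(path string) bool {
	for _, segment := range strings.Split(filepath.ToSlash(path), "/") {
		if segment == "node_modules" {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"path/to/node_modules/foo/package.json",
		"path/to/node_modules2/foo/package.json",
	} {
		fmt.Printf("%s skipped=%v\n", p, inNodeModules(p))
	}
}
```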


Contributor

@rjcoulter22 rjcoulter22 left a comment


I think we should gate this under the --manifest-parsers flag like we do for pyproject.toml parsing as well? We would probably want to be consistent IMO 🙂

Comment thread pkg/lockfile/javascript/types.go Outdated

const (
    packageJSONPackageManager = models.NPM
    packageJSONOfficiallySupported = true
Contributor

Contributor Author

done

    return active
}

func hasLockfileInAncestors(f lockfile.DepFile, rootDir string, lockfiles []string) bool {
Contributor


Hm, I don't think we do this for pyproject. Should we make it consistent?

Contributor Author


yes, it is important that lockfile parsers take precedence if we find one

Comment thread pkg/lockfile/javascript/parse-package-json.go Outdated
}

func (e PackageJSONExtractor) Extract(f lockfile.DepFile, context lockfile.ScanContext) ([]lockfile.PackageDetails, error) {
    lockfiles := activeLockfiles(context.EnabledParsers)
Contributor


what is activeLockfiles checking for?

Contributor Author


It filters siblingLockfiles (package-lock.json, yarn.lock, pnpm-lock.yaml) down to only the ones whose parsers are enabled. If all lockfile parsers are disabled (e.g. --enable-parsers package.json only), it returns an empty slice, which tells Extract to skip the lockfile guard entirely and always extract from package.json.
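
The filtering described in this reply might be sketched like this. Everything beyond the siblingLockfiles names is an assumption: the real type of context.EnabledParsers is not shown in this thread, so the sketch uses a map keyed by lockfile name, and it assumes an empty filter means every parser is enabled.

```go
package main

import "fmt"

// siblingLockfiles lists the lockfile names mentioned in the reply above.
var siblingLockfiles = []string{"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}

// filterActiveLockfiles is a hypothetical stand-in for activeLockfiles.
// enabled maps parser names to true when --enable-parsers selects them.
// Assumption: an empty map means no filter was given, so all parsers are on.
func filterActiveLockfiles(enabled map[string]bool) []string {
	if len(enabled) == 0 {
		return siblingLockfiles
	}
	var active []string
	for _, name := range siblingLockfiles {
		if enabled[name] {
			active = append(active, name)
		}
	}
	return active
}

func main() {
	// Only package.json enabled: no lockfile parser is active, so the
	// lockfile guard in Extract would be skipped.
	active := filterActiveLockfiles(map[string]bool{"package.json": true})
	fmt.Println(len(active))
}
```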

    versionRange = effectiveSpec
}

dedupKey := resolvedName + "@" + version + versionRange
Contributor


would we ever have a version and versionRange?

Comment thread pkg/lockfile/javascript/parse-package-json.go
@jbcibois-ddhq
Contributor Author

I think we should gate this under the --manifest-parsers flag like we do for pyproject.toml parsing as well? We would probably want to be consistent IMO 🙂

@rjcoulter22 done

Contributor

@anderruiz anderruiz left a comment


pkg/lockfile/javascript/parse-package-json.go — missing LocationRole

The PackageDetails literal built in Extract does not set LocationRole. Since this extractor reads directly from package.json (a manifest, not a lockfile), every package it emits should carry LocationRole: models.LocationRoleManifest — same as parse-pyproject-toml.go (the other manifest-only extractor introduced in PR #138).

Without it, p.LocationRole is the zero value in vulnerability_result.go, so the role field in the emitted PackageLocation is blank even though the location points to a manifest file.

packages[dedupKey] = lockfile.PackageDetails{
    // ...
    BlockLocation:   blockLocation,
    NameLocation:    nameLocation,
    VersionLocation: versionLocation,
+   LocationRole:    models.LocationRoleManifest,
}

"name": "alias-collision",
"dependencies": {
    "react": "17.0.2",
    "react18": "npm:react@18.3.1"
Contributor

We need to complete it at least with the cases included here


3 participants