fix: preserve real prefix when middle ID segment is a reserved word by JOhnsonKC201 · Pull Request #513 · Egonex-AI/Understand-Anything

JOhnsonKC201 · 2026-06-25T00:57:11Z

Problem

stripToValidPrefix in analyzer/normalize-graph.ts collapses any node ID whose second segment happens to be a valid prefix, treating it as a double-prefix duplicate. This corrupts IDs where a reserved word legitimately appears as a middle path segment.

For example, endpoint:service:getUser is parsed as:

outer segment endpoint (valid prefix) ✓
next segment service (also a valid prefix) → wrongly assumed to be a duplicate prefix

…so the real endpoint prefix is dropped and the function returns { prefix: "service", path: "getUser" }, yielding service:getUser. This:

changes the node type (endpoint → service),
breaks edge references that point at the original ID, and
violates idempotency — normalizing an already-normalized ID mutates it.

Fix

Only collapse a true same-prefix duplicate (e.g. file:file:src/foo.ts) by requiring the inner segment to equal the outer prefix:

// before
if (innerColonIdx > 0 && VALID_PREFIXES.has(rest.slice(0, innerColonIdx))) {

// after
if (innerColonIdx > 0 && rest.slice(0, innerColonIdx) === segment) {

A different reserved word in the middle is a legitimate path segment and is preserved. The genuine file:file:... double-prefix case still collapses as before.
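
The same-prefix rule can be sketched as follows. This is a minimal illustration, not the code in normalize-graph.ts: the name stripToValidPrefixSketch, the three-entry prefix set, and the handling of unknown leading segments are assumptions made for the example.

```typescript
// Hypothetical subset of reserved prefixes; the real VALID_PREFIXES set lives in normalize-graph.ts.
const VALID_PREFIXES_SKETCH = new Set(["file", "service", "endpoint"]);

interface PrefixedId {
  prefix: string;
  path: string;
}

// Sketch only: collapse a leading segment only when the next segment repeats it.
function stripToValidPrefixSketch(id: string): PrefixedId | null {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) return null; // no usable prefix found
    const segment = remaining.slice(0, colonIdx);
    const rest = remaining.slice(colonIdx + 1);
    if (!VALID_PREFIXES_SKETCH.has(segment)) {
      // Assumption: an unknown leading segment (e.g. a project name) is skipped.
      remaining = rest;
      continue;
    }
    const innerColonIdx = rest.indexOf(":");
    if (innerColonIdx > 0 && rest.slice(0, innerColonIdx) === segment) {
      // True duplicate such as file:file:...; drop the outer copy and retry.
      remaining = rest;
      continue;
    }
    // A different reserved word after the prefix is treated as part of the path.
    return { prefix: segment, path: rest };
  }
}
```

Under these assumptions, endpoint:service:getUser keeps its endpoint prefix with service:getUser as the path, while file:file:src/foo.ts still collapses to prefix file and path src/foo.ts.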

Tests

Added two regression tests to normalize-graph.test.ts:

endpoint:service:getUser is preserved unchanged (was previously corrupted to service:getUser).
normalization is idempotent for IDs with a reserved-word middle segment.

pnpm --filter @understand-anything/core test — 755 passing (including the existing file:file: double-prefix test, which still passes).

stripToValidPrefix collapsed any ID whose second segment was a valid prefix, treating e.g. "endpoint:service:x" as a double-prefix and returning "service:x". This dropped the real outer prefix, corrupting the node type and breaking edge references and idempotency. Only collapse a true same-prefix duplicate (e.g. "file:file:...") by requiring the inner segment to equal the outer prefix. A different reserved word in the middle is a legitimate path segment and is kept. Adds regression tests covering the middle-reserved-word case and idempotency.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bfe10eed8d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-25T01:01:18Z

      const rest = remaining.slice(colonIdx + 1);
      const innerColonIdx = rest.indexOf(":");
-      if (innerColonIdx > 0 && VALID_PREFIXES.has(rest.slice(0, innerColonIdx))) {
+      if (innerColonIdx > 0 && rest.slice(0, innerColonIdx) === segment) {


Use expected type when collapsing prefixed IDs

When an LLM emits a project-prefixed ID and the project name is also a reserved prefix (for example, a file node with the ID service:file:src/foo.ts, or any ID where a spurious valid prefix precedes the expected one), this condition no longer recurses because the inner prefix differs from the outer one. normalizeNodeId then returns service:file:src/foo.ts for a node whose type is file, so the graph no longer uses the canonical type:path ID, and edges that reference the canonical file:src/foo.ts form are dropped as dangling. To distinguish this case from legitimate middle path segments like endpoint:service:x, the duplicate-prefix decision needs the node's expected prefix.

The previous fix collapsed a duplicate prefix only when the inner segment matched the outer one (`inner === segment`). That dropped a spurious project-name prefix only when it was an exact duplicate, so an ID like `service:file:src/foo.ts` for a `file` node — where the project name happens to collide with the reserved word `service` — kept the outer `service:` prefix instead of resolving to the canonical `file:src/foo.ts`. Edges referencing `file:src/foo.ts` then dangled. Thread the node's expected prefix into stripToValidPrefix and collapse when the inner segment is either the same reserved word (true duplicate) or the expected prefix (spurious reserved-word project prefix). A different reserved word that is not the expected prefix (`endpoint:service:x`) is still preserved as a real path segment. Adds two regression tests covering the strip + idempotency cases.

JOhnsonKC201 · 2026-06-28T03:31:25Z

Confirmed the codex P2 and pushed a follow-up. The earlier inner === segment check kept a spurious reserved-word project prefix: for example, service:file:src/foo.ts for a file node stayed as-is instead of resolving to the canonical file:src/foo.ts, which would leave edges referencing the canonical ID dangling.

Fix (6d54111): stripToValidPrefix now takes the node's expected prefix and collapses the outer segment when the inner one is either the same reserved word (true duplicate) or the expected prefix (spurious reserved-word project prefix). A different reserved word that isn't the expected prefix (endpoint:service:x) is still preserved as a real path segment. Added two regression tests (strip + idempotency); 757/757 core tests pass.
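
A rough sketch of the expected-prefix rule described above. The function name collapseReservedPrefix, its string-in/string-out shape, and the small reserved set are assumptions for illustration; the actual stripToValidPrefix signature may differ.

```typescript
// Illustrative reserved words only; the real set is defined in normalize-graph.ts.
const RESERVED_PREFIXES = new Set(["file", "service", "endpoint"]);

// Sketch only: drop the outer reserved segment when the inner segment is either
// the same reserved word (true duplicate) or the node's expected prefix.
function collapseReservedPrefix(id: string, expectedPrefix: string): string {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) return remaining;
    const segment = remaining.slice(0, colonIdx);
    if (!RESERVED_PREFIXES.has(segment)) return remaining;
    const rest = remaining.slice(colonIdx + 1);
    const innerColonIdx = rest.indexOf(":");
    const innerSegment = innerColonIdx > 0 ? rest.slice(0, innerColonIdx) : "";
    if (
      innerColonIdx > 0 &&
      (innerSegment === segment || innerSegment === expectedPrefix)
    ) {
      remaining = rest; // outer segment was spurious; retry on the remainder
      continue;
    }
    return remaining; // inner segment is a legitimate path segment
  }
}
```

With these assumptions, service:file:src/foo.ts resolves to file:src/foo.ts for a file node, endpoint:service:x is left alone for an endpoint node, and running the function on its own output returns the same ID.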

@codex review

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 6d54111a8b

chatgpt-codex-connector · 2026-06-28T03:35:57Z

+      const innerSegment = innerColonIdx > 0 ? rest.slice(0, innerColonIdx) : "";
+      if (
+        innerColonIdx > 0 &&
+        (innerSegment === segment || innerSegment === expectedPrefix)


Handle reserved-word project prefixes in edge fallback

When normalizeBatchOutput falls back to normalizing an edge endpoint that is not present in idMap, it infers the type from the malformed endpoint itself. For a project named like a reserved prefix, e.g. an edge endpoint service:file:src/foo.ts pointing to an existing canonical node file:src/foo.ts, inferTypeFromId supplies service, so this check does not strip the outer service segment and the edge remains dangling and is dropped. This regresses the cross-variant edge repair path for the same reserved-word project-prefix case the node normalization now handles.

The node-level fix did not cover normalizeBatchOutput's edge-repair fallback. When an edge endpoint isn't found in idMap, the fallback infers its type from the endpoint's own prefix via inferTypeFromId. For a reserved-word project prefix (e.g. edge `service:file:src/foo.ts` pointing at node `file:src/foo.ts`), that yields type `service`, so normalizeNodeId keeps the spurious `service:` segment, the endpoint never matches the canonical node, and the edge is dropped as dangling. Add resolveEdgeEndpoint: try the prefix-inferred type first (unchanged common case), then each subsequent leading reserved-word segment as a candidate type, returning the first that resolves to an existing node — and the original id when none do. Add a regression test asserting the edge is repaired rather than dropped. 758/758 core tests pass.

JOhnsonKC201 · 2026-06-28T20:19:45Z

Addressed the edge-path P2 from the last review. The node-level fix didn't cover normalizeBatchOutput's edge-repair fallback: for an endpoint like service:file:src/foo.ts pointing at canonical node file:src/foo.ts, inferTypeFromId read the spurious service as the type, so the strip never fired and the edge was dropped as dangling.

Fix (0a0ffd2): added resolveEdgeEndpoint — tries the prefix-inferred type first (common case unchanged), then each subsequent leading reserved-word segment as a candidate type, returning the first that resolves to an existing node (and the original id when none do). Added a regression test asserting the edge is repaired rather than dropped; 758/758 core tests pass.
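
The candidate-type fallback might look roughly like the sketch below. Everything here is hypothetical scaffolding: normalizeForType is a stand-in for normalizeNodeId(id, { type }), existingIds stands in for the lookup against known nodes, and the reserved set is a three-entry sample.

```typescript
// Illustrative reserved words only; the real set is defined in normalize-graph.ts.
const RESERVED_WORDS = new Set(["file", "service", "endpoint"]);

// Stand-in for normalizeNodeId(id, { type }): drops a leading reserved segment
// when the next segment repeats it or matches the expected type.
function normalizeForType(id: string, type: string): string {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) return remaining;
    const segment = remaining.slice(0, colonIdx);
    const rest = remaining.slice(colonIdx + 1);
    const innerColonIdx = rest.indexOf(":");
    const inner = innerColonIdx > 0 ? rest.slice(0, innerColonIdx) : "";
    const collapsible =
      RESERVED_WORDS.has(segment) && inner !== "" && (inner === segment || inner === type);
    if (!collapsible) return remaining;
    remaining = rest;
  }
}

// Sketch only: try the prefix-inferred type first, then each later leading
// reserved-word segment, and return the first result that names an existing node.
function resolveEdgeEndpointSketch(id: string, existingIds: Set<string>): string {
  const head = id.split(":").slice(0, -1); // the last segment is always path
  const stop = head.findIndex((s) => !RESERVED_WORDS.has(s));
  const leading = stop === -1 ? head : head.slice(0, stop);
  const candidateTypes = leading.filter((t, i) => leading.indexOf(t) === i);
  for (const type of candidateTypes) {
    const normalized = normalizeForType(id, type);
    if (existingIds.has(normalized)) return normalized;
  }
  return id; // nothing resolved; keep the original endpoint
}
```

In this sketch the first candidate is the endpoint's own leading segment, so an endpoint that already resolves under its inferred type never reaches the later candidates.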

@codex review

chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 0a0ffd233c

chatgpt-codex-connector · 2026-06-28T20:27:17Z

+  }
+
+  for (const type of candidateTypes) {
+    const normalized = normalizeNodeId(id, { type });


Normalize from the candidate prefix segment

When an edge endpoint has more than one reserved prefix before the real node prefix, e.g. service:endpoint:file:src/foo.ts pointing at an existing file:src/foo.ts, candidateTypes includes file but this loop still calls normalizeNodeId on the full original ID. For the file candidate, stripToValidPrefix sees service followed by endpoint and preserves service:endpoint:..., so none of the later candidates can resolve and the edge is dropped as dangling; the previous fallback would collapse this chain. The repair needs to normalize from the candidate segment (or otherwise skip earlier reserved segments) rather than always from id.

resolveEdgeEndpoint tried every candidate type against the full edge endpoint id. For a chain of reserved prefixes before the real one (e.g. `service:endpoint:file:src/foo.ts` pointing at node `file:src/foo.ts`), stripToValidPrefix can't collapse the run for the `file` candidate — `service` then `endpoint` are preserved — so no candidate resolves and the edge is dropped as dangling. Pair each candidate type with the id suffix that begins at its prefix segment, and normalize from that suffix instead of the full id. The inferred-type/full-id attempt is kept first so the common and single reserved-word prefix cases are unchanged. Add a regression test for the multi-prefix chain. 759/759 core tests pass.

JOhnsonKC201 · 2026-06-30T18:06:14Z

Addressed the third codex P2 ("Normalize from the candidate prefix segment"). The edge-repair fallback normalized every candidate type against the full endpoint id, so a chain of reserved prefixes before the real one — e.g. service:endpoint:file:src/foo.ts pointing at file:src/foo.ts — couldn't be collapsed for the file candidate and the edge was dropped as dangling.

resolveEdgeEndpoint now pairs each candidate type with the id suffix that begins at its prefix segment and normalizes from that suffix. The inferred-type/full-id attempt is still tried first, so the common case and the earlier single reserved-word prefix case are unchanged. Added a regression test for the multi-prefix chain; 759/759 core tests pass.
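
Pairing each candidate type with the suffix that starts at its own segment could be sketched like this. As before, the names collapseFor and resolveFromSuffixSketch, the existingIds lookup, and the sample reserved set are assumptions for illustration rather than the PR's actual code.

```typescript
// Illustrative reserved words only; the real set is defined in normalize-graph.ts.
const RESERVED_SEGMENTS = new Set(["file", "service", "endpoint"]);

// Stand-in for normalizeNodeId(id, { type }), using the expected-prefix rule.
function collapseFor(id: string, type: string): string {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) return remaining;
    const segment = remaining.slice(0, colonIdx);
    const rest = remaining.slice(colonIdx + 1);
    const innerColonIdx = rest.indexOf(":");
    const inner = innerColonIdx > 0 ? rest.slice(0, innerColonIdx) : "";
    const collapsible =
      RESERVED_SEGMENTS.has(segment) && inner !== "" && (inner === segment || inner === type);
    if (!collapsible) return remaining;
    remaining = rest;
  }
}

// Sketch only: candidate i is normalized from the suffix beginning at segment i,
// so earlier reserved segments can no longer block the match.
function resolveFromSuffixSketch(id: string, existingIds: Set<string>): string {
  const segments = id.split(":");
  const head = segments.slice(0, -1); // the last segment is always path
  const stop = head.findIndex((s) => !RESERVED_SEGMENTS.has(s));
  const leading = stop === -1 ? head : head.slice(0, stop);
  const candidates = leading.map((type, i) => ({
    type,
    suffix: segments.slice(i).join(":"),
  }));
  // The first candidate is the inferred type with the full id, as in the common case.
  for (const { type, suffix } of candidates) {
    const normalized = collapseFor(suffix, type);
    if (existingIds.has(normalized)) return normalized;
  }
  return id; // nothing resolved; keep the original endpoint
}
```

For service:endpoint:file:src/foo.ts, the third candidate is type file with suffix file:src/foo.ts, which matches the existing node directly.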

JOhnsonKC201 wants to merge 4 commits into Egonex-AI:main from JOhnsonKC201:fix/strip-valid-prefix-middle-reserved-word
