Conversation
🦋 Changeset detected. Latest commit: f26d76d. The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages.
TL;DR — Adds targeted diagnostic logging across Doltgres ref-scoping middleware and trigger execution paths to surface slow database operations (pool connect, branch checkout, ref resolution) and improve observability for production latency investigations.

Summary: 7 files, 1 commit — Slow-operation logging in Doltgres ref-scoping

Key changes:
- Slow-operation thresholds: the 5-second threshold keeps logs quiet under normal conditions while surfacing Doltgres connection pool exhaustion or slow branch operations in production.
- Ref resolution diagnostics: particularly useful for diagnosing tenant-scoped ref mismatches where the expected branch doesn't appear in the dolt_branches result.
- Trigger execution lifecycle tracing.
This PR adds diagnostic logging and slow-operation timing across the Dolt ref resolution and branch-scoping code paths. The intent — finding where latency hides — is good. A few issues to address before merging, mostly around log levels, duplicate logging, and a data leak in one log statement.
Claude Opus
```ts
    branchQueryMs,
    totalMs,
    branchNames: branches.map((b) => b.name),
  },
```
High: branchNames leaks all tenant branch names into logs. This array contains tenant-scoped branch names for every project. In a production environment with many tenants sharing a Doltgres instance, this could produce enormous log entries and expose tenant data in log aggregators.
Drop branchNames entirely — branchCount is sufficient to diagnose the "branch not found" scenario.
Suggested change:
```diff
   branchCount: branches.length,
   branchQueryMs,
   totalMs,
-  branchNames: branches.map((b) => b.name),
   },
```
```ts
if (!branch) {
  const totalMs = Date.now() - startTime;
  logger.info(
    {
      ref,
      branchCount: branches.length,
      branchQueryMs,
      totalMs,
      branchNames: branches.map((b) => b.name),
    },
    'resolveRef: branch not found in dolt_branches result'
  );
}
```
Medium: Log level should be warn, not info. A branch not being found during resolveRef is an unexpected/concerning condition — callers generally expect the branch to exist. logger.warn is the appropriate level per the repo's logging guidelines ("recoverable issues or concerning patterns"). Also, this fires unconditionally (no slow-path threshold), so at info it will add noise for every cache miss or missing-branch lookup.
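A minimal sketch of the suggested shape, combining this comment's level change with the branchNames removal above. Variable names follow the quoted diff; the pino-style logger signature and the `reportBranchNotFound` helper are assumptions for illustration.

```typescript
interface Branch {
  name: string;
}

interface Logger {
  warn(fields: Record<string, unknown>, msg: string): void;
}

function reportBranchNotFound(
  logger: Logger,
  ref: string,
  branches: Branch[],
  branchQueryMs: number,
  startTime: number
): void {
  const totalMs = Date.now() - startTime;
  // warn, not info: callers generally expect the branch to exist.
  // branchCount (not branchNames) avoids leaking tenant-scoped branch names.
  logger.warn(
    { ref, branchCount: branches.length, branchQueryMs, totalMs },
    'resolveRef: branch not found in dolt_branches result'
  );
}
```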
```ts
logger.info(
  { projectMain, error: error instanceof Error ? error.message : String(error) },
  'resolveRef threw in ref middleware'
);
logger.warn({ error, projectMain }, 'Failed to resolve project main branch');
```
Medium: Duplicate log — this logger.info and the existing logger.warn on line 246 log the same error. The warn already captures error and projectMain. Remove this new info call to avoid double-logging the same failure.
Suggested change:
```diff
-logger.info(
-  { projectMain, error: error instanceof Error ? error.message : String(error) },
-  'resolveRef threw in ref middleware'
-);
 logger.warn({ error, projectMain }, 'Failed to resolve project main branch');
```
```ts
logger.info(
  {
    ref: resolvedRef.name,
    connectionId,
    cleanupMs: Date.now() - cleanupStart,
    error: cleanupError instanceof Error ? cleanupError.message : String(cleanupError),
  },
  'withRef cleanup failed'
);
```
Medium: Duplicate log — same cleanup failure is logged twice. This new logger.info and the existing logger.error on lines 272–275 report the same cleanupError. Remove one. The error level is the correct one for a cleanup failure; the info call should be dropped.
Suggested change:
```diff
-logger.info(
-  {
-    ref: resolvedRef.name,
-    connectionId,
-    cleanupMs: Date.now() - cleanupStart,
-    error: cleanupError instanceof Error ? cleanupError.message : String(cleanupError),
-  },
-  'withRef cleanup failed'
-);
+logger.error(
+  { error: cleanupError, tempBranch, connectionId, cleanupMs: Date.now() - cleanupStart },
+  'Error during ref scope cleanup'
+);
```
```ts
logger.info(
  {
    ref: resolvedRef.name,
    cleanupMs: Date.now() - cleanupStart,
    error: cleanupError instanceof Error ? cleanupError.message : String(cleanupError),
  },
  'branchScopedDb cleanup failed'
);
```
Medium: Same duplicate-log pattern as ref-scope.ts. The new logger.info for cleanup failure is immediately followed by the existing logger.error for the same error. Consolidate into the single error-level call and add the timing fields there.
Suggested change:
```diff
-logger.info(
-  {
-    ref: resolvedRef.name,
-    cleanupMs: Date.now() - cleanupStart,
-    error: cleanupError instanceof Error ? cleanupError.message : String(cleanupError),
-  },
-  'branchScopedDb cleanup failed'
-);
+logger.error(
+  {
+    error: cleanupError,
+    ref: resolvedRef.name,
+    cleanupMs: Date.now() - cleanupStart,
+  },
+  'Error during connection cleanup'
+);
```
```ts
logger.info(
  {
    evaluatorId: evaluator.id,
    conversationId: conversation.id,
    hasExpectedOutput: expectedOutput != null,
    expectedOutputLength: expectedOutputText?.length ?? 0,
  },
  'Building evaluation prompt'
);
```
Low: This is a routine operation log at info level. Per the logging guidelines, routine pre-step breadcrumbs ("Building evaluation prompt") belong at debug. None of the structured fields here convey business-event or performance-alert information. Consider logger.debug or removing entirely — the downstream LLM call log is the meaningful signal.
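A sketch of the suggested downgrade: the same structured fields, emitted at debug so routine prompt-building breadcrumbs don't inflate info-level volume. Field names follow the quoted diff; the logger shape and the `logPromptBuild` helper are assumptions for illustration.

```typescript
interface Logger {
  debug(fields: Record<string, unknown>, msg: string): void;
}

function logPromptBuild(
  logger: Logger,
  evaluatorId: string,
  conversationId: string,
  expectedOutputText: string | undefined
): void {
  // Routine pre-step breadcrumb: debug per the logging guidelines,
  // since the downstream LLM call log carries the meaningful signal.
  logger.debug(
    {
      evaluatorId,
      conversationId,
      hasExpectedOutput: expectedOutputText != null,
      expectedOutputLength: expectedOutputText?.length ?? 0,
    },
    'Building evaluation prompt'
  );
}
```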
```ts
logger.info(
  { tenantId, projectId, agentId, triggerId, invocationId, conversationId, dispatchedAt },
  'Trigger execution dispatched and starting execution'
);
```
Low: Duplicate of the log 5 lines above. Line 541–543 already logs 'Trigger invocation created' with the same structured fields (tenantId, projectId, agentId, triggerId, invocationId, conversationId). This second log at the same level with nearly the same payload adds noise without new information. Remove it or merge the two.
```ts
logger.info(
  { tenantId, projectId, agentId, triggerId, invocationId },
  'executeAgentAsync: starting'
```
Low: Redundant entry-point log. executeAgentAsync already has a log at line 891 ('Starting async trigger execution') with the same fields plus conversationId. This extra log at function entry doesn't add diagnostic value — if you want it, make it debug.
TL;DR — Adds diagnostic logging and timing instrumentation across the Doltgres branch-scoping and ref-resolution hot path to help identify slow database operations in production. Slow operations (>5 s) are logged.

Summary: 7 files, 4 commits — Slow-operation instrumentation for Doltgres ref scoping

Key changes:
- The same pattern is applied symmetrically in ref-scope.ts and branchScopedDb.ts.
- Ref resolution and branch-not-found diagnostics.
- Trigger execution and agent lifecycle logging.
PR Review Summary
(6) Total Issues | Risk: High
🔴❗ Critical (1) ❗🔴
Inline Comments:
- 🔴 Critical (ref-helpers.ts:56): Logging all branch names exposes tenant/project identifiers in logs

🟠⚠️ Major (2) 🟠⚠️
Inline Comments:
- 🟠 Major (ref-helpers.ts:53): Unconditional INFO log on expected code path creates noise
- 🟠 Major (ref.ts:241-248): Duplicate logging — error logged at both INFO and WARN levels

🟡 Minor (2) 🟡
Inline Comments:
- 🟡 Minor (ref-scope.ts:265-273): Duplicate logging — cleanup failure logged at INFO and ERROR
- 🟡 Minor (branchScopedDb.ts:165-170): Same duplicate logging pattern

💭 Consider (1) 💭
Inline Comments:
- 💭 Consider (EvaluationService.ts:157-165): Downgrade to DEBUG or remove — routine operation
✅ What's Good
The PR demonstrates good observability practices:
- Consistent 5s threshold for slow operation detection across all files
- Consistent naming for timing variables (xxxMs pattern)
- Structured logging with relevant context (ref names, connection IDs, durations)
- Strategic instrumentation points covering pool.connect, branch checkout, and cleanup operations
📋 Changeset Required
Per AGENTS.md, this PR modifies agents-api and agents-core packages with runtime behavior changes (new logging). A changeset is needed:

```shell
pnpm bump patch --pkg agents-api --pkg agents-core "Add slow operation logging for Doltgres database operations"
```

🚫 REQUEST CHANGES
Summary: The observability logging approach is solid, but the critical issue of logging all branch names (which contain tenant/project IDs) needs to be fixed before merge. This creates a data isolation concern where one tenant's identifiers could appear in logs triggered by another tenant's request. Additionally, several duplicate logging patterns (INFO + WARN/ERROR for same event) should be consolidated to follow the established api-logging-guidelines.
Discarded (3)

| Location | Issue | Reason Discarded |
|---|---|---|
| TriggerService.ts:549 | Duplicate dispatch log near executeAgentAsync:starting | Valid observation but low severity — the logs serve slightly different purposes (dispatch vs execution start) and include different context fields |
| multi-file | Duplicate slow-operation patterns in ref-scope.ts and branchScopedDb.ts | Not a bug — both serve different purposes (middleware vs utility). Shared abstraction would be nice-to-have but not required for this PR |
| TriggerService.ts:549-915 | Inconsistent context fields across logs | Valid observation but these logs serve different phases of execution where different context is relevant |
Reviewers (3)

| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| pr-review-sre | 6 | 0 | 0 | 0 | 4 | 0 | 2 |
| pr-review-consistency | 6 | 0 | 0 | 0 | 2 | 0 | 1 |
| pr-review-standards | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 13 | 0 | 0 | 0 | 6 | 0 | 3 |
Note: Findings from pr-review-standards overlapped with pr-review-sre on the branchNames issue.
```ts
if (!branch) {
  const totalMs = Date.now() - startTime;
  logger.info(
    {
```
🔴 CRITICAL: Logging all branch names exposes tenant/project identifiers
Issue: Branch names follow the pattern {tenantId}_{projectId}_{branchName}. Logging ALL branch names via branchNames: branches.map((b) => b.name) leaks every tenant and project ID in the database whenever any branch resolution fails.
Why: This creates:
- A compliance/data-isolation issue where one tenant's IDs appear in logs caused by another tenant's request
- Unbounded log line size as branch count grows (could be hundreds or thousands)
- Log ingestion costs scaling with total branch count per failed lookup
Fix: Remove branchNames from the log payload:
Suggested change:
```ts
logger.info(
  { ref, branchCount: branches.length, branchQueryMs, totalMs },
  'resolveRef: branch not found in dolt_branches result'
);
```
```ts
const branchQueryMs = Date.now() - branchQueryStart;
const branch = branches.find((b) => b.name === ref);

if (!branch) {
```
🟠 MAJOR: Unconditional INFO log on expected code path creates noise
Issue: resolveRef is called multiple times during fallback resolution (e.g., try project-scoped, then raw ref). A not-found result is often expected, not exceptional. Logging INFO on every miss inflates log volume.
Why: Per api-logging-guidelines, INFO is for "important business events, NOT routine operations". Branch-not-found is a normal return path — the caller handles null. This will fire on every request that uses a non-existent or fallback ref.
Fix: Downgrade to DEBUG level, or let the caller log at INFO when resolution ultimately fails:
Suggested change:
```ts
if (!branch) {
  const totalMs = Date.now() - startTime;
  logger.debug(
    { ref, branchCount: branches.length, branchQueryMs, totalMs },
    'resolveRef: branch not found in dolt_branches result'
  );
}
```
```ts
} catch (error) {
  logger.info(
    { projectMain, error: error instanceof Error ? error.message : String(error) },
    'resolveRef threw in ref middleware'
  );
  logger.warn({ error, projectMain }, 'Failed to resolve project main branch');
  refResult = null;
}
```
🟠 MAJOR: Duplicate logging — error logged at both INFO and WARN levels
Issue: Lines 242-245 log at INFO level, then line 246 logs the same error at WARN level. This creates redundant log entries for the same event.
Why: The established codebase pattern (per api-logging-guidelines) is to use WARN for recoverable issues, not to double-log at multiple levels. This adds noise and makes log analysis harder.
Fix: Remove the INFO log and keep only the WARN log:
Suggested change:
```diff
 } catch (error) {
-  logger.info(
-    { projectMain, error: error instanceof Error ? error.message : String(error) },
-    'resolveRef threw in ref middleware'
-  );
   logger.warn({ error, projectMain }, 'Failed to resolve project main branch');
   refResult = null;
 }
```
```ts
    ref: resolvedRef.name,
    connectionId,
    cleanupMs: Date.now() - cleanupStart,
    error: cleanupError instanceof Error ? cleanupError.message : String(cleanupError),
  },
  'withRef cleanup failed'
);
logger.error(
  { error: cleanupError, tempBranch, connectionId },
```
🟡 Minor: Duplicate logging — cleanup failure logged at both INFO and ERROR levels
Issue: Lines 266-269 log at INFO level, then lines 270-273 log the same error at ERROR level. The same pattern appears in branchScopedDb.ts.
Why: Cleanup failures are unexpected errors warranting ERROR level, not INFO. Double-logging the same event at different levels creates noise.
Fix: Keep only the ERROR log:
Suggested change:
```ts
} catch (cleanupError) {
  logger.error(
    { error: cleanupError, tempBranch, connectionId, cleanupMs: Date.now() - cleanupStart },
    'Error during ref scope cleanup'
  );
```
```ts
  },
  'branchScopedDb cleanup failed'
);
logger.error({ error: cleanupError }, 'Error during connection cleanup');
} finally {
  const totalMs = Date.now() - mwStartTime;
```
🟡 Minor: Same duplicate logging pattern as ref-scope.ts
Issue: Lines 166-169 log at INFO, then line 170 logs the same error at ERROR level.
Fix: Keep only the ERROR log, include cleanupMs in it:
Suggested change:
```ts
} catch (cleanupError) {
  logger.error(
    { error: cleanupError, ref: resolvedRef.name, cleanupMs: Date.now() - cleanupStart },
    'Error during connection cleanup'
  );
```
```ts
logger.info(
  {
    evaluatorId: evaluator.id,
    conversationId: conversation.id,
    hasExpectedOutput: expectedOutput != null,
    expectedOutputLength: expectedOutputText?.length ?? 0,
  },
  'Building evaluation prompt'
);
```
💭 Consider: Downgrade to DEBUG or remove — this is a routine operation
Issue: This INFO log fires on every evaluation execution, not gated by any threshold. Per api-logging-guidelines, INFO should be for "important events, NOT routine operations".
Why: The other new logs in this PR correctly use threshold checks (>5_000ms), but this one does not. Building evaluation prompts is a routine operation already covered by adjacent logs at lines 146-152 (schema info) and 111-117 (trace completion).
Fix: Consider DEBUG level if this detail is needed for troubleshooting, or remove if redundant with existing logs.
PR Review Summary
(5) Total Issues | Risk: High

Re-review scope: This review covers only changes since the last automated review. The delta (c710dae06) removed 10 lines of unnecessary logging from EvaluationService.ts, addressing the "Consider" item from the prior review. No new code was introduced.
✅ Addressed Since Last Review
The following issue was addressed by the latest commit:
- 💭 (EvaluationService.ts:157-165) Unnecessary INFO log removed ✅
🕐 Pending Recommendations (5)
The following issues from the prior review remain unresolved:
- 🔴 (ref-helpers.ts:56) Logging branchNames exposes all tenant/project identifiers in logs — data isolation issue
- 🟠 (ref-helpers.ts:53) Unconditional INFO log on expected code path creates noise
- 🟠 (ref.ts:241-248) Duplicate logging — error logged at both INFO and WARN levels
- 🟡 (ref-scope.ts:265-273) Duplicate logging — cleanup failure logged at INFO and ERROR
- 🟡 (branchScopedDb.ts:165-170) Same duplicate logging pattern as ref-scope.ts
📋 Changeset Still Required
Per AGENTS.md, this PR modifies agents-api and agents-core packages with runtime behavior changes. A changeset is needed:

```shell
pnpm bump patch --pkg agents-api --pkg agents-core "Add slow operation logging for Doltgres database operations"
```

🚫 REQUEST CHANGES
Summary: Good progress — the unnecessary EvaluationService logging has been removed. However, the Critical issue (logging all branch names which contain tenant/project IDs) and Major issues (duplicate logging patterns, unconditional INFO logging) from the prior review remain unaddressed. Please fix the branchNames data leak in ref-helpers.ts:61 before merge — this is a data isolation concern where one tenant's identifiers could appear in logs triggered by another tenant's request.
Reviewers (0)

| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| N/A | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: Delta review — no new code to analyze, so no reviewers dispatched. Prior review findings carried forward as Pending Recommendations.
PR Review Summary
(6) Total Issues | Risk: High

Re-review scope: This review covers only changes since the last automated review. The delta (f26d76df6) added a changeset file. All prior issues remain unaddressed.
🟡 Minor (1) 🟡
Inline Comments:
- 🟡 Minor (.changeset/itchy-cups-join.md:6): Changeset message has typo ("for for") and trailing quote
🕐 Pending Recommendations (5)
The following issues from the prior review remain unresolved:
- 🔴 (ref-helpers.ts:61) Logging branchNames exposes all tenant/project identifiers in logs — data isolation issue
- 🟠 (ref-helpers.ts:53-55) Unconditional INFO log on expected code path creates noise
- 🟠 (ref.ts:242-246) Duplicate logging — error logged at both INFO and WARN levels
- 🟡 (ref-scope.ts:263-275) Duplicate logging — cleanup failure logged at INFO and ERROR
- 🟡 (branchScopedDb.ts:160-168) Same duplicate logging pattern as ref-scope.ts
🚫 REQUEST CHANGES
Summary: A changeset was added (good!), but it has a minor typo. More importantly, the 5 issues from the prior review remain unaddressed. The critical issue is logging all branch names (branchNames: branches.map((b) => b.name)) in ref-helpers.ts:61, which leaks tenant/project identifiers from the entire database into logs whenever any branch resolution fails. This is a data isolation concern that must be fixed before merge.
Discarded (0)
None — delta contained only the changeset file.
Reviewers (0)

| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| N/A | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Note: Delta review — only a changeset file was added, so no code reviewers dispatched. Prior review findings carried forward as Pending Recommendations.
```
"@inkeep/agents-api": patch
---

logging for for Doltgres database operations"
```
🟡 Minor: Changeset message has typo and formatting issue
Issue: The message contains a duplicate "for for" and a trailing double quote that shouldn't be there.
Why: Changeset messages appear in release notes and changelog. Typos look unprofessional and could confuse consumers.
Fix: (1-click apply)
Suggested change:
```diff
-logging for for Doltgres database operations"
+Add slow operation logging for Doltgres database operations
```
```ts
  }
);
const withRefMs = Date.now() - withRefStart;
if (withRefMs > 5_000) {
```
Optionally, you can just always log the time, in case 5 seconds doesn't capture what you're looking for.
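One way to act on this suggestion: always record the duration, but escalate the level past the threshold. The logger shape and the `logWithRefTiming` helper are illustrative assumptions; the quoted diff currently logs only when withRefMs exceeds 5_000.

```typescript
interface Logger {
  debug(fields: Record<string, unknown>, msg: string): void;
  warn(fields: Record<string, unknown>, msg: string): void;
}

function logWithRefTiming(
  logger: Logger,
  ref: string,
  withRefMs: number,
  slowThresholdMs = 5_000
): void {
  const fields = { ref, withRefMs };
  if (withRefMs > slowThresholdMs) {
    logger.warn(fields, 'withRef completed (slow)');
  } else {
    // Timing is still captured at debug, so a mis-tuned 5 s cut-off loses nothing.
    logger.debug(fields, 'withRef completed');
  }
}
```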
Ito Test Report ✅

13 test cases ran: 3 additional findings, 10 passed. Across the unified run, 13 test cases executed with 10 passing and 3 failing, confirming that most core behaviors are stable, including valid webhook execution, scheduled rerun semantics and active-invocation guards, deep-link/mobile trigger UX, broken-ref handling, inert rendering of malicious payload text, run-as authorization boundaries, and repeated chat completions on the same conversation.

The key defects were:
- High severity (security): malformed webhook signatures can still be accepted when signature verification is configured without a signing secret.
- High severity (reliability): parallel manual scheduled run/rerun requests can collide on millisecond-based idempotency keys and throw 500s.
- Medium (contract regression): webhook rerun with an empty JSON body is rejected with 400 because userMessage is incorrectly required.

✅ Passed (10) | ℹ️ Additional Findings (3)
No description provided.