Skip to content

Add session-aware evaluation logging#54

Merged
AQing-527 merged 1 commit into
mainfrom
feat/diagnostic-skill-apis
Apr 15, 2026
Merged

Add session-aware evaluation logging#54
AQing-527 merged 1 commit into
mainfrom
feat/diagnostic-skill-apis

Conversation

@AQing-527
Copy link
Copy Markdown
Collaborator

This pull request enhances logging and traceability across the evaluation and experience retrieval modules by consistently including session identifiers in log messages. It also introduces a new utility class to standardize session ID extraction and enriches context propagation for experience queries.

Logging and traceability improvements:

  • Added a new utility class EvaluationLogContextHelper to extract session IDs from EvaluationContext and RunnableConfig, and refactored code to use this helper for consistent session ID retrieval across modules.
  • Updated log statements in CriterionEvaluationAction, EvaluationObservationLifecycleListener, and ExperienceRetrievalEvaluatorFactory to include sessionId for better traceability of evaluation and experience retrieval operations. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]

Evaluation and experience context propagation:

  • Enhanced the experience retrieval process to propagate additional context (sessionId, tenantId, userId) from CriterionExecutionContext to ExperienceQueryContext, improving downstream query traceability and multi-tenancy support. [1] [2]

Minor refactoring and cleanup:

  • Added missing imports and performed minor code cleanups in affected files. [1] [2] [3]

These changes collectively improve observability, debugging, and context-awareness in the evaluation pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 15, 2026 09:56
Copy link
Copy Markdown
Collaborator

@canfuu canfuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AQing-527 AQing-527 merged commit c6f4677 into main Apr 15, 2026
3 checks passed
@AQing-527 AQing-527 deleted the feat/diagnostic-skill-apis branch April 15, 2026 09:57
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves observability across the evaluation and experience-retrieval pipeline by standardizing session ID extraction and including sessionId in key log messages, and by propagating more identity context into experience queries.

Changes:

  • Introduces EvaluationLogContextHelper to consistently extract sessionId from EvaluationContext and RunnableConfig.
  • Updates evaluation lifecycle/executor logging to include sessionId for better traceability.
  • Enriches experience retrieval by propagating sessionId/tenantId/userId into ExperienceQueryContext.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/experience/model/ExperienceQueryContext.java Adds sessionId and userId fields to support richer downstream experience-query context.
assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/evaluation/hook/BeforeAgentEvaluationHook.java Adds session-aware logs during evaluation start/context build/completion and threads context into result logging.
assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/evaluation/experience/ExperienceRetrievalEvaluatorFactory.java Logs session-aware experience retrieval activity and propagates session/tenant/user identifiers into experience queries.
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/util/EvaluationLogContextHelper.java New utility for consistent session ID extraction (context env + config thread id).
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/observation/EvaluationObservationLifecycleListener.java Adds session-aware lifecycle logging for graph start/end/errors and criterion execution boundaries.
assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/executor/CriterionEvaluationAction.java Adds session-aware logging for criterion execution, batching behavior, and errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


logger.info("Criterion {} completed with status: {}, value: {}",
criterion.getName(), result.getStatus(), result.getValue());
logger.info("Criterion {} completed, sessionId={}, status={}, result={}",
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message labels the last field as result={}, but the code passes result.getValue(). This is confusing when debugging; either log the full CriterionResult (if safe) or rename the label to value={}.

Suggested change
logger.info("Criterion {} completed, sessionId={}, status={}, result={}",
logger.info("Criterion {} completed, sessionId={}, status={}, value={}",

Copilot uses AI. Check for mistakes.
Comment on lines 201 to +216
ExperienceQueryContext queryContext = new ExperienceQueryContext();
queryContext.setUserQuery(queryText);
if (executionContext != null && executionContext.getInputContext() != null) {
Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId");
if (sessionId != null) {
queryContext.setSessionId(String.valueOf(sessionId));
}
Object tenantId = executionContext.getInputContext().getEnvironmentValue("tenantId");
if (tenantId != null) {
queryContext.setTenantId(String.valueOf(tenantId));
}
Object userId = executionContext.getInputContext().getEnvironmentValue("userId");
if (userId != null) {
queryContext.setUserId(String.valueOf(userId));
}
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior propagates sessionId/tenantId/userId from CriterionExecutionContext into ExperienceQueryContext, but there’s no test asserting that these fields are passed through to ExperienceProvider.query(...). Since this module already has unit tests around ExperienceQueryContext, consider adding a focused test that mocks ExperienceProvider and verifies the constructed ExperienceQueryContext contains the expected IDs.

Copilot uses AI. Check for mistakes.
Comment on lines +119 to +120
log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryText={}, experienceTypes={}",
evaluatorId, sessionId, queryText, experienceTypes);
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryText appears to be derived from end-user input, but it’s currently logged at INFO level. This can leak PII/secrets into logs. Consider removing it, logging only its length / a truncated value, or downgrading to DEBUG with truncation/redaction.

Suggested change
log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryText={}, experienceTypes={}",
evaluatorId, sessionId, queryText, experienceTypes);
int queryTextLength = StringUtils.hasLength(queryText) ? queryText.length() : 0;
log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryTextLength={}, experienceTypes={}",
evaluatorId, sessionId, queryTextLength, experienceTypes);

Copilot uses AI. Check for mistakes.
Comment on lines +204 to +206
Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId");
if (sessionId != null) {
queryContext.setSessionId(String.valueOf(sessionId));
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Session ID propagation into ExperienceQueryContext only checks environment.sessionId, but EvaluationLogContextHelper also falls back to environment.threadId. This can cause logs to show a non-null sessionId while downstream experience queries receive a null sessionId. Consider using EvaluationLogContextHelper.getSessionId(executionContext.getInputContext()) (or duplicating the same fallback logic) when setting queryContext.sessionId.

Suggested change
Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId");
if (sessionId != null) {
queryContext.setSessionId(String.valueOf(sessionId));
String sessionId = EvaluationLogContextHelper.getSessionId(executionContext.getInputContext());
if (StringUtils.hasText(sessionId)) {
queryContext.setSessionId(sessionId);

Copilot uses AI. Check for mistakes.
Comment on lines 138 to 149
@@ -142,7 +149,8 @@ private static RuleBasedEvaluator createEvaluator(
result.getMetadata().put("experience_count", experiences.size());
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

experiences.stream().map(...).collect(toList()) is computed multiple times (for logging and for metadata). Consider computing experienceIds once and reusing it to avoid duplicate work and ensure the log/metadata always refer to the same list.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants