Add session-aware evaluation logging#54
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR improves observability across the evaluation and experience-retrieval pipeline by standardizing session ID extraction and including sessionId in key log messages, and by propagating more identity context into experience queries.
Changes:
- Introduces
EvaluationLogContextHelperto consistently extractsessionIdfromEvaluationContextandRunnableConfig. - Updates evaluation lifecycle/executor logging to include
sessionIdfor better traceability. - Enriches experience retrieval by propagating
sessionId/tenantId/userIdintoExperienceQueryContext.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/experience/model/ExperienceQueryContext.java | Adds sessionId and userId fields to support richer downstream experience-query context. |
| assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/evaluation/hook/BeforeAgentEvaluationHook.java | Adds session-aware logs during evaluation start/context build/completion and threads context into result logging. |
| assistant-agent-extensions/src/main/java/com/alibaba/assistant/agent/extension/evaluation/experience/ExperienceRetrievalEvaluatorFactory.java | Logs session-aware experience retrieval activity and propagates session/tenant/user identifiers into experience queries. |
| assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/util/EvaluationLogContextHelper.java | New utility for consistent session ID extraction (context env + config thread id). |
| assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/observation/EvaluationObservationLifecycleListener.java | Adds session-aware lifecycle logging for graph start/end/errors and criterion execution boundaries. |
| assistant-agent-evaluation/src/main/java/com/alibaba/assistant/agent/evaluation/executor/CriterionEvaluationAction.java | Adds session-aware logging for criterion execution, batching behavior, and errors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| logger.info("Criterion {} completed with status: {}, value: {}", | ||
| criterion.getName(), result.getStatus(), result.getValue()); | ||
| logger.info("Criterion {} completed, sessionId={}, status={}, result={}", |
There was a problem hiding this comment.
The log message labels the last field as result={}, but the code passes result.getValue(). This is confusing when debugging; either log the full CriterionResult (if safe) or rename the label to value={}.
| logger.info("Criterion {} completed, sessionId={}, status={}, result={}", | |
| logger.info("Criterion {} completed, sessionId={}, status={}, value={}", |
| ExperienceQueryContext queryContext = new ExperienceQueryContext(); | ||
| queryContext.setUserQuery(queryText); | ||
| if (executionContext != null && executionContext.getInputContext() != null) { | ||
| Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId"); | ||
| if (sessionId != null) { | ||
| queryContext.setSessionId(String.valueOf(sessionId)); | ||
| } | ||
| Object tenantId = executionContext.getInputContext().getEnvironmentValue("tenantId"); | ||
| if (tenantId != null) { | ||
| queryContext.setTenantId(String.valueOf(tenantId)); | ||
| } | ||
| Object userId = executionContext.getInputContext().getEnvironmentValue("userId"); | ||
| if (userId != null) { | ||
| queryContext.setUserId(String.valueOf(userId)); | ||
| } | ||
| } |
There was a problem hiding this comment.
New behavior propagates sessionId/tenantId/userId from CriterionExecutionContext into ExperienceQueryContext, but there’s no test asserting that these fields are passed through to ExperienceProvider.query(...). Since this module already has unit tests around ExperienceQueryContext, consider adding a focused test that mocks ExperienceProvider and verifies the constructed ExperienceQueryContext contains the expected IDs.
| log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryText={}, experienceTypes={}", | ||
| evaluatorId, sessionId, queryText, experienceTypes); |
There was a problem hiding this comment.
queryText appears to be derived from end-user input, but it’s currently logged at INFO level. This can leak PII/secrets into logs. Consider removing it, logging only its length / a truncated value, or downgrading to DEBUG with truncation/redaction.
| log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryText={}, experienceTypes={}", | |
| evaluatorId, sessionId, queryText, experienceTypes); | |
| int queryTextLength = StringUtils.hasLength(queryText) ? queryText.length() : 0; | |
| log.info("ExperienceRetrievalEvaluatorFactory#{} - reason=开始检索经验, sessionId={}, queryTextLength={}, experienceTypes={}", | |
| evaluatorId, sessionId, queryTextLength, experienceTypes); |
| Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId"); | ||
| if (sessionId != null) { | ||
| queryContext.setSessionId(String.valueOf(sessionId)); |
There was a problem hiding this comment.
Session ID propagation into ExperienceQueryContext only checks environment.sessionId, but EvaluationLogContextHelper also falls back to environment.threadId. This can cause logs to show a non-null sessionId while downstream experience queries receive a null sessionId. Consider using EvaluationLogContextHelper.getSessionId(executionContext.getInputContext()) (or duplicating the same fallback logic) when setting queryContext.sessionId.
| Object sessionId = executionContext.getInputContext().getEnvironmentValue("sessionId"); | |
| if (sessionId != null) { | |
| queryContext.setSessionId(String.valueOf(sessionId)); | |
| String sessionId = EvaluationLogContextHelper.getSessionId(executionContext.getInputContext()); | |
| if (StringUtils.hasText(sessionId)) { | |
| queryContext.setSessionId(sessionId); |
| @@ -142,7 +149,8 @@ private static RuleBasedEvaluator createEvaluator( | |||
| result.getMetadata().put("experience_count", experiences.size()); | |||
There was a problem hiding this comment.
experiences.stream().map(...).collect(toList()) is computed multiple times (for logging and for metadata). Consider computing experienceIds once and reusing it to avoid duplicate work and ensure the log/metadata always refer to the same list.
This pull request enhances logging and traceability across the evaluation and experience retrieval modules by consistently including session identifiers in log messages. It also introduces a new utility class to standardize session ID extraction and enriches context propagation for experience queries.
Logging and traceability improvements:
EvaluationLogContextHelperto extract session IDs fromEvaluationContextandRunnableConfig, and refactored code to use this helper for consistent session ID retrieval across modules.CriterionEvaluationAction,EvaluationObservationLifecycleListener, andExperienceRetrievalEvaluatorFactoryto includesessionIdfor better traceability of evaluation and experience retrieval operations. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]Evaluation and experience context propagation:
sessionId,tenantId,userId) fromCriterionExecutionContexttoExperienceQueryContext, improving downstream query traceability and multi-tenancy support. [1] [2]Minor refactoring and cleanup:
These changes collectively improve observability, debugging, and context-awareness in the evaluation pipeline.