UAT: _forward_trace_to_langsmith does not capture prompt text or response text — spec requires full prompt/response tracing #4042

Open
opened 2026-04-06 09:07:10 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/trace-service-langsmith-missing-prompt-response-text
  • Commit Message: fix(observability): include prompt_text and completion_text in LangSmith trace forwarding
  • Milestone: (none — backlog)
  • Parent Epic: #945

Backlog note: This issue was discovered during autonomous operation
on milestone v3.6.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Bug Report

What Was Tested

Code-level analysis of src/cleveragents/application/services/trace_service.py — specifically the _forward_trace_to_langsmith() function and what data it sends to LangSmith.

Expected Behavior (from spec)

The specification states that LLM call tracing via LangSmith should include:

  • Prompts: The full prompt sent to the LLM.
  • Responses: The full response received from the LLM.
  • Tool calls: The name of the tool called, its parameters, and its return value.
  • Latencies: The duration of each LLM call and tool execution.
  • Token counts: The number of input and output tokens for each LLM call.

Actual Behavior

The _forward_trace_to_langsmith() function sends only token counts and metadata — no prompt text, no response text:

# src/cleveragents/application/services/trace_service.py lines 293-313
run_data: dict[str, Any] = {
    "name": f"llm-trace-{trace.trace_id}",
    "run_type": "llm",
    "inputs": {
        "actor": trace.actor,
        "provider": trace.provider,
        "model": trace.model,
        "prompt_tokens": trace.prompt_tokens,  # ← only token COUNT, not text
        "streaming": trace.streaming,
    },
    "outputs": {
        "completion_tokens": trace.completion_tokens,  # ← only token COUNT, not text
        "cost_usd": trace.cost_usd,
        "latency_ms": trace.latency_ms,
        "tool_calls": trace.tool_calls,
    },
}

The LLMTrace domain model (src/cleveragents/domain/models/observability/llm_trace.py) does not have fields for storing the actual prompt text or response text — only prompt_tokens (int) and completion_tokens (int).

Additionally, the create_run() call omits start_time and end_time, which LangSmith needs in order to show accurate latency for the run in its UI.
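
As a rough illustration of how the missing timing values could be derived, the sketch below computes a start/end pair from a timestamp and a latency in milliseconds. This issue does not say whether trace.timestamp marks the start or the end of the call, so the timestamp_is_end flag and the helper name derive_run_window are assumptions made for illustration, not part of the existing code.

```python
from datetime import datetime, timedelta, timezone


def derive_run_window(
    timestamp: datetime,
    latency_ms: float,
    timestamp_is_end: bool = False,
) -> tuple[datetime, datetime]:
    """Return (start_time, end_time) for a run.

    Hypothetical helper: assumes `latency_ms` is the wall-clock duration
    of the LLM call and that `timestamp` is one end of that interval.
    """
    duration = timedelta(milliseconds=latency_ms)
    if timestamp_is_end:
        # Timestamp was recorded when the call finished.
        return timestamp - duration, timestamp
    # Timestamp was recorded when the call started.
    return timestamp, timestamp + duration


started = datetime(2026, 4, 6, 9, 0, 0, tzinfo=timezone.utc)
start_time, end_time = derive_run_window(started, latency_ms=250)
print(end_time - start_time)  # 0:00:00.250000
```

Which branch applies depends on where the trace timestamp is recorded in the calling code, so that should be confirmed before wiring the values into create_run().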

Impact

  • LangSmith traces show token counts but no actual prompt/response content
  • LangSmith's prompt debugging and response analysis features are unusable
  • The spec's promise of "full prompt" and "full response" tracing is not delivered

Code Locations

  • File: src/cleveragents/application/services/trace_service.py
    • _forward_trace_to_langsmith() function (lines 277-314)
  • File: src/cleveragents/domain/models/observability/llm_trace.py
    • LLMTrace model — missing prompt_text and completion_text fields

Fix Required

  1. Add prompt_text: str | None and completion_text: str | None fields to LLMTrace
  2. Update _forward_trace_to_langsmith() to include these in inputs and outputs
  3. Add start_time and end_time to the create_run() call (derivable from timestamp and latency_ms)

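Note: Storing and forwarding full prompt/response text has privacy and storage implications. Prompts and responses can contain user data, and the text is far larger than the token counts recorded today. These implications should be reviewed before the new fields are populated.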

Subtasks

  • Add prompt_text: str | None and completion_text: str | None fields to LLMTrace domain model
  • Update _forward_trace_to_langsmith() to pass prompt_text in inputs and completion_text in outputs
  • Add start_time and end_time to the create_run() call (derived from trace.timestamp and trace.latency_ms)
  • Update Behave unit tests for the LLMTrace model to cover the new fields
  • Update Behave unit tests for _forward_trace_to_langsmith() to assert prompt/response text is forwarded
  • Verify all nox stages pass and coverage >= 97%
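
The forwarding assertion in the subtasks above could look roughly like the following. This is plain Python rather than a Behave step, and RecordingClient and forward_trace are hypothetical stand-ins (a test double and a simplified forwarder), not the project's code or the LangSmith client.

```python
from types import SimpleNamespace
from typing import Any


class RecordingClient:
    """Hypothetical test double that records create_run() keyword arguments."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def create_run(self, **kwargs: Any) -> None:
        self.calls.append(kwargs)


def forward_trace(client: Any, trace: Any) -> None:
    """Simplified stand-in for _forward_trace_to_langsmith()."""
    client.create_run(
        name=f"llm-trace-{trace.trace_id}",
        run_type="llm",
        inputs={
            "prompt_tokens": trace.prompt_tokens,
            "prompt_text": trace.prompt_text,
        },
        outputs={
            "completion_tokens": trace.completion_tokens,
            "completion_text": trace.completion_text,
        },
    )


def check_text_is_forwarded() -> dict[str, Any]:
    """Forward one trace and assert that both texts reach the client."""
    trace = SimpleNamespace(
        trace_id="t-1",
        prompt_tokens=3,
        completion_tokens=2,
        prompt_text="What is 2 + 2?",
        completion_text="4",
    )
    client = RecordingClient()
    forward_trace(client, trace)
    (call,) = client.calls  # exactly one run should have been created
    assert call["inputs"]["prompt_text"] == "What is 2 + 2?"
    assert call["outputs"]["completion_text"] == "4"
    return call


recorded_call = check_text_is_forwarded()
```

The same assertions could sit inside a Behave "then" step, with the recording client injected wherever the real client is constructed.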

Definition of Done

  • LLMTrace domain model has prompt_text: str | None and completion_text: str | None fields
  • _forward_trace_to_langsmith() includes prompt text in inputs["messages"] and response text in outputs["generations"] per LangSmith SDK conventions
  • create_run() call includes start_time and end_time for accurate latency display in LangSmith UI
  • Privacy/storage implications documented in code comments
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-09 03:11:45 +00:00
Reference
cleveragents/cleveragents-core#4042