UAT: LLMTrace domain model missing spec-required fields: total_tokens, temperature, context_refs; actor field named incorrectly #3765

Open

opened 2026-04-05 22:33:41 +00:00 by freemo · 0 comments

Owner

Bug Report

What Was Tested

Code-level analysis of src/cleveragents/domain/models/observability/llm_trace.py against the LLMTrace schema defined in docs/specification.md §Observability → LLM Call Tracing (lines ~46079–46098).

Expected Behavior (from spec)

The spec defines the following fields for LLMTrace:

class LLMTrace(BaseModel):
    trace_id: str                     # ULID
    plan_id: str
    decision_id: str | None
    actor_name: str                   # ← field name is actor_name
    provider: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int                 # ← REQUIRED by spec
    cost_usd: float
    latency_ms: int
    temperature: float                # ← REQUIRED by spec
    tool_calls: list[str]             # tool names invoked by the LLM
    context_hash: str
    context_refs: list[str]           # ← REQUIRED by spec (resource IDs referenced in context)
    streaming: bool
    retry_count: int
    error: str | None
    timestamp: datetime

Actual Behavior

The implementation in llm_trace.py is missing three spec-required fields and uses a different field name for the actor:

| Spec Field | Implementation | Status |
|---|---|---|
| `actor_name` | `actor` | ❌ Wrong name |
| `total_tokens` | (missing) | ❌ Missing |
| `temperature` | (missing) | ❌ Missing |
| `context_refs` | (missing) | ❌ Missing |
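
The mismatches listed above can be reproduced with a plain field-name diff. The following is a minimal sketch, not the check that was actually run: `SPEC_FIELDS` is copied from the spec excerpt in this report, while `IMPL_FIELDS` and `diff_fields` are hypothetical stand-ins. `IMPL_FIELDS` assumes the implementation declares every other spec field unchanged, which this report does not verify.

```python
# Field names copied from the spec excerpt quoted in this report.
SPEC_FIELDS = [
    "trace_id", "plan_id", "decision_id", "actor_name", "provider", "model",
    "prompt_tokens", "completion_tokens", "total_tokens", "cost_usd",
    "latency_ms", "temperature", "tool_calls", "context_hash", "context_refs",
    "streaming", "retry_count", "error", "timestamp",
]

# ASSUMPTION: hand-written stand-in for the fields declared in llm_trace.py.
# In a real check this list would be read from the model class itself.
IMPL_FIELDS = [
    "trace_id", "plan_id", "decision_id", "actor", "provider", "model",
    "prompt_tokens", "completion_tokens", "cost_usd", "latency_ms",
    "tool_calls", "context_hash", "streaming", "retry_count", "error",
    "timestamp",
]


def diff_fields(spec: list[str], impl: list[str]) -> tuple[list[str], list[str]]:
    """Return (fields in spec but not impl, fields in impl but not spec)."""
    missing = [name for name in spec if name not in impl]
    unexpected = [name for name in impl if name not in spec]
    return missing, unexpected


missing, unexpected = diff_fields(SPEC_FIELDS, IMPL_FIELDS)
print("missing from implementation:", missing)
print("not in spec:", unexpected)
```

A name-only diff reports `actor_name` as missing and `actor` as unexpected; recognising the pair as a rename still takes a human reading of the two lists.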

1. actor vs actor_name: The spec uses actor_name (consistent with DomainEvent.actor_name and the rest of the codebase), but the implementation uses actor. This creates an inconsistency when correlating traces with domain events.

2. total_tokens missing: The spec requires total_tokens: int as a derived field (prompt_tokens + completion_tokens). The TraceService.compute_metrics() method manually computes this sum (t.prompt_tokens + t.completion_tokens) instead of using a model field, which means the total is not persisted in the trace record.

3. temperature missing: The spec requires temperature: float to be recorded per LLM call. This is important for reproducibility analysis — knowing the temperature used for a given decision.

4. context_refs missing: The spec requires context_refs: list[str] — the list of resource IDs referenced in the context window. This is critical for provenance tracking (spec line ~44085: "Provenance everywhere: Every piece of context delivered to an actor can be traced back to a specific resource").
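
Regarding item 2, one way to keep total_tokens derived and still have it appear in the serialized trace is a computed field. This is a minimal sketch that assumes the project is on pydantic v2 (not confirmed in this report); `TokenUsage` is a hypothetical, trimmed-down stand-in for the token fields of LLMTrace, not the real model.

```python
from pydantic import BaseModel, Field, computed_field


class TokenUsage(BaseModel):
    """Hypothetical stand-in holding only the token-count fields."""

    prompt_tokens: int = Field(..., ge=0)
    completion_tokens: int = Field(..., ge=0)

    # computed_field makes the property part of model_dump() output,
    # so the total is written out with the rest of the record.
    @computed_field
    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


usage = TokenUsage(prompt_tokens=120, completion_tokens=30)
print(usage.model_dump())
```

The trade-off against a plain required field is that the total can never disagree with its parts, but callers cannot supply a provider-reported total either. Which behavior the spec intends is a question for the spec owners.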

Code Location

File: src/cleveragents/domain/models/observability/llm_trace.py
Lines 29–130 (the LLMTrace class definition)

Impact

- total_tokens missing means the trace record does not capture the full token usage, requiring callers to recompute it.
- temperature missing means LLM call reproducibility cannot be analyzed from traces.
- context_refs missing breaks the provenance chain from LLM calls back to source resources.
- actor vs actor_name inconsistency makes cross-correlation with domain events harder.

Fix Required

Add the three missing fields to LLMTrace and declare the actor field under its spec name, actor_name:

actor_name: str = Field(
    ...,
    min_length=1,
    max_length=255,
    description="Name of the actor that made the call",
)
total_tokens: int = Field(
    ...,
    ge=0,
    description="Total tokens (prompt + completion)",
)
temperature: float = Field(
    default=1.0,
    ge=0.0,
    le=2.0,
    description="Temperature used for this LLM call",
)
context_refs: list[str] = Field(
    default_factory=list,
    description="Resource IDs referenced in the context window",
)

The actor → actor_name rename must also be applied everywhere the field is read or written, with a migration for any existing persisted data.
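
The migration for persisted data could look roughly like the sketch below. It assumes traces are stored as plain dictionaries (for example, decoded JSON), which this report does not confirm; `migrate_trace_record` is a hypothetical helper, not an existing function in the codebase.

```python
from typing import Any


def migrate_trace_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a legacy trace record using the spec field names."""
    migrated = dict(record)  # shallow copy; the input is left untouched

    # Rename actor -> actor_name, keeping an existing actor_name if present.
    if "actor" in migrated:
        legacy_actor = migrated.pop("actor")
        migrated.setdefault("actor_name", legacy_actor)

    # Backfill the derived total from the two counts already stored.
    if "total_tokens" not in migrated:
        migrated["total_tokens"] = (
            migrated.get("prompt_tokens", 0) + migrated.get("completion_tokens", 0)
        )

    # Legacy records never captured context references.
    migrated.setdefault("context_refs", [])

    # temperature is deliberately not backfilled here: the value actually
    # used for a legacy call is unknown, so this sketch does not invent one.
    return migrated


legacy = {"actor": "planner", "prompt_tokens": 100, "completion_tokens": 20}
print(migrate_trace_record(legacy))
```

Note that with the proposed `default=1.0`, loading a migrated record into the model would still stamp legacy traces with a temperature of 1.0. Whether that is acceptable for reproducibility analysis should be decided before the migration runs.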

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester


freemo referenced this issue

2026-04-05 22:34:53 +00:00

[Automated] CleverAgents Build Session - 2026-04-05 (Resumed) #3654

freemo referenced this issue

2026-04-05 22:34:59 +00:00

[Automated] CleverAgents Build Session - 2026-04-05 (Resumed) #3654

freemo referenced this issue

2026-04-06 09:18:32 +00:00

[Automated] CleverAgents Build Session - 2026-04-06 #3775

freemo referenced this issue

2026-04-06 09:58:05 +00:00

UAT: LLMTrace model missing total_tokens, temperature, and context_refs fields required by spec #3937