feat(observability): add LLMTrace model and operational metrics #533

Merged
freemo merged 1 commit from feature/m6-llm-trace-metrics into master 2026-03-03 22:12:14 +00:00

Summary

Implements issue #500 — adds the LLMTrace domain model, OperationalMetricKey enum (14 keys), MetricCollector, LLMTraceRepository, and TraceService with LangSmith forwarding support and application lifecycle hooks.

Changes

New modules

  • domain/models/observability/: LLMTrace, LLMTraceQuery (Pydantic v2), OperationalMetricKey, MetricEntry, MetricCollector
  • infrastructure/database/llm_trace_repository.py: SQLAlchemy-backed repository (save / get / list_by_plan / list_by_decision)
  • application/services/trace_service.py: recording, querying, metrics aggregation, optional LangSmith forwarding, lifecycle hooks
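
For readers who have not opened the diff, the shape of these modules can be sketched roughly as follows. This is a minimal illustration, not the merged code: it uses stdlib dataclasses where the PR uses Pydantic v2, swaps the SQLAlchemy-backed repository for an in-memory one, and every field name, enum member, and method on `MetricCollector` is an assumption. Only the four repository operations (save / get / list_by_plan / list_by_decision) come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional
from uuid import uuid4


class OperationalMetricKey(str, Enum):
    # Hypothetical members: the PR defines 14 keys but does not list them.
    LLM_CALL_COUNT = "llm_call_count"
    LLM_TOTAL_TOKENS = "llm_total_tokens"
    LLM_LATENCY_MS = "llm_latency_ms"


@dataclass(frozen=True)
class LLMTrace:
    # Field names are assumptions; plan_id and decision_id are only implied
    # by the list_by_plan / list_by_decision repository operations.
    plan_id: str
    decision_id: str
    model: str
    total_tokens: int
    latency_ms: float
    trace_id: str = field(default_factory=lambda: uuid4().hex)


class MetricCollector:
    """Accumulates a running total per metric key (assumed behaviour)."""

    def __init__(self) -> None:
        self._totals: Dict[OperationalMetricKey, float] = defaultdict(float)

    def record(self, key: OperationalMetricKey, value: float = 1.0) -> None:
        self._totals[key] += value

    def snapshot(self) -> Dict[OperationalMetricKey, float]:
        # Return a plain copy so callers cannot mutate internal state.
        return dict(self._totals)


class InMemoryLLMTraceRepository:
    """In-memory stand-in exposing the same four operations as the PR."""

    def __init__(self) -> None:
        self._rows: Dict[str, LLMTrace] = {}

    def save(self, trace: LLMTrace) -> None:
        self._rows[trace.trace_id] = trace

    def get(self, trace_id: str) -> Optional[LLMTrace]:
        return self._rows.get(trace_id)

    def list_by_plan(self, plan_id: str) -> List[LLMTrace]:
        return [t for t in self._rows.values() if t.plan_id == plan_id]

    def list_by_decision(self, decision_id: str) -> List[LLMTrace]:
        return [t for t in self._rows.values() if t.decision_id == decision_id]
```

An in-memory repository with the same method names is also a convenient test double when the database layer is not under test.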

Modified modules

  • infrastructure/database/models.py — Added LLMTraceModel (table llm_traces)
  • application/container.py — Wired TraceService + LLMTraceRepository providers
  • application/services/__init__.py — Exported TraceService
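
The wiring can be pictured with a small hand-rolled container. This is a sketch under stated assumptions: the real `application/container.py` may use a DI library, and `Trace`, `TraceRepository`, `Forwarder`, `on_startup`, `on_shutdown`, and the best-effort error handling are all hypothetical names and choices made for illustration. The forwarder stands in for the optional LangSmith forwarding; no LangSmith API is called here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Trace:
    """Simplified stand-in for LLMTrace (field names are assumptions)."""
    trace_id: str
    plan_id: str


class TraceRepository:
    """Stand-in for LLMTraceRepository; keeps rows in a dict."""

    def __init__(self) -> None:
        self.rows: Dict[str, Trace] = {}

    def save(self, trace: Trace) -> None:
        self.rows[trace.trace_id] = trace


# Hypothetical forwarding hook; a real one might wrap a LangSmith client.
Forwarder = Callable[[Trace], None]


class TraceService:
    def __init__(self, repository: TraceRepository,
                 forwarder: Optional[Forwarder] = None) -> None:
        self._repository = repository
        self._forwarder = forwarder
        self._started = False

    def on_startup(self) -> None:
        # Lifecycle hook (assumed name): called once when the app starts.
        self._started = True

    def on_shutdown(self) -> None:
        # Lifecycle hook (assumed name): called once when the app stops.
        self._started = False

    def record(self, trace: Trace) -> None:
        if not self._started:
            raise RuntimeError("TraceService used before on_startup()")
        self._repository.save(trace)
        if self._forwarder is not None:
            try:
                self._forwarder(trace)
            except Exception:
                # Assumed policy: forwarding is best-effort, and the trace
                # has already been persisted locally at this point.
                pass


class Container:
    """Hand-rolled container: each provider builds its object once."""

    def __init__(self, forwarder: Optional[Forwarder] = None) -> None:
        self._forwarder = forwarder
        self._repository: Optional[TraceRepository] = None
        self._service: Optional[TraceService] = None

    def trace_repository(self) -> TraceRepository:
        if self._repository is None:
            self._repository = TraceRepository()
        return self._repository

    def trace_service(self) -> TraceService:
        if self._service is None:
            self._service = TraceService(self.trace_repository(), self._forwarder)
        return self._service
```

Passing the forwarder as an optional callable keeps the service usable with forwarding disabled, which matches the "optional" wording in the description.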

Tests & docs

  • 28 BDD scenarios (features/llm_trace.feature)
  • 6 Robot Framework smoke tests (robot/llm_trace.robot)
  • 3 ASV benchmark suites (benchmarks/llm_trace_bench.py)
  • Reference docs (docs/reference/observability.md)

Bug fix (pre-existing)

  • Fixed cli_core server_mode test flake caused by host ~/.cleveragents/config.toml leaking into tests
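
The description does not say how the leak was closed. One common way to keep a host `~/.cleveragents/config.toml` out of a test run is to point the home directory at a throwaway location for the duration of the test. The sketch below assumes the config path is resolved through `Path.home()`; `isolated_home` and `user_config_path` are hypothetical helpers, not names from this PR.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator
from unittest import mock


@contextmanager
def isolated_home() -> Iterator[Path]:
    """Temporarily redirect the user's home directory to an empty temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        # HOME is consulted on POSIX, USERPROFILE on Windows.
        env = {"HOME": tmp, "USERPROFILE": tmp}
        # patch.dict restores the original environment on exit.
        with mock.patch.dict(os.environ, env):
            yield Path(tmp)


def user_config_path() -> Path:
    # Assumed lookup rule: the config lives under the user's home directory.
    return Path.home() / ".cleveragents" / "config.toml"
```

Inside `with isolated_home():`, any code that resolves the config through the home directory sees an empty directory, so the test result no longer depends on the developer's machine.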

Quality gates

| Gate | Result |
|------|--------|
| nox -s lint | Passed |
| nox -s typecheck | 0 errors, 0 warnings |
| nox -s unit_tests | 250 features, 7948 scenarios, 30732 steps, 0 failures |
| nox -s coverage_report | 97% coverage |
| nox -s integration_tests | 1093/1097 pass (3 pre-existing failures) |

Closes #500

freemo force-pushed feature/m6-llm-trace-metrics from 30e90ce9b4
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 18s
CI / build (pull_request) Successful in 18s
CI / security (pull_request) Successful in 34s
CI / typecheck (pull_request) Successful in 34s
CI / unit_tests (pull_request) Successful in 3m13s
CI / docker (pull_request) Successful in 42s
CI / integration_tests (pull_request) Successful in 4m4s
CI / coverage (pull_request) Successful in 4m0s
CI / benchmark-regression (pull_request) Has been cancelled
to 260692b72d
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 18s
CI / build (pull_request) Successful in 19s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 52s
CI / unit_tests (pull_request) Successful in 2m46s
CI / integration_tests (pull_request) Successful in 3m5s
CI / docker (pull_request) Successful in 44s
CI / coverage (pull_request) Successful in 4m43s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-03 19:53:11 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-03-03 19:53:28 +00:00
freemo force-pushed feature/m6-llm-trace-metrics from 260692b72d
to eeae246572
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 17s
CI / build (pull_request) Successful in 14s
CI / typecheck (pull_request) Successful in 35s
CI / security (pull_request) Successful in 39s
CI / unit_tests (pull_request) Successful in 2m18s
CI / integration_tests (pull_request) Successful in 2m53s
CI / docker (pull_request) Successful in 39s
CI / coverage (pull_request) Failing after 4m1s
CI / benchmark-regression (pull_request) Successful in 25m6s
2026-03-03 19:59:21 +00:00
freemo force-pushed feature/m6-llm-trace-metrics from eeae246572
to 1d5106de73
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Failing after 13s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 18s
CI / typecheck (pull_request) Successful in 32s
CI / security (pull_request) Successful in 32s
CI / coverage (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped
CI / unit_tests (pull_request) Successful in 2m8s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Has been cancelled
2026-03-03 21:10:43 +00:00
freemo force-pushed feature/m6-llm-trace-metrics from 1d5106de73
to e14420acc9
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 28s
CI / security (pull_request) Successful in 33s
CI / typecheck (pull_request) Successful in 36s
CI / unit_tests (pull_request) Successful in 2m11s
CI / docker (pull_request) Successful in 41s
CI / integration_tests (pull_request) Successful in 2m54s
CI / coverage (pull_request) Successful in 4m3s
CI / benchmark-regression (pull_request) Successful in 24m28s
2026-03-03 21:13:18 +00:00
freemo force-pushed feature/m6-llm-trace-metrics from e14420acc9
to 3f14cbbf7e
All checks were successful
CI / lint (pull_request) Successful in 40s
CI / security (pull_request) Successful in 32s
CI / quality (pull_request) Successful in 42s
CI / typecheck (pull_request) Successful in 1m1s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 14s
CI / unit_tests (pull_request) Successful in 3m58s
CI / integration_tests (pull_request) Successful in 4m34s
CI / docker (pull_request) Successful in 1m6s
CI / coverage (pull_request) Successful in 6m21s
CI / quality (push) Successful in 16s
CI / lint (push) Successful in 21s
CI / security (push) Successful in 32s
CI / typecheck (push) Successful in 35s
CI / benchmark-regression (push) Has been skipped
CI / build (push) Successful in 39s
CI / unit_tests (push) Successful in 2m38s
CI / integration_tests (push) Successful in 3m2s
CI / docker (push) Successful in 40s
CI / coverage (push) Successful in 4m9s
CI / benchmark-publish (push) Successful in 14m42s
CI / benchmark-regression (pull_request) Successful in 26m29s
2026-03-03 22:04:37 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-03-03 22:04:58 +00:00
freemo merged commit 3f14cbbf7e into master 2026-03-03 22:12:14 +00:00
freemo deleted branch feature/m6-llm-trace-metrics 2026-03-03 22:12:14 +00:00
Reference
cleveragents/cleveragents-core!533