feat(cli): extend Diagnostic Dashboard with 5 health categories (index, plans, cost, performance) #580

Open
opened 2026-03-04 23:44:27 +00:00 by freemo · 1 comment
Owner

Metadata

| Field | Value |
|-------|-------|
| Commit Message | feat(cli): extend Diagnostic Dashboard with 5 health categories (index, plans, cost, performance) |
| Branch | feature/m6-diagnostic-dashboard-health-categories |

Summary

Extend the existing agents diagnostics command to cover all 5 health categories defined in the specification. Currently only basic system health checks (config, DB, providers, disk, git, locks) are implemented. The spec requires 4 additional categories: index health, active plan overview, cost summary, and performance summary.

Spec Reference

Section: Architecture > Observability > Diagnostic Dashboard
Lines: ~43826-43835

Current State

  • system.py (CLI diagnostics command, lines 390-529) exists with basic checks:
    • Config file readability
    • Database accessibility
    • Provider API key validation
    • Disk space
    • Git status
    • Lock file checks
  • Missing: Index health, active plan overview, cost summary, performance summary

Description

The spec defines 5 diagnostic categories:

  1. System health (IMPLEMENTED): Config file readability, database accessibility, disk space, provider API key validation.
  2. Index health (MISSING): Text index status, vector index dimensionality and count, graph store connectivity.
  3. Active plan overview (MISSING): Running plans, queued plans, resource utilization.
  4. Cost summary (MISSING): Current session cost, budget utilization, provider-level breakdown.
  5. Performance summary (MISSING): Average plan duration, tool call latency percentiles, context build times.
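
As a rough illustration of how the five categories could appear side by side in structured output, the sketch below builds a report keyed by category and serializes it to JSON. The key names, the CategoryResult type, the status strings, and the sample index values are all assumptions made for illustration; they are not taken from system.py or the spec.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any

# Hypothetical category keys; the real names used by the CLI may differ.
CATEGORIES = (
    "system_health",
    "index_health",
    "active_plans",
    "cost_summary",
    "performance_summary",
)


@dataclass
class CategoryResult:
    """One diagnostic category: an overall status plus free-form details."""

    status: str = "WARN"
    details: dict[str, Any] = field(default_factory=dict)


def empty_report() -> dict[str, CategoryResult]:
    """Return a report in which every category starts as WARN with no details."""
    return {name: CategoryResult() for name in CATEGORIES}


def to_json(report: dict[str, CategoryResult]) -> str:
    """Serialize the report so each category is a top-level key."""
    return json.dumps({name: asdict(res) for name, res in report.items()}, indent=2)


report = empty_report()
# Made-up sample values, only to show the shape of a populated category.
report["index_health"] = CategoryResult(
    status="OK",
    details={"text_index": "ready", "vector_dimensions": 768, "vector_entries": 1200},
)
print(to_json(report))
```

Starting every category at WARN with empty details means a category that was never populated still shows up in the output instead of being silently absent.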

Acceptance Criteria

  • Index health section: text index status (ready/building/error), vector index dimensionality and entry count, graph store connectivity and triple count
  • Active plan overview: list of running plans with phase/state, queued plans count, resource utilization per plan
  • Cost summary: current session total cost, budget utilization percentage, per-provider cost breakdown
  • Performance summary: average plan duration (last N plans), tool call latency percentiles (p50/p95/p99), context build time averages
  • All new sections render in both plain and rich output modes
  • JSON format output includes all 5 categories in structured form
  • Graceful degradation: if metrics/data unavailable, show "N/A" rather than error
  • Unit tests for each new diagnostic section

Related Issues

  • Depends on: Metrics Collection Framework (for performance summary data)
  • Related: Existing diagnostics in system.py lines 390-529
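
The performance and cost criteria above ask for latency percentiles (p50/p95/p99), a budget utilization percentage, and "N/A" when data is missing. A minimal sketch of those calculations follows, using only the standard library. The function names, the millisecond unit, and the choice of the "inclusive" quantile method are assumptions; the issue does not say how the project computes these values.

```python
import statistics
from typing import Optional, Sequence, Union

NA = "N/A"
Value = Union[float, str]


def latency_percentiles(samples_ms: Sequence[float]) -> dict[str, Value]:
    """Return p50/p95/p99 of the samples, or "N/A" for each when there are too few."""
    if len(samples_ms) < 2:
        return {"p50": NA, "p95": NA, "p99": NA}
    # quantiles(n=100) yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def budget_utilization(spent: Optional[float], budget: Optional[float]) -> Value:
    """Return spend as a percentage of budget, or "N/A" when it cannot be computed."""
    if spent is None or not budget:
        # Covers a missing spend figure, a missing budget, and a zero budget.
        return NA
    return round(100.0 * spent / budget, 1)


print(latency_percentiles([]))
print(budget_utilization(2.5, 10.0))
```

Returning "N/A" as a value instead of raising keeps the rendering code free of special cases: plain, rich, and JSON output can all print whatever the helper returns.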

Suggested Milestone

v3.5.0

Priority

Low

Suggested Assignee

@brent.edwards — QA/testing/CLI polish

Subtasks

  • Code: Implement Index health section (text index status, vector index dimensionality/count, graph store connectivity/triple count)
  • Code: Implement Active plan overview (running plans with phase/state, queued plans, resource utilization)
  • Code: Implement Cost summary (session total cost, budget utilization %, per-provider breakdown)
  • Code: Implement Performance summary (average plan duration, tool call latency percentiles, context build time averages)
  • Code: Ensure all sections render in plain, rich, and JSON output modes with graceful degradation
  • Docs: Update CLI documentation for extended agents diagnostics output
  • Behave tests: Add BDD feature file features/cli/diagnostic_dashboard_extended.feature covering all 5 health categories
  • Robot tests: Add Robot Framework integration test for each new diagnostic section with mock data
  • ASV benchmarks: Not applicable (CLI display feature)
  • Quality: coverage ≥97%: Verify via nox -s coverage_report
  • Quality: nox full suite: Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.5.0 milestone 2026-03-05 00:30:14 +00:00
Member

Implementation Notes

Design Decisions

  1. get_container_if_initialized() pattern: The 4 new diagnostic check functions need access to DI container services (plan lifecycle, session, cost, performance). Calling get_container() directly would force container singleton initialization, which in CI's parallel test runner (--processes 16) pollutes global state for other feature files. Solution: Added get_container_if_initialized() at container.py:595 that returns None without triggering initialization. All 4 check functions guard on this, producing graceful WARN/N/A results when the container isn't booted.

  2. Graceful degradation everywhere: Every new check function wraps its logic in try/except Exception and falls through to WARN status with "N/A" detail values when backend services are unavailable. This means agents diagnostics never crashes even if the database, index, or metrics backends are down.

  3. Output format consistency: All 4 new categories emit structured data compatible with plain, rich, and json output modes. The build_diagnostics_data() function at system.py:764-767 aggregates results from all check functions into the unified diagnostics dictionary.
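
The first two design decisions can be sketched together as a single check function: guard on the non-initializing accessor, then wrap the real work in try/except and fall back to WARN with "N/A". Everything below is a stand-in. The accessor stub, the cost_service() and session_total() calls, and the result shape are hypothetical and are not the actual code in system.py.

```python
from typing import Any, Callable, Optional


def get_container_if_initialized() -> Optional[Any]:
    """Stub for the accessor described above; here the container is never booted."""
    return None


def check_cost_summary(
    accessor: Callable[[], Optional[Any]] = get_container_if_initialized,
) -> dict[str, Any]:
    """Hypothetical check that degrades to WARN / "N/A" instead of raising."""
    unavailable = {"status": "WARN", "details": {"session_cost": "N/A"}}
    container = accessor()
    if container is None:
        # Container not booted: report N/A without forcing initialization.
        return unavailable
    try:
        cost = container.cost_service().session_total()  # assumed service API
    except Exception:
        # Backend is down or misbehaving: degrade rather than crash the command.
        return unavailable
    return {"status": "OK", "details": {"session_cost": cost}}


class FakeCostService:
    def session_total(self) -> float:
        return 1.25


class FakeContainer:
    def cost_service(self) -> FakeCostService:
        return FakeCostService()


class BrokenContainer:
    def cost_service(self) -> FakeCostService:
        raise RuntimeError("metrics backend unavailable")


print(check_cost_summary())
print(check_cost_summary(lambda: FakeContainer()))
print(check_cost_summary(lambda: BrokenContainer()))
```

Passing the accessor as a parameter is a choice made for this sketch so the three cases can be exercised without touching global state; the implementation described above calls the module-level function directly.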

Key Code Locations

  • src/cleveragents/application/container.py:595 - get_container_if_initialized() (new)
  • src/cleveragents/cli/commands/system.py:446-520 - _check_index_health()
  • src/cleveragents/cli/commands/system.py:523-600 - _check_active_plans()
  • src/cleveragents/cli/commands/system.py:603-670 - _check_cost_summary()
  • src/cleveragents/cli/commands/system.py:673-743 - _check_performance_summary()
  • src/cleveragents/cli/commands/system.py:764-767 - Updated build_diagnostics_data() aggregation
  • docs/reference/diagnostics_checks.md - Documentation for categories 2-5

Discoveries

  • CI parallel test interference: The original approach using get_container() caused Settings singleton corruption in parallel Behave runs. Feature files like session_list_error.feature would fail with stale global state. The get_container_if_initialized() guard completely resolves this.
  • Container singleton pattern: the module-level _container variable is defined at container.py:580. get_container() (line 583) always creates the container when _container is None, and reset_container() (line 605) sets _container back to None.
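
The singleton arrangement described in these discoveries can be sketched in a few lines. The three function names and the _container variable come from the notes above; the Container class is a placeholder, and the bodies are an assumption about how such a module is typically written, not a copy of container.py.

```python
from typing import Optional


class Container:
    """Placeholder for the real DI container."""


# Module-level singleton; None until something asks for the container.
_container: Optional[Container] = None


def get_container() -> Container:
    """Return the singleton, creating it on first use."""
    global _container
    if _container is None:
        _container = Container()
    return _container


def get_container_if_initialized() -> Optional[Container]:
    """Return the singleton if it already exists; never create it."""
    return _container


def reset_container() -> None:
    """Drop the singleton so the next get_container() call builds a fresh one."""
    global _container
    _container = None


print(get_container_if_initialized())  # nothing has created the container yet
```

Because get_container_if_initialized() only reads the variable, calling it from a diagnostic check cannot create global state that a parallel test process would later observe.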

Test Results

  • Behave: 23 new scenarios in features/diagnostic_dashboard_extended.feature — all passing
  • Robot Framework: 6 integration tests in robot/diagnostic_dashboard_extended.robot — all passing
  • Typecheck: 0 Pyright errors (strict mode)
  • Unit tests: 10,700 scenarios / 0 failures
  • Coverage: 98% (above 97% threshold)
  • Lint: Clean

Files Modified/Created

| File | Change |
|------|--------|
| src/cleveragents/application/container.py | Added get_container_if_initialized() |
| src/cleveragents/cli/commands/system.py | 4 new check functions + updated aggregation |
| docs/reference/diagnostics_checks.md | New documentation for categories 2-5 |
| features/diagnostic_dashboard_extended.feature | 23 BDD scenarios |
| features/steps/diagnostic_dashboard_extended_steps.py | Step defs with mock container helpers |
| robot/diagnostic_dashboard_extended.robot | 6 Robot integration tests |
| robot/helper_diagnostic_dashboard_extended.py | Robot helper script |

PR Reference

PR #791, feature/m6-diagnostic-dashboard-health-categories branch, commit a509725c

freemo modified the milestone from v3.5.0 to v3.3.0 2026-03-13 22:05:24 +00:00
freemo self-assigned this 2026-04-02 06:13:52 +00:00
Reference
cleveragents/cleveragents-core#580