feat(observability): wire AuditService.record() into domain services via EventBus auto-dispatch #581

Closed
opened 2026-03-04 23:44:49 +00:00 by freemo · 3 comments
Owner

Metadata

| Field | Value |
|-------|-------|
| Commit Message | `feat(observability): wire AuditService.record() into domain services via EventBus auto-dispatch` |
| Branch | `feature/m4-audit-service-eventbus-wiring` |

Summary

Wire the existing AuditService.record() method into all domain services so that security-relevant operations are automatically logged to the audit_log table. Currently AuditService has full record()/query() functionality but is NOT automatically invoked by domain services — manual calls would be needed at every audit point. The spec requires automatic dispatch through the EventBus.

Spec Reference

Section: Architecture > Security Model > Audit Logging
Lines: ~43924-43941
Also: Architecture > Observability > Event System (lines ~43655-43774) — EventBus is the dispatch mechanism

Current State

  • audit_service.py exists with record() and query() methods — fully functional for manual use.
  • The audit_log database table exists.
  • No automatic wiring: Domain services (plan_lifecycle_service.py, plan_executor.py, config_service.py, etc.) do NOT call AuditService.record() at their audit points.
  • EventBus (#473) is a prerequisite — once the EventBus exists, AuditService can subscribe to domain events and automatically persist them.

Description

The spec defines these security-relevant events that must be audit-logged:

| Event Type | Trigger | Details Captured |
|---|---|---|
| `plan_applied` | `agents plan apply` | Plan ID, project names, files changed, validation results, user identity |
| `plan_cancelled` | `agents plan cancel` | Plan ID, reason, resources released |
| `resource_modified` | Tool write operations during Execute | Resource ID, modification type, sandbox path, tool name |
| `correction_applied` | `agents plan correct` | Correction attempt ID, original decision ID, mode, guidance |
| `config_changed` | `agents config set` | Key, old value (masked), new value (masked), user identity |
| `entity_deleted` | `agents <entity> delete/remove` | Entity type, entity name, cascade effects |
| `session_created` | `agents session create` | Session ID, actor name, user identity |
| `auth_success` | Server mode login | User identity, IP address, token prefix |
| `auth_failure` | Server mode failed login | Attempted identity, IP address, failure reason |

Architecture: The ReactiveEventBus (from Event System spec) has an emit() method that should:

  1. Push to the RxPY reactive stream for real-time subscribers
  2. Persist to the audit log (via AuditService)
  3. Dispatch to type-specific handlers

The AuditService should subscribe to all security-relevant EventTypes and automatically persist them. This decouples domain services from audit concerns.
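
The subscription model described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `SketchEventBus`, `Event`, `InMemoryAuditService`, `wire_audit`, and the `PLAN_CREATED` member are hypothetical names, and a plain handler registry stands in for the RxPY stream. It only shows that a domain service emits an event and never touches the audit service directly.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class EventType(Enum):
    PLAN_APPLIED = "plan_applied"
    CONFIG_CHANGED = "config_changed"
    PLAN_CREATED = "plan_created"  # hypothetical event that is NOT audited


@dataclass
class Event:
    type: EventType
    details: dict[str, Any] = field(default_factory=dict)


class SketchEventBus:
    """Hypothetical stand-in for ReactiveEventBus (handler dispatch only)."""

    def __init__(self) -> None:
        self._handlers: dict[EventType, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: EventType, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def emit(self, event: Event) -> None:
        # Dispatch to every handler registered for this event type.
        for handler in list(self._handlers[event.type]):
            handler(event)


class InMemoryAuditService:
    """Hypothetical audit service that keeps rows in a list instead of a table."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []

    def record(self, event_type: str, details: dict[str, Any]) -> None:
        self.rows.append({"event_type": event_type, "details": details})


# Assumed subset of the security-relevant event types from the table above.
SECURITY_EVENTS = (EventType.PLAN_APPLIED, EventType.CONFIG_CHANGED)


def wire_audit(bus: SketchEventBus, audit: InMemoryAuditService) -> None:
    # Subscribe the audit service once; domain services only call bus.emit().
    for event_type in SECURITY_EVENTS:
        bus.subscribe(event_type, lambda e: audit.record(e.type.value, e.details))


bus = SketchEventBus()
audit = InMemoryAuditService()
wire_audit(bus, audit)
bus.emit(Event(EventType.PLAN_APPLIED, {"plan_id": "p-1"}))   # audited
bus.emit(Event(EventType.PLAN_CREATED, {"plan_id": "p-2"}))   # ignored by audit
```

Because the audit service is just another subscriber, adding a new audited event type in this sketch means extending `SECURITY_EVENTS` rather than editing each domain service.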

Acceptance Criteria

  • AuditService subscribes to all security-relevant EventTypes via EventBus
  • plan_applied events logged when plan_lifecycle_service applies a plan
  • plan_cancelled events logged when a plan is cancelled
  • resource_modified events logged during tool write operations in Execute
  • correction_applied events logged when agents plan correct is used
  • config_changed events logged when agents config set modifies a value
  • entity_deleted events logged for delete/remove operations
  • session_created events logged when sessions are created
  • All audit log entries include: timestamp, event_type, plan_id (if applicable), user identity, details dict
  • Secret masking applied to audit log details (via shared redaction module)
  • Audit log retention: logs are never automatically deleted
  • Unit tests: verify domain operations trigger audit log entries
  • Integration test: perform plan apply, verify audit_log table has the entry
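
As an illustration of the required entry fields and of the "perform an operation, then check the table" style of test, here is a small sketch using an in-memory SQLite database. The column names, the `record` helper, and the sample values are assumptions made for the example; the real `audit_log` schema and `AuditService.record()` signature are not shown in this issue.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Any, Optional


def record(
    conn: sqlite3.Connection,
    event_type: str,
    user: str,
    details: dict[str, Any],
    plan_id: Optional[str] = None,
) -> None:
    """Insert one audit row carrying the fields named in the criteria."""
    conn.execute(
        "INSERT INTO audit_log (timestamp, event_type, plan_id, user_identity, details) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            event_type,
            plan_id,  # NULL when the event is not tied to a plan
            user,
            json.dumps(details),
        ),
    )


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE audit_log ("
    "id INTEGER PRIMARY KEY, timestamp TEXT NOT NULL, event_type TEXT NOT NULL, "
    "plan_id TEXT, user_identity TEXT NOT NULL, details TEXT NOT NULL)"
)

record(conn, "plan_applied", "alice", {"files_changed": 3}, plan_id="p-1")
record(conn, "session_created", "alice", {"session_id": "s-1"})

rows = conn.execute(
    "SELECT event_type, plan_id, user_identity, details FROM audit_log ORDER BY id"
).fetchall()
```

The second row has no plan ID, which matches the "plan_id (if applicable)" wording in the criteria.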

Related Issues

  • Blocked by: #473 (EventBus implementation), a prerequisite for auto-dispatch
  • Depends on: Event System Domain Event Taxonomy (EventType enum)
  • Uses: existing audit_service.py record/query infrastructure

Suggested Milestone

v3.3.0

Priority

High

Suggested Assignee

@CoreRasurae — Async/validation/domain service wiring

Subtasks

  • Code: Subscribe AuditService to all security-relevant EventTypes via EventBus
  • Code: Wire audit log emission into domain services: plan_applied, plan_cancelled, resource_modified, correction_applied, config_changed, entity_deleted, session_created
  • Code: Apply secret masking (via shared redaction module) to audit log details
  • Code: Ensure audit log entries include: timestamp, event_type, plan_id, user identity, details dict
  • Docs: Document the audit logging auto-dispatch architecture and event-to-audit mapping
  • Behave tests: Add BDD feature file features/observability/audit_service_wiring.feature covering all 9 event types
  • Robot tests: Add Robot Framework integration test: perform plan apply, verify audit_log table has the entry
  • ASV benchmarks: Add ASV benchmark for audit log write throughput (benchmarks/bench_audit_service.py)
  • Quality: coverage ≥97%: Verify via nox -s coverage_report
  • Quality: nox full suite: Run nox (all default sessions), fix any errors
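
For the ASV benchmark subtask, a write-throughput benchmark could look roughly like the sketch below. ASV collects methods whose names start with `time_` and supports `params`/`param_names` for parameterized runs; everything else here (`AuditWriteSuite`, `InMemoryAuditService`, the payload shape) is a hypothetical stand-in rather than the contents of `benchmarks/bench_audit_service.py`.

```python
from typing import Any


class InMemoryAuditService:
    """Hypothetical stand-in so the benchmark does not need a database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, dict[str, Any]]] = []

    def record(self, event_type: str, details: dict[str, Any]) -> None:
        self.rows.append((event_type, details))


class AuditWriteSuite:
    """ASV benchmark class: setup() prepares state, time_* methods are timed."""

    params = [100, 1000]
    param_names = ["n_events"]

    def setup(self, n_events: int) -> None:
        self.audit = InMemoryAuditService()
        # Pre-build payloads so only record() calls are measured.
        self.payloads = [{"plan_id": f"p-{i}"} for i in range(n_events)]

    def time_record(self, n_events: int) -> None:
        for details in self.payloads:
            self.audit.record("plan_applied", details)


# Direct invocation, outside of ASV, to show the suite is plain Python.
suite = AuditWriteSuite()
suite.setup(100)
suite.time_record(100)
```

Swapping the stand-in for a service backed by a real table would measure persistence cost rather than Python call overhead.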

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.3.0 milestone 2026-03-05 00:30:15 +00:00
Author
Owner

PM Note (Day 29) — State Change

Changes:

  • State: Unverified → Verified

Rationale: This issue (AuditService.record() wiring, 5 SP, M3) is well-scoped and the specification clearly defines the required integration points. Verified and ready for @CoreRasurae to begin work. This should be prioritized after #627 (TDD infrastructure).

Member

Implementation Summary

Branch: feature/m4-audit-service-eventbus-wiring
Commit: 592874493263137dc74ec87f9c55f4a89749af00
PR: #659

Changes

  • src/cleveragents/application/services/audit_event_subscriber.py (new): Core AuditEventSubscriber class that subscribes to 9 security-relevant event types (plan_applied, plan_cancelled, resource_modified, correction_applied, config_changed, entity_deleted, session_created, auth_success, auth_failure) and persists them via AuditService.record(). Secret masking is applied via shared/redaction.py's redact_dict to all audit log details before persistence.

  • src/cleveragents/infrastructure/events/types.py: Added 5 new EventType enum members: CORRECTION_APPLIED, CONFIG_CHANGED, ENTITY_DELETED, AUTH_SUCCESS, AUTH_FAILURE.

  • src/cleveragents/application/services/audit_service.py: Added auth_success and auth_failure to VALID_EVENT_TYPES.

  • src/cleveragents/application/services/plan_lifecycle_service.py: Wired to emit PLAN_APPLIED and PLAN_CANCELLED events at the appropriate audit points.

  • src/cleveragents/application/container.py: Registered AuditEventSubscriber as a singleton in the DI container alongside a new AuditService singleton.

  • src/cleveragents/application/services/__init__.py: Exported AuditEventSubscriber.

  • features/observability/audit_service_wiring.feature: 17 BDD scenarios covering all 9 event types, redaction, filtering, field propagation, and error resilience.

  • robot/audit_service_wiring.robot: 4 integration tests.

  • benchmarks/bench_audit_service.py: ASV benchmark for audit log write throughput.
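
The redaction and error-resilience behavior mentioned above might be pictured as in the following sketch. It does not reproduce `redact_dict` from `shared/redaction.py`, whose signature and rules are not shown here; `mask_secrets`, `SENSITIVE_MARKERS`, and `make_audit_handler` are hypothetical, and the sketch assumes "error resilience" means that a failing audit write is logged instead of propagating to the emitter.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("audit_sketch")

# Assumed key fragments that mark a value as sensitive.
SENSITIVE_MARKERS = ("password", "secret", "api_key")
MASK = "***"


def mask_secrets(details: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of details with sensitive values replaced by MASK."""
    masked: dict[str, Any] = {}
    for key, value in details.items():
        if any(marker in key.lower() for marker in SENSITIVE_MARKERS):
            masked[key] = MASK
        elif isinstance(value, dict):
            masked[key] = mask_secrets(value)  # recurse into nested dicts
        else:
            masked[key] = value
    return masked


def make_audit_handler(
    record: Callable[[str, dict[str, Any]], None],
) -> Callable[[str, dict[str, Any]], None]:
    """Wrap a record function so masking always runs and failures are contained."""

    def handle(event_type: str, details: dict[str, Any]) -> None:
        try:
            record(event_type, mask_secrets(details))
        except Exception:
            # Log and continue so the domain operation is not interrupted.
            logger.exception("audit persistence failed for %s", event_type)

    return handle


stored: list[tuple[str, dict[str, Any]]] = []
handler = make_audit_handler(lambda t, d: stored.append((t, d)))
handler("config_changed", {"user": "alice", "api_key": "abc123",
                           "nested": {"password": "x", "path": "/tmp"}})


def failing_record(event_type: str, details: dict[str, Any]) -> None:
    raise RuntimeError("database unavailable")


# Does not raise: the failure is logged by the wrapper.
make_audit_handler(failing_record)("plan_applied", {"plan_id": "p-1"})
```

Masking before the call to `record` means unredacted values never reach the persistence layer, even if that layer later fails.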

Quality Gates

| Gate | Result |
|------|--------|
| `nox -s lint` | PASS |
| `nox -s typecheck` | PASS (0 errors, 0 warnings) |
| `nox -s unit_tests` | PASS (350 features, 9715 scenarios, 37386 steps) |
| `nox -s integration_tests` | PASS (1346 tests) |
| `nox -s coverage_report` | PASS (99% >= 97% threshold) |
Author
Owner

PM Acknowledgment (Day 31):

Thank you @CoreRasurae. PR #659 submitted.

Status: PR #659 has a merge conflict and 4 comments (all self-comments by the author). No external reviews yet.

Action needed: Rebase PR #659 and request review. Note: Related spec issues documented in #678 (which has been triaged and assigned to you).

Priority: After TDD infra (#627, #629) is complete, this is next in your queue.

Reference
cleveragents/cleveragents-core#581