Specification issues during the development of issue #581 #678

Closed
opened 2026-03-10 21:53:37 +00:00 by CoreRasurae · 2 comments
Member

Architectural issues that may require architecture enhanchement:

H1 (High) — Only 2 of 9 subscribed event types have emission points
Finding: AuditEventSubscriber subscribes to 9 EventType values, but only PLAN_APPLIED and PLAN_CANCELLED are actually emitted anywhere in the codebase. The remaining 7 (RESOURCE_MODIFIED, CORRECTION_APPLIED, CONFIG_CHANGED, ENTITY_DELETED, SESSION_CREATED, AUTH_SUCCESS, AUTH_FAILURE) have no event_bus.emit() call site.

Why not fixable here: Adding emission points for these event types requires modifying domain services that are outside the audit wiring scope — e.g. SessionService, ConfigService, authentication middleware, resource management services. Each would need its own event_bus dependency injected and domain-appropriate event payloads designed. This PR's charter (issue #581) is to wire the subscriber and the two PlanLifecycleService events; the remaining emitters belong in separate PRs per service.

Suggested follow-up: Open dedicated issues for each domain service to emit its security-relevant events (e.g. "Wire SessionService to emit SESSION_CREATED via EventBus").


M1 (Medium) — user_identity not propagated to audit entries
Finding: AuditService.record() accepts a user_identity parameter, but AuditEventSubscriber always passes None because DomainEvent has no user_identity field.

Why not fixable here: DomainEvent is a frozen Pydantic model defined in src/cleveragents/infrastructure/events/models.py. Adding a user_identity field is an infrastructure-level schema change that would affect every event emitter and consumer across the codebase. The subscriber can only forward what exists on the event — it cannot fabricate identity data. Resolving this requires:

Adding user_identity (or an equivalent auth-context field) to DomainEvent
Ensuring all emit sites populate it from the current authentication context
Updating the subscriber to forward it
This is a cross-cutting infrastructure concern, not an audit wiring issue.

Suggested follow-up: An infrastructure issue to extend DomainEvent with auth-context fields and propagate identity through the event pipeline.


M4 (Medium) — plan_applied event details are incomplete
Finding: The PLAN_APPLIED event only includes action_name, phase, and project_names in its details. The specification's SEC7 audit logging section suggests richer detail (e.g. changeset statistics, number of files affected).

Why not fixable here: At the point where PLAN_APPLIED is emitted in PlanLifecycleService._apply_plan() (around line 1160), the method has just transitioned the plan's phase. The changeset data, diff statistics, and file counts are computed later in the apply pipeline by PlanApplyService. The lifecycle service does not have access to this data at emit time, and restructuring the apply pipeline to emit the event later would change the semantics of when "plan applied" is recorded.

Suggested follow-up: Either emit a supplementary PLAN_APPLY_COMPLETED event from PlanApplyService after changeset application with full statistics, or restructure the pipeline so the lifecycle service receives apply results before emitting.


M5 (Medium) — plan_cancelled event details are incomplete
Finding: The PLAN_CANCELLED event only includes reason and project_names. Richer context (e.g. plan progress at cancellation, resources released) would improve the audit trail.

Why not fixable here: Similar to M4 — at the cancellation point in PlanLifecycleService.cancel_plan() (around line 1252), the service only has the plan model and the caller-provided reason. Resource cleanup and sandbox teardown happen in separate services after cancellation. The lifecycle service does not orchestrate or observe those downstream effects, so it cannot report on them.

Suggested follow-up: Downstream services (sandbox manager, resource cleanup) could emit their own events, or a cancellation orchestrator could aggregate results into a richer audit entry.


M6 (Medium) — Synchronous SQLite commit blocks the event pipeline
Finding: AuditService.record() performs a synchronous SQLite INSERT + COMMIT. Since ReactiveEventBus.emit() dispatches handlers synchronously in the calling thread, every audit write blocks the domain operation that emitted the event.

Why not fixable here: Addressing this requires changing the AuditService API to be asynchronous (e.g. async def record()) or introducing a write-behind queue. Both are API-breaking changes that affect all existing AuditService callers and the service's threading model. The ReactiveEventBus itself would also need to support async handlers. This is an architectural decision beyond the scope of wiring the subscriber.

Suggested follow-up: An architecture issue to evaluate async audit recording or a write-behind buffer pattern for AuditService.


L2 (Low) — DomainEvent.timestamp not forwarded to AuditService.record()
Finding: DomainEvent has a timestamp field (auto-set to datetime.now(UTC) on creation), but AuditService.record() does not accept a timestamp parameter — it generates its own timestamp internally. The event's original timestamp is therefore lost.

Why not fixable here: AuditService.record() is defined in src/cleveragents/application/services/audit_service.py and its signature does not include a timestamp parameter. Adding one would change the service's public API, affecting all direct callers. The subscriber can only pass what the API accepts. Under normal operation the difference between event creation time and audit recording time is negligible (sub-millisecond), so the practical impact is minimal.

Suggested follow-up: A low-priority enhancement to add an optional timestamp parameter to AuditService.record() so callers can preserve the original event timestamp.

Architectural issues that may require architecture enhanchement: H1 (High) — Only 2 of 9 subscribed event types have emission points Finding: AuditEventSubscriber subscribes to 9 EventType values, but only PLAN_APPLIED and PLAN_CANCELLED are actually emitted anywhere in the codebase. The remaining 7 (RESOURCE_MODIFIED, CORRECTION_APPLIED, CONFIG_CHANGED, ENTITY_DELETED, SESSION_CREATED, AUTH_SUCCESS, AUTH_FAILURE) have no event_bus.emit() call site. Why not fixable here: Adding emission points for these event types requires modifying domain services that are outside the audit wiring scope — e.g. SessionService, ConfigService, authentication middleware, resource management services. Each would need its own event_bus dependency injected and domain-appropriate event payloads designed. This PR's charter (issue #581) is to wire the subscriber and the two PlanLifecycleService events; the remaining emitters belong in separate PRs per service. Suggested follow-up: Open dedicated issues for each domain service to emit its security-relevant events (e.g. "Wire SessionService to emit SESSION_CREATED via EventBus"). ---- M1 (Medium) — user_identity not propagated to audit entries Finding: AuditService.record() accepts a user_identity parameter, but AuditEventSubscriber always passes None because DomainEvent has no user_identity field. Why not fixable here: DomainEvent is a frozen Pydantic model defined in src/cleveragents/infrastructure/events/models.py. Adding a user_identity field is an infrastructure-level schema change that would affect every event emitter and consumer across the codebase. The subscriber can only forward what exists on the event — it cannot fabricate identity data. Resolving this requires: Adding user_identity (or an equivalent auth-context field) to DomainEvent Ensuring all emit sites populate it from the current authentication context Updating the subscriber to forward it This is a cross-cutting infrastructure concern, not an audit wiring issue. Suggested follow-up: An infrastructure issue to extend DomainEvent with auth-context fields and propagate identity through the event pipeline. --- M4 (Medium) — plan_applied event details are incomplete Finding: The PLAN_APPLIED event only includes action_name, phase, and project_names in its details. The specification's SEC7 audit logging section suggests richer detail (e.g. changeset statistics, number of files affected). Why not fixable here: At the point where PLAN_APPLIED is emitted in PlanLifecycleService._apply_plan() (around line 1160), the method has just transitioned the plan's phase. The changeset data, diff statistics, and file counts are computed later in the apply pipeline by PlanApplyService. The lifecycle service does not have access to this data at emit time, and restructuring the apply pipeline to emit the event later would change the semantics of when "plan applied" is recorded. Suggested follow-up: Either emit a supplementary PLAN_APPLY_COMPLETED event from PlanApplyService after changeset application with full statistics, or restructure the pipeline so the lifecycle service receives apply results before emitting. ---- M5 (Medium) — plan_cancelled event details are incomplete Finding: The PLAN_CANCELLED event only includes reason and project_names. Richer context (e.g. plan progress at cancellation, resources released) would improve the audit trail. Why not fixable here: Similar to M4 — at the cancellation point in PlanLifecycleService.cancel_plan() (around line 1252), the service only has the plan model and the caller-provided reason. Resource cleanup and sandbox teardown happen in separate services after cancellation. The lifecycle service does not orchestrate or observe those downstream effects, so it cannot report on them. Suggested follow-up: Downstream services (sandbox manager, resource cleanup) could emit their own events, or a cancellation orchestrator could aggregate results into a richer audit entry. ---- M6 (Medium) — Synchronous SQLite commit blocks the event pipeline Finding: AuditService.record() performs a synchronous SQLite INSERT + COMMIT. Since ReactiveEventBus.emit() dispatches handlers synchronously in the calling thread, every audit write blocks the domain operation that emitted the event. Why not fixable here: Addressing this requires changing the AuditService API to be asynchronous (e.g. async def record()) or introducing a write-behind queue. Both are API-breaking changes that affect all existing AuditService callers and the service's threading model. The ReactiveEventBus itself would also need to support async handlers. This is an architectural decision beyond the scope of wiring the subscriber. Suggested follow-up: An architecture issue to evaluate async audit recording or a write-behind buffer pattern for AuditService. ---- L2 (Low) — DomainEvent.timestamp not forwarded to AuditService.record() Finding: DomainEvent has a timestamp field (auto-set to datetime.now(UTC) on creation), but AuditService.record() does not accept a timestamp parameter — it generates its own timestamp internally. The event's original timestamp is therefore lost. Why not fixable here: AuditService.record() is defined in src/cleveragents/application/services/audit_service.py and its signature does not include a timestamp parameter. Adding one would change the service's public API, affecting all direct callers. The subscriber can only pass what the API accepts. Under normal operation the difference between event creation time and audit recording time is negligible (sub-millisecond), so the practical impact is minimal. Suggested follow-up: A low-priority enhancement to add an optional timestamp parameter to AuditService.record() so callers can preserve the original event timestamp.
freemo added this to the v3.5.0 milestone 2026-03-11 05:35:57 +00:00
Owner

PM Triage (Day 31):

  • Labels: Type/Task, Priority/Medium, MoSCoW/Should have, State/Unverified, Points/5
  • Milestone: v3.5.0
  • Assignee: @CoreRasurae (author, discovered during #581 development)

Assessment: Architectural issues discovered during AuditService implementation (#581). The finding that only 2 of 9 subscribed event types have emission points is a legitimate spec gap. This should be addressed after the #581 PR (#659) merges.

Removed enhancement and Needs feedback labels (not in org label set per CONTRIBUTING.md). Upgraded from Priority/Backlog to Priority/Medium — these are real architectural gaps that affect event system completeness.

**PM Triage (Day 31)**: - **Labels**: `Type/Task`, `Priority/Medium`, `MoSCoW/Should have`, `State/Unverified`, `Points/5` - **Milestone**: v3.5.0 - **Assignee**: @CoreRasurae (author, discovered during #581 development) **Assessment**: Architectural issues discovered during AuditService implementation (#581). The finding that only 2 of 9 subscribed event types have emission points is a legitimate spec gap. This should be addressed after the #581 PR (#659) merges. Removed `enhancement` and `Needs feedback` labels (not in org label set per CONTRIBUTING.md). Upgraded from `Priority/Backlog` to `Priority/Medium` — these are real architectural gaps that affect event system completeness.
Author
Member

Follow-up issues created

All 6 architectural findings from #581 have been decomposed into dedicated follow-up issues:

Finding Severity Follow-up Issue
H1: Only 2 of 9 subscribed event types have emission points High #714
M1: user_identity not propagated to audit entries Medium #715
M4: plan_applied event details are incomplete Medium #716
M5: plan_cancelled event details are incomplete Medium #717
M6: Synchronous SQLite commit blocks event pipeline Medium #718
L2: DomainEvent.timestamp not forwarded to AuditService.record() Low #719

All issues follow the CONTRIBUTING.md template with Metadata, Subtasks, and Definition of Done sections. They reference #678 as their parent issue and #581 as the related AuditService wiring issue.

These issues are created as State/Unverified and unassigned, awaiting PM triage for milestone and assignee decisions.

## Follow-up issues created All 6 architectural findings from #581 have been decomposed into dedicated follow-up issues: | Finding | Severity | Follow-up Issue | |---------|----------|----------------| | **H1**: Only 2 of 9 subscribed event types have emission points | High | #714 | | **M1**: user_identity not propagated to audit entries | Medium | #715 | | **M4**: plan_applied event details are incomplete | Medium | #716 | | **M5**: plan_cancelled event details are incomplete | Medium | #717 | | **M6**: Synchronous SQLite commit blocks event pipeline | Medium | #718 | | **L2**: DomainEvent.timestamp not forwarded to AuditService.record() | Low | #719 | All issues follow the CONTRIBUTING.md template with Metadata, Subtasks, and Definition of Done sections. They reference #678 as their parent issue and #581 as the related AuditService wiring issue. These issues are created as `State/Unverified` and unassigned, awaiting PM triage for milestone and assignee decisions.
CoreRasurae 2026-03-12 01:56:25 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#678
No description provided.