test(a2a): integration tests for full A2A session and plan lifecycle #10032

Closed
opened 2026-04-16 13:39:11 +00:00 by HAL9000 · 1 comment
Owner

Background: Individual unit and BDD tests exist for session lifecycle and plan lifecycle operations in isolation, but there are no end-to-end integration tests that exercise the complete flow: creating a session, creating a plan within that session, executing the plan, monitoring events, and closing/deleting the session. These integration tests are required to validate that all A2A facade components (session management, plan lifecycle, event queue, guard enforcement) work correctly together as a system before v3.5.0 ships.

Acceptance criteria:

  • Integration tests cover the full happy-path flow: session create → plan create → plan execute → plan status → plan cancel/rollback → session close → session delete
  • Integration tests verify that plan execution events are published to the event queue and received by subscribers during the lifecycle
  • Integration tests verify that guard enforcement (denylist, budget cap, tool call limit) is applied correctly during plan execution within a session
  • Integration tests use real service instances (no mocks) and run against the full DI-wired application stack
  • All integration tests pass under nox with coverage ≥ 97%
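
The happy-path flow and the event-delivery criterion above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `EventQueue`, `A2AFacade`, every method name, the `"plan"` topic, and the event kinds are hypothetical stand-ins, since the real facade and event queue APIs are not shown in this issue.

```python
from collections import defaultdict
from typing import Callable


class EventQueue:
    """Hypothetical in-process pub/sub queue (stand-in for the real event queue)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver synchronously to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(event)


class A2AFacade:
    """Hypothetical facade; the method names only mirror the steps in the criteria."""

    def __init__(self, events: EventQueue) -> None:
        self._events = events
        self.sessions: dict[str, str] = {}  # session_id -> state
        self.plans: dict[str, str] = {}     # plan_id -> state

    def _emit(self, kind: str, plan_id: str) -> None:
        self._events.publish("plan", {"kind": kind, "plan_id": plan_id})

    def session_create(self, session_id: str) -> None:
        self.sessions[session_id] = "open"

    def plan_create(self, session_id: str, plan_id: str) -> None:
        if self.sessions.get(session_id) != "open":
            raise RuntimeError("session is not open")
        self.plans[plan_id] = "created"
        self._emit("plan.created", plan_id)

    def plan_execute(self, plan_id: str) -> None:
        self.plans[plan_id] = "running"
        self._emit("plan.started", plan_id)

    def plan_status(self, plan_id: str) -> str:
        return self.plans[plan_id]

    def plan_cancel(self, plan_id: str) -> None:
        self.plans[plan_id] = "cancelled"
        self._emit("plan.cancelled", plan_id)

    def session_close(self, session_id: str) -> None:
        self.sessions[session_id] = "closed"

    def session_delete(self, session_id: str) -> None:
        del self.sessions[session_id]


def run_happy_path():
    """Walk the full flow and return what a test would assert on."""
    events = EventQueue()
    received: list[dict] = []
    events.subscribe("plan", received.append)  # subscriber callback

    facade = A2AFacade(events)
    facade.session_create("s1")
    facade.plan_create("s1", "p1")
    facade.plan_execute("p1")
    status = facade.plan_status("p1")
    facade.plan_cancel("p1")
    facade.session_close("s1")
    facade.session_delete("s1")
    return status, [event["kind"] for event in received], facade.sessions
```

An integration test built on this shape would assert on the returned status, on the ordered list of event kinds the subscriber saw, and on the session store being empty after deletion. In the real suite the facade and queue would come from the DI container rather than being constructed by hand.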

Metadata

  • Commit Message: test(a2a): add integration tests for full A2A session and plan lifecycle
  • Branch: test/a2a-session-plan-lifecycle-integration

Subtasks

  • Set up integration test fixture that bootstraps the full DI container with real A2A facade, event queue, and guard services
  • Write integration test: session create → list → show → close → delete (full session lifecycle)
  • Write integration test: plan create → execute → status → cancel (full plan lifecycle within a session)
  • Write integration test: plan execute → event queue receives published events → subscriber callback invoked
  • Write integration test: plan execute with denylist violation → GuardViolationError raised, plan halted
  • Write integration test: plan execute with budget cap exceeded → execution halted at cap
  • Write integration test: plan execute with tool call limit exceeded → execution halted at limit
  • Write integration test: plan rollback after failed execution → session state consistent
  • Verify all integration tests pass via nox -s integration (or equivalent session)
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
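
The three guard subtasks above share one pattern: check each step before it runs, halt the plan on the first violation, and surface the error. The sketch below illustrates that pattern under assumptions. Only the name `GuardViolationError` comes from this issue; its constructor, the `Guard` and `Plan` classes, the `execute` function, and the idea that each step is a `(tool, cost)` pair are all hypothetical.

```python
from dataclasses import dataclass, field


class GuardViolationError(Exception):
    """Name taken from the subtask list; the real class is not shown in the issue."""


@dataclass
class Guard:
    """Hypothetical guard combining denylist, budget cap, and tool call limit."""

    denylist: frozenset
    budget_cap: float
    tool_call_limit: int
    spent: float = 0.0
    calls: int = 0

    def check(self, tool: str, cost: float) -> None:
        # Reject the step before any counters are updated.
        if tool in self.denylist:
            raise GuardViolationError(f"tool {tool!r} is denylisted")
        if self.calls + 1 > self.tool_call_limit:
            raise GuardViolationError("tool call limit exceeded")
        if self.spent + cost > self.budget_cap:
            raise GuardViolationError("budget cap exceeded")
        self.calls += 1
        self.spent += cost


@dataclass
class Plan:
    """Hypothetical plan: an ordered list of (tool, cost) steps."""

    steps: list
    state: str = "created"
    completed: list = field(default_factory=list)


def execute(plan: Plan, guard: Guard) -> None:
    """Run steps in order; on a violation mark the plan halted and re-raise."""
    plan.state = "running"
    for tool, cost in plan.steps:
        try:
            guard.check(tool, cost)
        except GuardViolationError:
            plan.state = "halted"
            raise
        plan.completed.append(tool)
    plan.state = "completed"
```

Under these assumptions, a denylist test would execute a plan containing a denylisted tool, expect `GuardViolationError`, and then assert that `plan.state` is `"halted"` and that only the steps before the violation appear in `plan.completed`. The budget and call-limit tests follow the same shape with different guard settings.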

Definition of Done

  • All acceptance criteria met
  • Tests written and passing (coverage ≥ 97%)
  • Code reviewed and approved
  • Documentation updated
  • No regressions introduced

Parent Epic

Child of and blocks #8423 — A2A Protocol Integration (v3.5.0)


Automated by CleverAgents Bot
Supervisor: Epic Planning | Agent: epic-planning-pool-supervisor
Worker: [AUTO-EPIC-3]

Author
Owner

Implementation Attempt — Tier 1: Haiku — Success

Implemented integration tests for the full A2A session and plan lifecycle.

What was done:

  • Created robot/a2a_session_plan_lifecycle_integration.robot with 7 integration test cases
  • Created robot/helper_a2a_session_plan_lifecycle_integration.py with real service instances (no mocks)
  • Tests cover: full session lifecycle, full plan lifecycle, event queue publish/subscribe, guard enforcement (budget cap, tool call limit, denylist), and plan rollback with session state consistency
  • All tests use in-memory SQLite with real DI-wired services
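
As a rough illustration of the in-memory SQLite setup and the rollback consistency check mentioned above, the sketch below uses only the standard `sqlite3` module. The table layout, the state names, and the helper functions `make_db`, `seed`, `run_plan`, and `plan_state` are assumptions made for the example; the actual schema and helper code live in the PR and are not reproduced here.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a fresh in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE plans ("
        "id TEXT PRIMARY KEY, "
        "session_id TEXT NOT NULL REFERENCES sessions(id), "
        "state TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def seed(conn: sqlite3.Connection) -> None:
    """Insert one open session with one newly created plan."""
    with conn:  # commits on success
        conn.execute("INSERT INTO sessions VALUES ('s1', 'open')")
        conn.execute("INSERT INTO plans VALUES ('p1', 's1', 'created')")


def run_plan(conn: sqlite3.Connection, plan_id: str, fail: bool) -> bool:
    """Apply the plan's state changes in one transaction; roll back on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE plans SET state = 'running' WHERE id = ?", (plan_id,))
            if fail:
                raise RuntimeError("simulated step failure")
            conn.execute("UPDATE plans SET state = 'completed' WHERE id = ?", (plan_id,))
    except RuntimeError:
        return False
    return True


def plan_state(conn: sqlite3.Connection, plan_id: str) -> str:
    row = conn.execute("SELECT state FROM plans WHERE id = ?", (plan_id,)).fetchone()
    return row[0]
```

Because every test gets its own `:memory:` connection, no state leaks between cases. After a failed `run_plan`, the plan row should still read `created` and the session row should still read `open`, which is one way to express "session state consistent" after rollback.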

Quality gate status:

  • lint ✓
  • typecheck ✓
  • integration_tests ✓ (7/7 tests pass)

PR created: #10760 (https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/10760)


Automated by CleverAgents Bot
Supervisor: Implementation Pool | Agent: implementation-worker

Reference
cleveragents/cleveragents-core#10032