test(e2e): add M2 actor + tool source smoke suite #169

Closed
opened 2026-02-22 23:39:47 +00:00 by freemo · 3 comments
Owner

Metadata

  • Commit Message: test(e2e): add M2 actor + tool source smoke suite
  • Branch: feature/m2-actor-tool-smoke

Background

E2E fixtures cover hierarchical actor YAML, an MCP stub server, and skill packs. Helper steps manage the MCP stub server lifecycle (start/stop) and clean up temp directories after test runs.

Acceptance Criteria

  • Add fixtures for hierarchical actor YAML, MCP stub server, and skill packs.
  • Add helper steps for starting/stopping the MCP stub server and cleaning temp directories.
  • Update docs/development/testing.md with M2 smoke suite usage and MCP stub bootstrap steps.
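
A minimal sketch of the start/stop and temp-directory cleanup pattern described above, not the project's actual helper steps. The name `stub_server` and the idea of launching the stub as a subprocess are assumptions made for illustration; the real steps may manage the stub differently.

```python
import contextlib
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Iterator, Optional, Sequence


@contextlib.contextmanager
def stub_server(command: Sequence[str]) -> Iterator[tuple[subprocess.Popen, Path]]:
    """Hypothetical helper: run a stub process in a scratch directory.

    The process is stopped and the directory removed on exit, even if the
    body of the ``with`` block raises.
    """
    workdir = Path(tempfile.mkdtemp(prefix="m2_smoke_"))
    proc: Optional[subprocess.Popen] = None
    try:
        proc = subprocess.Popen(list(command), cwd=workdir)
        yield proc, workdir
    finally:
        if proc is not None:
            proc.terminate()
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                # Escalate if the process ignores the polite request.
                proc.kill()
                proc.wait()
        shutil.rmtree(workdir, ignore_errors=True)
```

In a Behave suite, a helper like this would typically be entered in a `Given` step or a `before_scenario` hook and exited in `after_scenario`, so a failing scenario cannot leave the stub running.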

Definition of Done

This issue is complete when:

  • All subtasks below are completed and checked off.
  • A Git commit is created where the first line of the commit message matches
    the Commit Message in Metadata exactly, followed by a blank line, then
    additional lines providing relevant details about the implementation. The
    commit body should be appropriate in size for a commit message and relatively
    complete in describing what was done.
  • The commit is pushed to the remote on the branch matching the Branch in
    Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and
    merged before this issue is marked done.
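
The commit-message rule above (exact subject line, blank line, then a body) can be checked mechanically. The sketch below is illustrative only; `check_commit_message` is a hypothetical name and not part of the repository's tooling.

```python
def check_commit_message(message: str, expected_subject: str) -> list[str]:
    """Return a list of problems; an empty list means the message conforms."""
    problems: list[str] = []
    lines = message.splitlines()
    if not lines or lines[0] != expected_subject:
        problems.append("first line does not match the Commit Message exactly")
    if len(lines) < 2 or lines[1] != "":
        problems.append("second line must be blank")
    if not any(line.strip() for line in lines[2:]):
        problems.append("body is missing")
    return problems
```

The message text could be obtained with `git log -1 --format=%B` and passed in together with the Commit Message from Metadata.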

Subtasks

  • Add fixtures for hierarchical actor YAML, MCP stub server, and skill packs.
  • Add helper steps for starting/stopping the MCP stub server and cleaning temp directories.
  • Update docs/development/testing.md with M2 smoke suite usage and MCP stub bootstrap steps.
  • Tests (Behave): Add features/m2_actor_tool_smoke.feature covering actor compile, MCP tool invocation, and skill registry.
  • Tests (Robot): Add Robot CLI smoke suite for actor compile + plan execute.
  • Tests (ASV): Add benchmarks/m2_actor_tool_smoke_bench.py for baseline runtime.
  • Verify coverage >=97% via nox -s coverage_report. If coverage is <97%, review the current unit test coverage report at build/coverage.xml and use it to write new Behave-based unit tests. Specifically, write descriptively named Behave-style unit tests that target the uncovered lines in whichever file has the most uncovered lines. Then rerun nox -s coverage_report to verify that all tests pass and coverage is >=97%. Mark this complete only once coverage is >=97%; otherwise repeat this task as many times as needed.
  • Run nox (all default sessions, including benchmark) and fix any errors so that nox passes across the entire code base. Do not ignore any failure, even if it seems unrelated to this commit; fix it.
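
To pick the file with the most uncovered lines, as the coverage subtask asks, the report can be ranked with a short script. This sketch assumes build/coverage.xml is the Cobertura-style XML that coverage.py emits (`<class filename=...>` elements containing `<line hits=...>` entries); `uncovered_by_file` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def uncovered_by_file(xml_text: str) -> list[tuple[str, int]]:
    """Rank files by number of lines with zero hits, most uncovered first."""
    root = ET.fromstring(xml_text)
    missing: Counter[str] = Counter()
    for cls in root.iter("class"):
        filename = cls.get("filename", "")
        # Only direct <lines>/<line> children, so per-method line lists
        # (if present) are not counted twice.
        for line in cls.findall("./lines/line"):
            if line.get("hits") == "0":
                missing[filename] += 1
    return missing.most_common()
```

Usage would look like `uncovered_by_file(Path("build/coverage.xml").read_text())[0]`, which yields the file to target first along with its uncovered-line count.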

Section: #### M2: Actor Graphs + Tool Sources (Day 14)
Status: In Review

freemo added this to the v3.1.0 milestone 2026-02-22 23:39:47 +00:00
Author
Owner

Expected completion updated (Day 15 rebaseline): Day 22 / 2026-03-02 (previously Day 18 / 2026-02-26)

freemo added the due date 2026-02-25 2026-02-23 18:41:26 +00:00
Member

Implementation Complete

Branch: feature/m2-actor-tool-smoke
Commit: test(e2e): add M2 actor + tool source smoke suite

Files Added or Updated (9)

| File | Purpose |
|------|---------|
| features/m2_actor_tool_smoke.feature | 10 Behave BDD scenarios |
| features/steps/m2_actor_tool_smoke_steps.py | Step definitions (all M2-prefixed) |
| features/mocks/mcp_stub_server.py | MCP stub server with 3 tools |
| features/fixtures/m2/actors/m2_hierarchical_actor.yaml | Graph-type actor (3 nodes) |
| features/fixtures/m2/m2_skill_pack.yaml | Skill pack (2 tool refs + 1 inline) |
| robot/m2_actor_tool_smoke.robot | 6 Robot Framework integration tests |
| robot/helper_m2_actor_tool_smoke.py | Robot CLI helper script |
| benchmarks/m2_actor_tool_smoke_bench.py | 12 ASV benchmarks |
| docs/development/testing.md | Updated with M2 smoke suite section |

Quality Gates

  • lint: PASS
  • typecheck: PASS (0 errors)
  • unit_tests: PASS (198 features, 0 failures)
  • integration_tests: PASS
  • coverage_report: PASS (97.2% >= 97% threshold)
  • benchmark: PASS (789 benchmarks)

Test Coverage

  • Behave: 10 scenarios, 53 steps — actor YAML loading, ActorLoader discovery, skill pack loading, skill registry+resolution, skill context invocation, tool lifecycle (discover/activate/execute/deactivate), tool registry, MCP stub discovery+invocation, MCP lifecycle guards
  • Robot: 6 tests — actor fixture load, actor discovery, skill fixture load, skill registry, tool lifecycle, MCP stub
  • ASV: 12 benchmarks — actor loading (3), skill registry (3), tool lifecycle (3), MCP stub (3+setup/teardown)
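
For readers unfamiliar with the setup/teardown shape mentioned for the ASV benchmarks, here is a generic sketch of an ASV-style suite. It is not taken from benchmarks/m2_actor_tool_smoke_bench.py; the class name, the fixture contents, and the timed operation are all assumptions. ASV calls `setup` before and `teardown` after each benchmark, and times methods whose names start with `time_`.

```python
import shutil
import tempfile
from pathlib import Path


class M2FixtureReadSuite:
    """Hypothetical ASV-style suite that times reading a small YAML fixture."""

    def setup(self) -> None:
        # Create an isolated scratch directory with a throwaway fixture.
        self.tmpdir = Path(tempfile.mkdtemp(prefix="m2_bench_"))
        self.fixture = self.tmpdir / "actor.yaml"
        self.fixture.write_text("name: demo\ntype: graph\n", encoding="utf-8")

    def teardown(self) -> None:
        # Remove the scratch directory so runs do not leak temp files.
        shutil.rmtree(self.tmpdir, ignore_errors=True)

    def time_read_fixture(self) -> None:
        self.fixture.read_text(encoding="utf-8")
```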
Member

Implementation Notes — PR #421 Code Review Feedback

Addressed all three required changes and both advisory items from Jeff's review on PR #421.

Required Changes

  1. CHANGELOG.md — Added entry under ## Unreleased describing the M2 E2E test suite.
  2. CONTRIBUTORS.md — Added Brent E. Edwards <brent.edwards@cleverthis.com> to contributors list and details section.
  3. docs/development/testing.md — Corrected M2 Behave scenario count from 11 to 10 to match the actual features/m2_actor_tool_smoke.feature file.

Advisory Items (also addressed)

  1. benchmarks/m2_actor_tool_smoke_bench.py — Added standard explanatory comments to sys.path / importlib.reload() blocks, matching the project-wide ASV benchmark convention (same pattern in action_cli_bench.py, action_schema_bench.py, etc.).
  2. features/steps/m2_actor_tool_smoke_steps.py — Replaced untyped lambda **kwargs: {"ok": True} with a properly typed _m2_noop_handler(**kwargs: Any) -> dict[str, bool] function.

Quality Gates

All nox stages pass:

  • lint: PASS
  • typecheck: 0 errors
  • format: 750 files unchanged
  • unit_tests: 280 features / 5846 scenarios / 0 failures
  • integration_tests: 682 tests / 0 failures
  • coverage_report: 97.2% (threshold: 97%)

Commit: 9bf87a2 on feature/m2-actor-tool-smoke.

Reference
cleveragents/cleveragents-core#169