test(e2e): add M3 decision + validation suites #437

2026-02-25T21:27:29Z

brent.edwards commented

2026-02-25 21:27:29 +00:00

Summary

Closes #179

Added features/m3_decision_validation_smoke.feature (25 scenarios) covering decision tree fixtures, invariant CLI, validation CLI, plan correct (dry-run/revert/append), and negative cases
Added features/steps/m3_decision_validation_smoke_steps.py with step implementations
Added fixtures in features/fixtures/m3/ (decision trees, validation attachments, invariant configs)
Added robot/m3_decision_validation_smoke.robot (8 integration tests) + helper
Added benchmarks/m3_smoke_bench.py (5 ASV suites)
Updated docs/development/testing.md with M3 suite documentation

Local checks: lint passed, typecheck passed (0 errors)

## Summary Closes #179 - Added `features/m3_decision_validation_smoke.feature` (25 scenarios) covering decision tree fixtures, invariant CLI, validation CLI, plan correct (dry-run/revert/append), and negative cases - Added `features/steps/m3_decision_validation_smoke_steps.py` with step implementations - Added fixtures in `features/fixtures/m3/` (decision trees, validation attachments, invariant configs) - Added `robot/m3_decision_validation_smoke.robot` (8 integration tests) + helper - Added `benchmarks/m3_smoke_bench.py` (5 ASV suites) - Updated `docs/development/testing.md` with M3 suite documentation **Local checks**: lint passed, typecheck passed (0 errors)

brent.edwards added 1 commit 2026-02-25 21:27:29 +00:00

test(e2e): add M3 decision + validation suites

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 16s

Details

CI / quality (pull_request) Successful in 19s

Details

CI / build (pull_request) Successful in 20s

Details

CI / security (pull_request) Successful in 55s

Details

CI / typecheck (pull_request) Successful in 1m5s

Details

CI / integration_tests (pull_request) Successful in 4m4s

Details

CI / benchmark-regression (pull_request) Successful in 22m21s

Details

CI / unit_tests (pull_request) Failing after 34m7s

Details

CI / docker (pull_request) Has been skipped

Details

CI / coverage (pull_request) Has been cancelled

Details

ace7311de4

brent.edwards added 1 commit 2026-02-25 22:09:35 +00:00

fix(test): correct dry-run assertion in M3 decision validation smoke

CI / lint (pull_request) Successful in 24s

Details

CI / typecheck (pull_request) Successful in 58s

Details

CI / security (pull_request) Successful in 56s

Details

CI / quality (pull_request) Successful in 42s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 25s

Details

CI / integration_tests (pull_request) Successful in 5m7s

Details

CI / unit_tests (pull_request) Successful in 25m18s

Details

CI / benchmark-regression (pull_request) Successful in 26m14s

Details

CI / docker (pull_request) Successful in 16s

Details

CI / coverage (pull_request) Successful in 1h40m20s

Details

22986aaaf9

The CLI plain format output does not include the literal text "Dry Run"
(that string only appears in the Rich panel title). Assert against
"risk_level" which is present in the dry-run impact output instead.

brent.edwards referenced this pull request

2026-02-25 22:14:58 +00:00

test: consolidated Brent QA batch — issues #179, #180, #404, #405, #187 #442

brent.edwards commented

2026-02-25 22:15:07 +00:00

Do not merge this PR individually. All changes are consolidated into PR #442 (develop-brent-5). Please review and merge #442 instead.

**Do not merge this PR individually.** All changes are consolidated into PR #442 (`develop-brent-5`). Please review and merge #442 instead.

brent.edwards commented

2026-02-25 23:53:02 +00:00

Code Review — PR #437: test(e2e): add M3 decision + validation suites

Reviewer: @brent.edwards | Review type: Comment-only

Nice coverage on the M3 fixtures and CLI flows. A few issues to fix before merge:

P2:should-fix — Behave hook won’t run (patchers never stop)

features/steps/m3_decision_validation_smoke_steps.py defines after_scenario() at the bottom, but Behave only picks up hooks from features/environment.py. That means invariant_patcher, validation_patcher, plan_patcher, and correction_patcher never get stopped, and the mocks can leak into later scenarios/features. Please move the hook into features/environment.py or wrap patchers in context.add_cleanup / per-step with patch(...).

P2:should-fix — Doc update includes files not in this PR

docs/development/testing.md adds the decision persistence benchmark/feature sections, but PR #437 doesn’t include those files. If this merges first, the docs will reference missing files. Please split the decision-persistence doc section into PR #438 or remove it here.

P3:nit — Invalid ULID constants in tests

The hard-coded IDs in features/steps/m3_decision_validation_smoke_steps.py and robot/helper_m3_decision_validation_smoke.py include non-ULID strings (e.g., 25-char IDs and _INVARIANT_ULID contains I). This is fine today, but will break if any ULID validation is introduced on these fields. Safer to use str(ULID()) or 26-char Crockford ULIDs.

P3:nit — Temporary validation configs never deleted

Both the Behave steps and Robot helper create NamedTemporaryFile(..., delete=False) without cleanup. Not a blocker, but consider deleting in teardown to avoid temp accumulation.

Happy to re-review after the P2s are addressed.

## Code Review — PR #437: test(e2e): add M3 decision + validation suites **Reviewer:** @brent.edwards | **Review type:** Comment-only Nice coverage on the M3 fixtures and CLI flows. A few issues to fix before merge: --- ### P2:should-fix — Behave hook won’t run (patchers never stop) `features/steps/m3_decision_validation_smoke_steps.py` defines `after_scenario()` at the bottom, but Behave only picks up hooks from `features/environment.py`. That means `invariant_patcher`, `validation_patcher`, `plan_patcher`, and `correction_patcher` never get stopped, and the mocks can leak into later scenarios/features. Please move the hook into `features/environment.py` or wrap patchers in `context.add_cleanup` / per-step `with patch(...)`. --- ### P2:should-fix — Doc update includes files not in this PR `docs/development/testing.md` adds the decision persistence benchmark/feature sections, but PR #437 doesn’t include those files. If this merges first, the docs will reference missing files. Please split the decision-persistence doc section into PR #438 or remove it here. --- ### P3:nit — Invalid ULID constants in tests The hard-coded IDs in `features/steps/m3_decision_validation_smoke_steps.py` and `robot/helper_m3_decision_validation_smoke.py` include non-ULID strings (e.g., 25-char IDs and `_INVARIANT_ULID` contains `I`). This is fine today, but will break if any ULID validation is introduced on these fields. Safer to use `str(ULID())` or 26-char Crockford ULIDs. --- ### P3:nit — Temporary validation configs never deleted Both the Behave steps and Robot helper create `NamedTemporaryFile(..., delete=False)` without cleanup. Not a blocker, but consider deleting in teardown to avoid temp accumulation. --- Happy to re-review after the P2s are addressed.

brent.edwards added 1 commit 2026-02-26 01:13:30 +00:00

fix(test): address review findings in M3 smoke tests

CI / lint (pull_request) Successful in 24s

Details

CI / typecheck (pull_request) Successful in 55s

Details

CI / security (pull_request) Successful in 47s

Details

CI / quality (pull_request) Successful in 28s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 23s

Details

CI / integration_tests (pull_request) Successful in 5m7s

Details

CI / unit_tests (pull_request) Successful in 20m55s

Details

CI / benchmark-regression (pull_request) Successful in 25m40s

Details

CI / docker (pull_request) Successful in 14s

Details

CI / coverage (pull_request) Successful in 1h48m41s

Details

93730067a2

- Replace dead after_scenario() hook with context.add_cleanup() calls
  so patchers are properly stopped (Behave only runs hooks from
  environment.py)
- Fix invalid ULID constants: pad DECISION/CORRECTION to 26 chars,
  replace invalid 'I' with 'J' in INVARIANT_ULID
- Add temp file cleanup for NamedTemporaryFile(delete=False) via
  context.add_cleanup() in behave steps and try/finally in robot helper
- Remove decision-persistence doc sections from testing.md that belong
  in PR #438, not this PR

brent.edwards added 1 commit 2026-02-26 03:43:10 +00:00

Merge branch 'master' into feature/m3-decision-validation-smoke

CI / lint (pull_request) Successful in 17s

Details

CI / quality (pull_request) Successful in 20s

Details

CI / security (pull_request) Successful in 37s

Details

CI / typecheck (pull_request) Successful in 39s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 20s

Details

CI / integration_tests (pull_request) Successful in 4m46s

Details

CI / unit_tests (pull_request) Successful in 31m4s

Details

CI / benchmark-regression (pull_request) Successful in 24m29s

Details

CI / docker (pull_request) Successful in 15s

Details

CI / coverage (pull_request) Successful in 1h25m36s

Details

f40f1202fb

brent.edwards closed this pull request

2026-02-26 23:54:30 +00:00

brent.edwards deleted branch feature/m3-decision-validation-smoke

2026-02-26 23:54:43 +00:00

freemo added the

State

Wont Do

label 2026-03-04 00:58:50 +00:00

CI / lint (pull_request) Successful in 17s

Required

Details

CI / quality (pull_request) Successful in 20s

Required

Details

CI / security (pull_request) Successful in 37s

Required

Details

CI / typecheck (pull_request) Successful in 39s

Required

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 20s

Required

Details

CI / integration_tests (pull_request) Successful in 4m46s

Required

Details

CI / unit_tests (pull_request) Successful in 31m4s

Required

Details

CI / benchmark-regression (pull_request) Successful in 24m29s

Details

CI / docker (pull_request) Successful in 15s

Required

Details

CI / coverage (pull_request) Successful in 1h25m36s

Required

Details

Pull request closed

This pull request cannot be reopened because the branch was deleted.

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: cleveragents/cleveragents-core#437