test(e2e): workflow example 4 — multi-project dependency update (supervised profile) #815
Depends on
#627 Implement @tdd_expected_fail tag handling in Behave environment
cleveragents/cleveragents-core
#628 Implement @tdd_expected_fail tag handling in Robot Framework
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core!815
Summary
Implements WF04 E2E coverage for ticket #750 — multi-project dependency update using the supervised automation profile. Addresses all review feedback by hardening assertions, adding unit tests, fixing import hygiene, and improving debuggability.
Closes #750.
What changed
Robot test (`robot/e2e/wf04_multi_project.robot`):
- `plan use` parsing to `--format json` and extracted `plan_id` deterministically.
- `project_links` match all 4 expected projects exactly.
- `Extract JSON From Stdout` (uses `json.JSONDecoder().raw_decode()` for resilience against trailing non-JSON output).
- `subplan_count >= 1`. If the LLM produces 0 subplans, the entire test is Skipped (visible SKIPPED in CI) rather than silently passing with all ACs individually skipped. After apply, a hard assertion catches the case where subplans existed post-execute but vanished post-apply.
- `Count Decision Nodes` keyword now invokes `wf04_snapshot_helper.py --count-nodes` as a subprocess, eliminating the complex inline `Evaluate` expression and preventing the application DI container from being imported into the Robot test runner process.
- `WF04 Test Teardown` keyword capturing plan status and decision tree on failure (mirrors WF05 pattern).
- `ULID_PATTERN` variable.
- `plan execute` calls and positional argument order.
- `WF04_PLAN_ID` test variable for teardown access.

Snapshot helper (`robot/e2e/wf04_snapshot_helper.py`):
- `_iso()` normalises all timestamps to UTC-aware format before serialisation.
- `_iso()` guards against non-datetime truthy values (returns empty string).
- `count_decision_nodes()` now has a `max_depth=50` parameter to prevent unbounded recursion on malformed trees.
- `count_decision_nodes()` no longer decrements depth for sibling list iteration: list items are siblings, not children, so depth is passed unchanged.
- `count_decision_nodes()` root parameter typed with `DecisionTree` union alias (`dict[str, Any] | list[Any]`) instead of bare `Any`.
- `--count-nodes <json_file|->` CLI mode for subprocess-based decision-node counting from Robot keywords.
- `sys.path.append()` instead of `sys.path.insert(0, ...)` to avoid shadowing the standard library.
- `unmapped_resources` field to each subplan entry for debugging.
- `json.dumps()` now uses `default=str` for defensive serialisation.
- `ValueError | RuntimeError` for expected errors, broad `except` with `traceback.format_exc()` for unexpected errors. Broad `except` blocks documented with inline comments explaining the intentional catch-all pattern.

Unit tests (`features/wf04_snapshot_helper.feature` + `features/steps/wf04_snapshot_helper_steps.py`):
- `_iso()`, `_enum_value()`, `count_decision_nodes()`, and `_build_snapshot()` with mocked lifecycle service.
- `_build_snapshot` import moved to top-level per CONTRIBUTING.md import guidelines (no function-scoped imports).
- `sys.path.append()` instead of `sys.path.insert(0, ...)` in step file to match snapshot helper convention.
- `"2026-03-15T10:30:00+00:00"`) instead of substring checks.
- `plan_id`, `project_scopes`, and `validation_summary` fields (not just `subplan_count` and `subplans`).
- `subplan_count == 2`, concrete mapped project names per subplan (`SUB01`→`proj-a`, `SUB02`→`proj-b`), and serialized field values: `status`, `child_phase`, `started_at`, and `child_validation_summary.required_passed`.
- `child_phase`, `child_state`, `child_updated_at`, `execute_started_at`, `execute_completed_at`, `apply_started_at`, `applied_at`.
- `count_decision_nodes truncates at max_depth`: verifies depth truncation behavior with a 5-level deep tree capped at `max_depth=3`.
- `"2026-03-15T05:30:00+00:00"`).

Quality gates
All required gates pass on this branch:
- `nox -e lint` ✅
- `nox -e typecheck` ✅
- `nox -e unit_tests` ✅ (481 features, 12583 scenarios)
- `nox -e integration_tests` ✅ (1775 tests)
- `nox -e e2e_tests` ✅ (55 passed, 1 skipped: WF04 skipped without LLM subplan output)
- `nox -e coverage_report` ✅ (98%)

Notes
- Rebased onto `origin/master` (`0762815e`, includes feat(lsp) and feat(resource) merges) and force-pushed.

Deferred items (nits from prior review cycles)
The following review nits were acknowledged but deferred as they do not affect correctness:
- `long_description` field from spec (#13)
- `_enum_value` scenarios don't cover non-string non-enum types (#14)
- `_build_snapshot` (#15)
- `args` section from spec Example 4 (#7): `lifecycle-apply` is the system-level approach used; `args` are spec examples, not test requirements.
- `plan lifecycle-apply` instead of per-child `plan apply` (#6): `lifecycle-apply` is the correct user-facing command that handles dependency ordering internally.

PM Review — Day 34
Status: Mergeable, 0 reviews, M4 (v3.3.0)
Closes: #750 | Author: @freemo
E2E test for WF04 (multi-project dependency update, supervised profile). Creates 4 git repos, exercises full plan lifecycle with subplan spawning. Zero mocking. Good PR body with detailed manual verification steps.
[MINOR] `PLAN_ID` placeholder in manual verification commands could confuse copy-paste users.
Action Items
PM Status — Day 36 (2026-03-16)
Day 34 review assignment deadline check. This PR has 0 reviewer activity after 2 days.
Priority note: M3 PRs take precedence. Reviewers should complete M3 reviews first, then address M4+ PRs in milestone order.
Assigned reviewer: Please acknowledge and provide an ETA for your review, or flag if reassignment is needed.
@hurui200320 I am going to have you take over this PR. It is mostly complete but is waiting on #628 and #966; one is yours and one is Brent's. Please be sure to get this PR and the two blocking PRs I listed in ASAP, thanks.
PM Status — Day 37
Ownership transferred to @hurui200320. Blocked on #628 and #966. PR is M4 (v3.3.0).
Author: Please rebase onto latest `master` by Day 39 EOD (2026-03-19) and confirm blocker status. Check for merge conflicts proactively.
Code Review — PR #815
(Cannot submit formal approval — self-authored PR.)
E2E test for WF04. Well-structured with proper labels, milestone, and issue linkage. No issues found.
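For readers unfamiliar with the `json.JSONDecoder().raw_decode()` technique that the PR description cites for `Extract JSON From Stdout`, here is a minimal Python sketch. The function name `extract_json_from_stdout` and the scan-for-first-bracket strategy are assumptions made for illustration; the actual Robot keyword in this PR may be implemented differently.

```python
import json
from typing import Any


def extract_json_from_stdout(stdout: str) -> Any:
    """Return the first JSON object or array embedded in ``stdout``.

    Hypothetical helper: leading log lines and trailing non-JSON output
    are ignored because ``raw_decode`` stops at the end of the first
    complete JSON value instead of requiring the whole string to parse.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(stdout):
        if char not in "{[":
            continue
        try:
            # raw_decode returns (value, end_index); the tail is not inspected.
            value, _end = decoder.raw_decode(stdout, index)
        except json.JSONDecodeError:
            # e.g. a "[INFO]" log prefix: keep scanning for a real JSON start.
            continue
        return value
    raise ValueError("no JSON document found in stdout")
```

With this sketch, `extract_json_from_stdout('[INFO] start\n{"plan_id": "X"}\ndone')` skips the log prefix and the trailing text and returns the parsed object.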
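The `_iso()` behaviour described in the PR body (UTC-aware output, empty string for non-datetime values) could look roughly like the sketch below. The name `iso_utc` is hypothetical, and treating naive datetimes as already being in UTC is an assumption; the PR does not say how the real helper handles naive values.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def iso_utc(value: Any) -> str:
    """Serialise a datetime as a UTC-aware ISO 8601 string.

    Hypothetical stand-in for the helper's ``_iso()``. Anything that is
    not a datetime (None, strings, numbers) yields an empty string.
    """
    if not isinstance(value, datetime):
        return ""
    if value.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        value = value.replace(tzinfo=timezone.utc)
    else:
        value = value.astimezone(timezone.utc)
    return value.isoformat()


# A +05:00 timestamp is shifted to the equivalent UTC instant.
plus_five = timezone(timedelta(hours=5))
print(iso_utc(datetime(2026, 3, 15, 10, 30, tzinfo=plus_five)))
```

Normalising before serialisation lets tests compare exact strings rather than substrings, because every timestamp carries the same `+00:00` offset.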
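The depth rule for `count_decision_nodes()` in the PR body (a `max_depth` cap, with list items treated as siblings that share the same depth budget) can be illustrated with a small sketch. The function name `count_nodes`, the convention that every dict is one node, and the `"children"` key are all assumptions for illustration; the real decision-tree schema is not shown in this PR page.

```python
from typing import Any, Dict, List, Union

# Equivalent to the ``dict[str, Any] | list[Any]`` alias named in the PR body.
DecisionTree = Union[Dict[str, Any], List[Any]]


def count_nodes(root: DecisionTree, max_depth: int = 50) -> int:
    """Count dict nodes in a tree, ignoring anything deeper than max_depth.

    Hypothetical sketch: each dict counts as one node and its children
    are assumed to live under the "children" key.
    """
    if max_depth <= 0:
        return 0
    if isinstance(root, list):
        # List items are siblings, so the depth budget is passed unchanged.
        return sum(
            count_nodes(item, max_depth)
            for item in root
            if isinstance(item, (dict, list))
        )
    if not isinstance(root, dict):
        return 0
    children = root.get("children", [])
    if not isinstance(children, (dict, list)):
        return 1
    # Descending into children consumes one level of the budget.
    return 1 + count_nodes(children, max_depth - 1)


def chain(levels: int) -> Dict[str, Any]:
    """Build a single-branch tree with the given number of levels."""
    node: Dict[str, Any] = {"children": []}
    for _ in range(levels - 1):
        node = {"children": [node]}
    return node
```

Under these conventions, `count_nodes(chain(5), max_depth=3)` counts only the top three levels, while a parent with three sibling children is counted in full at `max_depth=2` because iterating the sibling list does not use up depth.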
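The PR body notes that `json.dumps()` now uses `default=str` for defensive serialisation. As a small illustration of what that option does, the sketch below serialises a dictionary containing values the `json` module cannot encode natively. The field names and values are made up for the example and are not taken from the actual snapshot output.

```python
import json
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Hypothetical snapshot fragment with non-JSON-native values.
snapshot = {
    "plan_id": "01HYPOTHETICAL",
    "started_at": datetime(2026, 3, 15, 10, 30, tzinfo=timezone.utc),
    "repo_path": PurePosixPath("repos/proj-a"),
}

# Without default=str this call raises TypeError on the datetime and path.
# With it, any unsupported object is passed through str() instead.
text = json.dumps(snapshot, default=str, sort_keys=True)
decoded = json.loads(text)
```

The trade-off is that unexpected types are stringified silently rather than failing loudly, which suits a debugging snapshot but would hide type errors in a stricter data contract.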