test(integration): workflow example 11 — complex graph actor for multi-stage code review (trusted profile) #810

Closed
brent.edwards wants to merge 1 commit from test/int-wf11-graph-actor into master
Member

Summary

Closes #775

Add Robot Framework integration test suite for Specification Workflow Example 11: Complex Graph Actor for Multi-Stage Code Review. Exercises a custom graph-type actor (5 nodes, 6 edges) with parallel fan-out to security/performance/style reviewer nodes and result synthesis using mocked LLM providers. The action is read-only.

Graph Topology

dispatcher ──┬──> security_reviewer  ──┐
             ├──> performance_reviewer ─┤──> synthesizer
             └──> style_reviewer ──────┘
  • 5 nodes, 6 edges
  • Entry: dispatcher, Exit: synthesizer
  • 3-way fan-out (dispatcher → 3 reviewers), 3-way fan-in (3 reviewers → synthesizer)
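
The counts above can be checked mechanically. The following is a minimal sketch, assuming the six edges are available as plain `(source, target)` tuples; the `EDGES` list and the `degree_counts` helper are illustrative and are not taken from the PR's helper.

```python
from collections import Counter

# Edge list transcribed from the diagram above (illustrative, not the PR's YAML).
EDGES = [
    ("dispatcher", "security_reviewer"),
    ("dispatcher", "performance_reviewer"),
    ("dispatcher", "style_reviewer"),
    ("security_reviewer", "synthesizer"),
    ("performance_reviewer", "synthesizer"),
    ("style_reviewer", "synthesizer"),
]


def degree_counts(edges):
    """Return (out_degree, in_degree) counters keyed by node ID."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return out_deg, in_deg


nodes = {node for edge in EDGES for node in edge}
out_deg, in_deg = degree_counts(EDGES)

# Entry nodes have no incoming edges; exit nodes have no outgoing edges.
entry_nodes = sorted(n for n in nodes if in_deg[n] == 0)
exit_nodes = sorted(n for n in nodes if out_deg[n] == 0)
```

A fan-out of 3 shows up as `out_deg["dispatcher"] == 3`, and the fan-in as `in_deg["synthesizer"] == 3`.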

New Files

| File | Description |
|------|-------------|
| robot/wf11_graph_actor.robot | 8 Robot Framework test cases |
| robot/helper_wf11_graph_actor.py | 8-subcommand Python helper (~470 lines) |

Test Cases (8)

  1. Register Graph Actor Via CLI — Register 5-node graph actor via actor add --config
  2. Verify Graph Topology 5 Nodes 6 Edges — Parse YAML, validate 5 nodes, 6 edges, fan-out/fan-in
  3. Compile Graph Actor To LangGraph StateGraph — Compile actor, verify metadata
  4. Create Read Only Review Action — Action referencing graph actor, read-only flag
  5. Plan Use Review With Graph Actor — Create plan from review action
  6. Verify Read Only Guard On Plan Execute — Confirm no crash on read-only execute
  7. Verify Review Synthesis Structure — 3 reviewer edges into synthesizer, sole exit
  8. Verify No File Modifications — Action marked read-only in output

Verification

  • nox -s integration_tests — All 8 tests pass
  • nox -s coverage_report — 98% (≥97% threshold)
  • nox -s typecheck — 0 errors
  • nox -s lint — All checks passed
  • nox -s format — No changes needed
  • nox -s docs — Builds successfully
  • nox -s build — Wheel built successfully
test(integration): workflow example 11 — complex graph actor for multi-stage code review (trusted profile)
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 18s
CI / e2e_tests (pull_request) Successful in 34s
CI / security (pull_request) Successful in 40s
CI / typecheck (pull_request) Successful in 41s
CI / unit_tests (pull_request) Successful in 3m5s
CI / integration_tests (pull_request) Successful in 3m39s
CI / docker (pull_request) Successful in 51s
CI / coverage (pull_request) Successful in 5m58s
CI / benchmark-regression (pull_request) Successful in 36m40s
1ef2e5064b
Add Robot Framework integration test suite for Specification Workflow
Example 11.  Exercises a custom graph-type actor (5 nodes, 6 edges) with
parallel fan-out to security/performance/style reviewer nodes and result
synthesis using mocked LLM providers.  The action is read-only.

New files:
- robot/wf11_graph_actor.robot — 8 Robot test cases
- robot/helper_wf11_graph_actor.py — 8-subcommand helper

Tests cover:
- Graph actor registration via CLI (actor add --config)
- Graph topology validation (5 nodes, 6 edges, fan-out/fan-in)
- Actor compilation to LangGraph StateGraph with metadata
- Read-only action creation referencing graph actor
- Plan creation from read-only review action
- Read-only guard on plan execute (no crash)
- Review synthesis structure (3 reviewers → 1 synthesizer)
- No-file-modification verification via read-only flag

All 8 tests pass via nox -s integration_tests.
Coverage >= 97% maintained (98%).

ISSUES CLOSED: #775
brent.edwards added this to the v3.1.0 milestone 2026-03-13 05:48:32 +00:00
Merge branch 'master' into test/int-wf11-graph-actor
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 18s
CI / e2e_tests (pull_request) Successful in 28s
CI / security (pull_request) Successful in 33s
CI / typecheck (pull_request) Successful in 33s
CI / unit_tests (pull_request) Successful in 3m14s
CI / integration_tests (pull_request) Successful in 3m55s
CI / docker (pull_request) Successful in 54s
CI / coverage (pull_request) Successful in 5m24s
CI / benchmark-regression (pull_request) Successful in 35m31s
cb10678a9e
Owner

PM Review — Day 34

Status: Mergeable, 0 reviews, M2 (v3.1.0)
Author: @brent.edwards

Integration test for WF11 (complex graph actor for multi-stage code review). Robot Framework + helper pattern.

Action Items

| Who | Action | Deadline |
|-----|--------|----------|
| @hamza.khyari | Peer review | Day 37 |
freemo modified the milestone from v3.1.0 to v3.2.0 2026-03-16 00:32:00 +00:00
Merge branch 'master' into test/int-wf11-graph-actor
All checks were successful
CI / lint (pull_request) Successful in 31s
CI / typecheck (pull_request) Successful in 45s
CI / quality (pull_request) Successful in 32s
CI / security (pull_request) Successful in 1m4s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 29s
CI / e2e_tests (pull_request) Successful in 2m15s
CI / unit_tests (pull_request) Successful in 3m23s
CI / integration_tests (pull_request) Successful in 3m48s
CI / docker (pull_request) Successful in 9s
CI / coverage (pull_request) Successful in 6m0s
CI / benchmark-regression (pull_request) Successful in 40m3s
98ed65856e
Owner

PM Status — Day 36 (2026-03-16)

Day 34 review assignment deadline check. 0 reviewer activity after 2 days.

Assigned reviewer: Please acknowledge and provide an ETA for review. Prioritize M3 PRs first, then M4+ in milestone order.

Merge branch 'master' into test/int-wf11-graph-actor
All checks were successful
CI / lint (pull_request) Successful in 20s
CI / quality (pull_request) Successful in 29s
CI / benchmark-publish (pull_request) Has been skipped
CI / security (pull_request) Successful in 47s
CI / build (pull_request) Successful in 26s
CI / typecheck (pull_request) Successful in 1m22s
CI / e2e_tests (pull_request) Successful in 1m56s
CI / unit_tests (pull_request) Successful in 3m19s
CI / integration_tests (pull_request) Successful in 3m38s
CI / docker (pull_request) Successful in 1m7s
CI / coverage (pull_request) Successful in 6m17s
CI / benchmark-regression (pull_request) Successful in 37m54s
3eea042e4c
Owner

PM Status — Day 37

0 reviewer activity after 3 days. Review was assigned to @hamza.khyari on Day 34 with Day 37 deadline — now overdue. PR is M2 (v3.2.0) by @brent.edwards.

Author: Ensure PR is rebased on master by Day 39 EOD (2026-03-19). Reviewer: please post review or flag for reassignment.


PM status — Day 37

Merge branch 'master' into test/int-wf11-graph-actor
All checks were successful
CI / lint (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 29s
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 49s
CI / build (pull_request) Successful in 17s
CI / security (pull_request) Successful in 1m2s
CI / unit_tests (pull_request) Successful in 3m5s
CI / integration_tests (pull_request) Successful in 3m39s
CI / e2e_tests (pull_request) Successful in 4m19s
CI / docker (pull_request) Successful in 58s
CI / coverage (pull_request) Successful in 6m28s
CI / benchmark-regression (pull_request) Successful in 40m3s
0afbe2d14a
hurui200320 left a comment

PR Review: !810 (Ticket #775)

Verdict: Request Changes

The PR adds a Robot Framework integration test suite for Specification Workflow Example 11 (complex graph actor for multi-stage code review). The overall approach follows established project patterns and covers the core acceptance criteria at a structural/topology level. However, there are 3 critical process violations, 9 major issues (code quality violations, missing mandatory PR artifacts, spec divergences, and weak test assertions), and several minor/nit items that should be addressed before merge.

Two review passes were performed. New findings from the second pass are marked 🆕.


Critical Issues

C1. PR description is empty

  • Location: PR !810 metadata (body field is "")
  • Problem: CONTRIBUTING.md §"Pull Request Process" point 1 requires every PR to include: (a) a summary of the changes, (b) an issue reference using a closing keyword (e.g., Closes #775), and (c) a Forgejo dependency link (PR blocks issue #775). The document states: "PRs submitted without a description or without an issue reference will not be reviewed."
  • Recommendation: Add a PR description with summary, Closes #775, and set up the Forgejo dependency link (PR blocks #775).

C2. Branch contains 4 merge commits — violates commit hygiene standards

  • Location: Branch test/int-wf11-graph-actor git history
  • Problem: The branch has 4 Merge branch 'master' into test/int-wf11-graph-actor commits alongside the single implementation commit. CONTRIBUTING.md §"Commit Hygiene" says to "Clean up history before merging" and use "interactive rebase or amend to fix typos, consolidate fixup commits, and polish the commit series before pushing to shared branches."
  • Recommendation: Rebase the branch onto origin/master to produce a clean linear history: git rebase origin/master.

C3. # type: ignore[operator] suppression on line 701

  • Location: robot/helper_wf11_graph_actor.py, line 701
  • Problem: CONTRIBUTING.md §"Type Safety" and §"Static Type Checker" explicitly state: "never use inline comments (such as # type: ignore) to suppress type checking errors." The line fn() # type: ignore[operator] violates this. Root cause: _COMMANDS is typed as dict[str, object] (line 685) instead of dict[str, Callable[[], None]].
  • Recommendation: Change line 685 to _COMMANDS: dict[str, Callable[[], None]] (import Callable from collections.abc), then remove the # type: ignore on line 701. See robot/helper_config_cli.py line 150 for the correct pattern already used in this codebase.
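
A minimal sketch of the typed dispatch table recommended in C3, assuming each subcommand takes no arguments and returns `None`. The `register_actor` function and the `main` wrapper are hypothetical stand-ins, not the helper's actual code.

```python
import sys
from collections.abc import Callable


def register_actor() -> None:
    """Placeholder subcommand (hypothetical stand-in for a real helper function)."""
    print("register-actor-ok")


# Typing the values as zero-argument callables lets the type checker accept
# `fn()` without any suppression comment.
_COMMANDS: dict[str, Callable[[], None]] = {
    "register-actor": register_actor,
}


def main(argv: list[str]) -> int:
    """Dispatch to the subcommand named by argv[1] and return an exit code."""
    if len(argv) != 2 or argv[1] not in _COMMANDS:
        names = "|".join(_COMMANDS)
        print(f"usage: {argv[0]} <{names}>", file=sys.stderr)
        return 2
    fn = _COMMANDS[argv[1]]
    fn()
    return 0
```

A script would typically end with `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard.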

Major Issues

M1. verify_read_only_guard() does not verify the read-only guard fires

  • Location: robot/helper_wf11_graph_actor.py, lines 518–534
  • Problem: The function's docstring says "Verify plan execute is rejected for read-only plans" but the only assertion is the absence of a Python Traceback (line 527). The inline comment explicitly says: "Either is acceptable — the key assertion is no crash." This means the test passes even if the read-only guard is completely broken — as long as some other error (e.g., "plan not ready") prevents execution. Compare with the correct pattern in helper_m4_e2e_cli_errors.py which verifies non-zero exit code AND output contains "read-only".
  • Recommendation: Assert r4.returncode != 0 and that the combined output contains a read-only rejection keyword (e.g., "read-only" in combined.lower() or "read_only" in combined.lower()).
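
The assertion recommended in M1 could look roughly like the sketch below. It assumes the CLI result is a `subprocess.CompletedProcess` captured in text mode; the function name `is_read_only_rejection` is illustrative.

```python
import subprocess


def is_read_only_rejection(result: subprocess.CompletedProcess) -> bool:
    """True only if the command failed AND its output names the read-only guard."""
    combined = f"{result.stdout or ''}\n{result.stderr or ''}".lower()
    mentions_guard = "read-only" in combined or "read_only" in combined
    return result.returncode != 0 and mentions_guard
```

With this check, an unrelated failure such as "plan not ready" no longer counts as a pass.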

M2. verify_no_file_modifications() has a false-positive read-only check

  • Location: robot/helper_wf11_graph_actor.py, lines 653–668
  • Problem: The check if "read" not in out or "only" not in out (line 656) always passes because the _ACTION_YAML description is "Read-only multi-stage code review using a graph actor..." — the lowercased plain output always contains both "read" and "only" as substrings of the description text, regardless of the actual read_only flag value. The more robust YAML-format fallback (which checks for "read_only: true") is therefore never reached.
  • Recommendation: Replace the fragile substring check. Either always use the YAML format check, or check for "read_only" as a compound term.
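
The false positive in M2 can be reproduced in isolation. This sketch assumes a plain-format output that echoes the description and a `read_only:` line; the exact output layout is an assumption, and `loose_check` and `strict_check` are illustrative names.

```python
import re

# Hypothetical lowercased output: the flag is false, yet the description
# still contains the substrings "read" and "only".
plain_output = (
    "name: wf11-code-review\n"
    "description: Read-only multi-stage code review using a graph actor\n"
    "read_only: false\n"
).lower()


def loose_check(out: str) -> bool:
    """Substring test like the one flagged above: satisfied by the description alone."""
    return "read" in out and "only" in out


def strict_check(out: str) -> bool:
    """Match the compound key and its value on a single line."""
    return re.search(r"^\s*read_only:\s*true\s*$", out, re.MULTILINE) is not None
```

Here `loose_check` accepts output whose flag is false, while `strict_check` rejects it.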

M3. create_review_action() silently swallows JSON decode errors

  • Location: robot/helper_wf11_graph_actor.py, lines 379–384
  • Problem: When verifying the read_only flag via JSON output, except json.JSONDecodeError: pass (line 384) silently swallows the error and execution falls through to print the success sentinel — without ever having verified the flag. This also violates CONTRIBUTING.md §"Error and Exception Handling" which says "Do not suppress errors."
  • Recommendation: If JSON parsing fails, either (a) try the YAML format as a third fallback, or (b) call _fail() to report that verification was inconclusive.
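
Option (b) from M3 might be sketched as follows. The `VerificationError` type is hypothetical (the helper would call its own `_fail()`), and the `read_only` key in the JSON payload is an assumption about the CLI output.

```python
import json


class VerificationError(Exception):
    """Raised when the read_only flag cannot be confirmed (hypothetical type)."""


def require_read_only(raw_json: str) -> None:
    """Fail loudly if the output is not JSON or does not carry read_only: true."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        # Report the inconclusive verification instead of passing silently.
        raise VerificationError(f"could not parse action output as JSON: {exc}") from exc
    if not isinstance(payload, dict) or payload.get("read_only") is not True:
        raise VerificationError("action is not marked read_only: true")
```

The success sentinel should only be printed after this function returns without raising.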

M4. Imports inside function bodies violate project import guidelines

  • Location: robot/helper_wf11_graph_actor.py, lines 221, 279–280, 554–555
  • Problem: CONTRIBUTING.md §"Import Guidelines" (Project-Specific) explicitly states: "Ensure all imports are at the top of the Python file. Do not scatter imports throughout the file or bury them inside functions or methods." Five from cleveragents... imports appear inside function bodies: verify_graph_topology() (line 221), compile_graph_actor() (lines 279–280), and verify_review_synthesis() (lines 554–555).
  • Recommendation: Move all five imports to the top of the file (after the path bootstrap block), consolidated into:
    from cleveragents.actor.schema import ActorConfigSchema, ActorType  # noqa: E402
    from cleveragents.actor.compiler import CompiledActor, compile_actor  # noqa: E402
    

M5. File exceeds 500-line limit (701 lines)

  • Location: robot/helper_wf11_graph_actor.py (701 lines total)
  • Problem: CONTRIBUTING.md §"General Principles" states: "Keep files under 500 lines. Break large files into focused, cohesive modules." At 701 lines, the file is 40% over the limit. Much of the bulk comes from repeated actor-registration + action-creation boilerplate duplicated across 4 functions.
  • Recommendation: Extract the common setup sequence (register actor, create action) into a shared _setup_actor_and_action(prefix: str) helper function. This would eliminate ~60–80 lines of duplication per function and bring the file under 500 lines.

M6. Graph node IDs diverge from specification Example 11

  • Location: robot/helper_wf11_graph_actor.py, lines 57–139 (_GRAPH_ACTOR_YAML)
  • Problem: The spec (docs/specification.md, lines 40382–40416) defines node IDs as: dispatch, security, performance, style, synthesize. The test YAML uses: dispatcher, security_reviewer, performance_reviewer, style_reviewer, synthesizer. The spec also includes checkpointing: true (line 40386) and parallel_execution: true (line 40416) on the graph route, which are absent from the test YAML. CONTRIBUTING.md states the spec is "the authoritative source of truth."
  • Recommendation: Align node IDs to match the spec. If the current schema doesn't support checkpointing and parallel_execution fields, add comments documenting the gap and reference the spec sections.

🆕 M7. Missing CHANGELOG.md update

  • Location: Missing change to CHANGELOG.md
  • Problem: CONTRIBUTING.md requirement 6 states: "The PR must include an update to the changelog file. Add one new entry per commit in the PR that describes the change from the user's perspective." Verified via git diff origin/master...HEAD -- CHANGELOG.md — empty. The PR adds a new integration test suite but has no CHANGELOG.md entry.
  • Recommendation: Add a changelog entry under ## Unreleased, e.g.: "Added Robot Framework integration test for Specification Workflow Example 11 (complex graph actor with multi-stage code review). (#775)"

🆕 M8. Missing --automation-profile trusted in plan use invocation

  • Location: robot/helper_wf11_graph_actor.py, lines 430–437
  • Problem: The ticket title explicitly says "trusted profile" and the spec's Example 11 Step 2 shows agents plan use --automation-profile trusted .... The test's plan_use_review() calls plan use local/wf11-code-review --format plain without passing --automation-profile trusted. The "trusted" profile causes strategize and execute to proceed automatically — the default (supervised) profile has different behavior.
  • Recommendation: Add "--automation-profile", "trusted" to the plan use CLI invocation.

🆕 M9. provider: openai phantom field silently discarded by schema

  • Location: robot/helper_wf11_graph_actor.py, line 65
  • Problem: The _GRAPH_ACTOR_YAML includes provider: openai, but ActorConfigSchema (in src/cleveragents/actor/schema.py) has no provider field. Since the schema doesn't set extra="forbid", Pydantic v2 silently ignores unknown fields. The YAML is accepted not because provider is valid, but because it's silently discarded — this is dead configuration that gives a false sense of completeness.
  • Recommendation: Remove provider: openai from the YAML fixture, or if provider is part of the spec intent, file a ticket to add it to ActorConfigSchema.

Minor Issues

m1. Points label mismatch between ticket and PR

  • Location: PR labels vs. Issue labels
  • Problem: Ticket #775 has Points/5 but PR !810 has Points/3.
  • Recommendation: Align the PR label to Points/5 to match the ticket.

m2. Resource leak if second write_yaml() fails in multi-YAML functions

  • Location: robot/helper_wf11_graph_actor.py, lines 322–325 (also 401–403, 475–477, 610–612)
  • Problem: In functions like create_review_action(), setup_workspace() and the first write_yaml() execute before the try block. If the second write_yaml() raises, the workspace and first temp file leak because finally never executes.
  • Recommendation: Move resource allocation inside the try block with None sentinel guards in finally, or use nested try/finally.
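
The None-sentinel pattern recommended in m2 is sketched below. `write_yaml` here is a simplified stand-in for the helper's function of the same name, and `with_two_yamls` is a hypothetical wrapper; workspace setup is omitted.

```python
import os
import tempfile
from typing import Callable, Optional


def write_yaml(text: str) -> str:
    """Write text to a temp file and return its path (simplified stand-in)."""
    fd, path = tempfile.mkstemp(suffix=".yaml")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def with_two_yamls(actor_yaml: str, action_yaml: str,
                   body: Callable[[str, str], None]) -> None:
    """Run body(actor_path, action_path) and always remove both temp files."""
    actor_path: Optional[str] = None
    action_path: Optional[str] = None
    try:
        # Allocation happens inside the try block, so a failure in the second
        # write_yaml() still reaches the cleanup in finally.
        actor_path = write_yaml(actor_yaml)
        action_path = write_yaml(action_yaml)
        body(actor_path, action_path)
    finally:
        for path in (actor_path, action_path):
            if path is not None and os.path.exists(path):
                os.remove(path)
```

`contextlib.ExitStack` would be another way to get the same guarantee with less manual bookkeeping.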

m3. Commit author uses personal email

  • Location: Commit 1ef2e506, author: Brent E. Edwards <chipuni@cemcast.net>
  • Problem: The implementation commit uses a personal email instead of the corporate email brent.edwards@cleverthis.com. The PR was created from the brent.edwards Forgejo account (corporate email), so the mismatch is inconsistent.
  • Recommendation: If the branch is rebased anyway (per C2), amend the commit author email.

m4. No negative/error-path test cases

  • Location: robot/wf11_graph_actor.robot (all 8 tests are happy-path)
  • Problem: All 8 tests are success-path scenarios. There are no negative tests (e.g., invalid graph topology, missing nodes, broken edges). The actor_examples.robot file includes negative tests as precedent.
  • Recommendation: Add at least 1–2 negative test cases (e.g., reject a graph actor with a missing node referenced in an edge).

m5. plan_use_review() has weak plan status verification

  • Location: robot/helper_wf11_graph_actor.py, lines 447–460
  • Problem: The comment says "Verify plan status shows strategize/queued" but the code only checks for absence of "Traceback" and "INTERNAL". It doesn't verify the plan is in the expected state.
  • Recommendation: Add a positive assertion for the expected plan state (e.g., check output contains "strategize" or "queued").

🆕 m6. Missing timeout/on_timeout on all Robot Run Process calls

  • Location: robot/wf11_graph_actor.robot, lines 19, 29, 39, 49, 59, 70, 81, 91
  • Problem: All 8 Run Process calls lack timeout and on_timeout parameters. 337 other Run Process calls across the project use timeout=120s on_timeout=kill. WF11's helper operations (Alembic migrations + CLI subprocesses) are susceptible to hangs that would stall a pabot worker indefinitely.
  • Recommendation: Add timeout=120s on_timeout=kill to each Run Process call.

🆕 m7. Action YAML missing spec's arguments: section

  • Location: robot/helper_wf11_graph_actor.py, lines 141–152
  • Problem: The spec's actions/deep-review.yaml defines typed arguments: target_paths (required, string) and review_depth (optional, string, default "standard"). The test's _ACTION_YAML omits all argument definitions, meaning the argument-passing workflow (plan use --arg target_paths=... --arg review_depth=...) is completely untested.
  • Recommendation: Add the arguments: section to _ACTION_YAML with the two arguments from the spec, or document why they are omitted.

🆕 m8. compile_graph_actor() doesn't assert compiled node IDs match expected set

  • Location: robot/helper_wf11_graph_actor.py, lines 291–307
  • Problem: Verifies len(compiled.nodes) == 5 but does not assert the actual node IDs match the expected set. If the compiler silently renamed or dropped/duplicated a node such that the count remained 5, this test would pass. By contrast, verify_graph_topology() does perform set-equality checks on config nodes.
  • Recommendation: Add: assert set(compiled.nodes.keys()) == {"dispatcher", "security_reviewer", ...}
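
To illustrate why a count check alone is insufficient, the sketch below compares a node mapping against the expected ID set. The `diff_node_ids` helper and the `renamed` mapping are hypothetical; the IDs are the ones used in the test YAML.

```python
EXPECTED_NODE_IDS = frozenset({
    "dispatcher",
    "security_reviewer",
    "performance_reviewer",
    "style_reviewer",
    "synthesizer",
})


def diff_node_ids(compiled_nodes):
    """Return (missing, unexpected) node IDs relative to the expected set."""
    actual = set(compiled_nodes)
    return sorted(EXPECTED_NODE_IDS - actual), sorted(actual - EXPECTED_NODE_IDS)


# A mapping with the right count but one renamed node (hypothetical failure mode).
renamed = dict.fromkeys([
    "dispatcher",
    "security_reviewer",
    "performance_reviewer",
    "style_check",
    "synthesizer",
])
```

`len(renamed)` is still 5, but the set difference exposes the renamed node.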

🆕 m9. Coverage acceptance criterion is vacuously satisfied for Robot-only changes

  • Location: Ticket #775 acceptance criteria, commit message
  • Problem: nox -s coverage_report only measures Behave unit test coverage. Robot Framework integration tests run in a separate session and are never coverage-measured. Since this PR adds only Robot files, it is impossible for these changes to affect the 97% metric. The claim "Coverage >= 97% maintained (98%)" is trivially true and provides no signal about test quality.
  • Recommendation: Acknowledge in the PR description that coverage is unaffected by Robot-only changes, rather than claiming the criterion is met.

Nits

N1. Robot test documentation inaccuracy

  • Location: robot/wf11_graph_actor.robot, line 88
  • Problem: Documentation says "Verify the action is marked read-only in JSON output" but the helper actually uses --format plain with a YAML fallback, not JSON.
  • Recommendation: Update to "Verify the action is marked read-only in plain/YAML output."

N2. _extract_plan_id() regex is over-permissive for ULIDs

  • Location: robot/helper_wf11_graph_actor.py, line 162
  • Problem: \b([0-9A-Z]{26})\b accepts characters I, L, O, U which are excluded from Crockford's Base32 encoding used by ULIDs. Practical risk is negligible.
  • Recommendation: For completeness, could use \b([0-9A-HJKMNP-TV-Z]{26})\b.
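
The difference between the two patterns in N2 can be seen with a short sketch. The two sample IDs are illustrative 26-character strings, not values from the test suite.

```python
import re

# Pattern quoted in the finding above: accepts any uppercase letter.
LOOSE_ULID = re.compile(r"\b([0-9A-Z]{26})\b")
# Crockford Base32 alphabet: digits plus letters, excluding I, L, O, U.
STRICT_ULID = re.compile(r"\b([0-9A-HJKMNP-TV-Z]{26})\b")

valid_id = "01ARZ3NDEKTSV4RRFFQ69G5FAV"
invalid_id = "01ARZ3NDEKTSV4RRFFQ69G5FAU"  # ends in U, which Crockford Base32 excludes
```

Both patterns accept `valid_id`; only the loose one accepts `invalid_id`.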

N3. Code duplication in setup boilerplate

  • Location: robot/helper_wf11_graph_actor.py, 4 functions repeating ~20-line actor+action setup
  • Problem: create_review_action, plan_use_review, verify_read_only_guard, and verify_no_file_modifications each independently register the actor and create the action. ~80 lines of redundancy.
  • Recommendation: Extract into a _setup_actor_and_action(prefix) helper. This also helps with M5 (file length).

🆕 N4. Helper uses bare if __name__ dispatch instead of main() function

  • Location: robot/helper_wf11_graph_actor.py, lines 696–701
  • Problem: The vast majority of other helpers in the project define a main() function for dispatch (e.g., helper_cli_lifecycle.py, helper_decision_recording.py, helper_actor_examples.py). WF11 is an outlier.
  • Recommendation: Refactor to match the established main() pattern.

🆕 N5. Inconsistent docstring depth across functions

  • Location: robot/helper_wf11_graph_actor.py, multiple functions
  • Problem: verify_review_synthesis() has a thorough multi-line docstring. Most other functions have only single-line docstrings that don't describe key assertion nuances (e.g., verify_read_only_guard doesn't mention it accepts non-crash as success).
  • Recommendation: Expand docstrings to briefly describe key assertions and non-obvious acceptance criteria.

Summary

The PR adds a structurally sound Robot Framework integration test for Workflow Example 11 that covers graph actor registration, topology validation (5 nodes, 6 edges), compilation, action creation, and plan lifecycle. The tests follow the established robot/helper_*.py pattern with good isolation and cleanup.

However, the PR needs significant work before merge:

  1. Process compliance (C1, C2, M7): Empty PR description, merge commits, and missing CHANGELOG update must be fixed.
  2. Code quality (C3, M4, M5): The # type: ignore suppression, scattered imports, and 701-line file violate project standards.
  3. Test assertion quality (M1, M2, M3): The three functions that test read-only behavior all have assertion weaknesses that would allow tests to pass even if the read-only guard were broken.
  4. Spec fidelity (M6, M8, M9): Node IDs diverge from spec, the "trusted profile" from the ticket title is not exercised, and a phantom provider field is silently discarded.

The second review pass confirmed the first pass's findings and uncovered 3 additional major issues (missing CHANGELOG, missing trusted profile, phantom schema field) plus 6 new minor/nit items (m6–m9, N4, N5). No security issues were found — the code uses secure temp file creation, list-based subprocess invocation, and proper database isolation.

- **Recommendation:** Remove `provider: openai` from the YAML fixture, or if provider is part of the spec intent, file a ticket to add it to `ActorConfigSchema`. --- ### Minor Issues **m1. Points label mismatch between ticket and PR** - **Location:** PR labels vs. Issue labels - **Problem:** Ticket #775 has `Points/5` but PR !810 has `Points/3`. - **Recommendation:** Align the PR label to `Points/5` to match the ticket. **m2. Resource leak if second `write_yaml()` fails in multi-YAML functions** - **Location:** `robot/helper_wf11_graph_actor.py`, lines 322–325 (also 401–403, 475–477, 610–612) - **Problem:** In functions like `create_review_action()`, `setup_workspace()` and the first `write_yaml()` execute before the `try` block. If the second `write_yaml()` raises, the workspace and first temp file leak because `finally` never executes. - **Recommendation:** Move resource allocation inside the `try` block with `None` sentinel guards in `finally`, or use nested try/finally. **m3. Commit author uses personal email** - **Location:** Commit `1ef2e506`, author: `Brent E. Edwards <chipuni@cemcast.net>` - **Problem:** The implementation commit uses a personal email instead of the corporate email `brent.edwards@cleverthis.com`. The PR was created from the `brent.edwards` Forgejo account (corporate email), so the mismatch is inconsistent. - **Recommendation:** If the branch is rebased anyway (per C2), amend the commit author email. **m4. No negative/error-path test cases** - **Location:** `robot/wf11_graph_actor.robot` (all 8 tests are happy-path) - **Problem:** All 8 tests are success-path scenarios. There are no negative tests (e.g., invalid graph topology, missing nodes, broken edges). The `actor_examples.robot` file includes negative tests as precedent. - **Recommendation:** Add at least 1–2 negative test cases (e.g., reject a graph actor with a missing node referenced in an edge). **m5. 
`plan_use_review()` has weak plan status verification** - **Location:** `robot/helper_wf11_graph_actor.py`, lines 447–460 - **Problem:** The comment says *"Verify plan status shows strategize/queued"* but the code only checks for absence of "Traceback" and "INTERNAL". It doesn't verify the plan is in the expected state. - **Recommendation:** Add a positive assertion for the expected plan state (e.g., check output contains "strategize" or "queued"). **🆕 m6. Missing `timeout`/`on_timeout` on all Robot `Run Process` calls** - **Location:** `robot/wf11_graph_actor.robot`, lines 19, 29, 39, 49, 59, 70, 81, 91 - **Problem:** All 8 `Run Process` calls lack `timeout` and `on_timeout` parameters. 337 other `Run Process` calls across the project use `timeout=120s on_timeout=kill`. WF11's helper operations (Alembic migrations + CLI subprocesses) are susceptible to hangs that would stall a pabot worker indefinitely. - **Recommendation:** Add `timeout=120s on_timeout=kill` to each `Run Process` call. **🆕 m7. Action YAML missing spec's `arguments:` section** - **Location:** `robot/helper_wf11_graph_actor.py`, lines 141–152 - **Problem:** The spec's `actions/deep-review.yaml` defines typed arguments: `target_paths` (required, string) and `review_depth` (optional, string, default "standard"). The test's `_ACTION_YAML` omits all argument definitions, meaning the argument-passing workflow (`plan use --arg target_paths=... --arg review_depth=...`) is completely untested. - **Recommendation:** Add the `arguments:` section to `_ACTION_YAML` with the two arguments from the spec, or document why they are omitted. **🆕 m8. `compile_graph_actor()` doesn't assert compiled node IDs match expected set** - **Location:** `robot/helper_wf11_graph_actor.py`, lines 291–307 - **Problem:** Verifies `len(compiled.nodes) == 5` but does not assert the actual node IDs match the expected set. 
If the compiler silently renamed or dropped/duplicated a node such that the count remained 5, this test would pass. By contrast, `verify_graph_topology()` does perform set-equality checks on config nodes. - **Recommendation:** Add: `assert set(compiled.nodes.keys()) == {"dispatcher", "security_reviewer", ...}` **🆕 m9. Coverage acceptance criterion is vacuously satisfied for Robot-only changes** - **Location:** Ticket #775 acceptance criteria, commit message - **Problem:** `nox -s coverage_report` only measures Behave unit test coverage. Robot Framework integration tests run in a separate session and are never coverage-measured. Since this PR adds only Robot files, it is impossible for these changes to affect the 97% metric. The claim "Coverage >= 97% maintained (98%)" is trivially true and provides no signal about test quality. - **Recommendation:** Acknowledge in the PR description that coverage is unaffected by Robot-only changes, rather than claiming the criterion is met. --- ### Nits **N1. Robot test documentation inaccuracy** - **Location:** `robot/wf11_graph_actor.robot`, line 88 - **Problem:** Documentation says *"Verify the action is marked read-only in JSON output"* but the helper actually uses `--format plain` with a YAML fallback, not JSON. - **Recommendation:** Update to *"Verify the action is marked read-only in plain/YAML output."* **N2. `_extract_plan_id()` regex is over-permissive for ULIDs** - **Location:** `robot/helper_wf11_graph_actor.py`, line 162 - **Problem:** `\b([0-9A-Z]{26})\b` accepts characters I, L, O, U which are excluded from Crockford's Base32 encoding used by ULIDs. Practical risk is negligible. - **Recommendation:** For completeness, could use `\b([0-9A-HJKMNP-TV-Z]{26})\b`. **N3. 
Code duplication in setup boilerplate** - **Location:** `robot/helper_wf11_graph_actor.py`, 4 functions repeating ~20-line actor+action setup - **Problem:** `create_review_action`, `plan_use_review`, `verify_read_only_guard`, and `verify_no_file_modifications` each independently register the actor and create the action. ~80 lines of redundancy. - **Recommendation:** Extract into a `_setup_actor_and_action(prefix)` helper. This also helps with M5 (file length). **🆕 N4. Helper uses bare `if __name__` dispatch instead of `main()` function** - **Location:** `robot/helper_wf11_graph_actor.py`, lines 696–701 - **Problem:** The vast majority of other helpers in the project define a `main()` function for dispatch (e.g., `helper_cli_lifecycle.py`, `helper_decision_recording.py`, `helper_actor_examples.py`). WF11 is an outlier. - **Recommendation:** Refactor to match the established `main()` pattern. **🆕 N5. Inconsistent docstring depth across functions** - **Location:** `robot/helper_wf11_graph_actor.py`, multiple functions - **Problem:** `verify_review_synthesis()` has a thorough multi-line docstring. Most other functions have only single-line docstrings that don't describe key assertion nuances (e.g., `verify_read_only_guard` doesn't mention it accepts non-crash as success). - **Recommendation:** Expand docstrings to briefly describe key assertions and non-obvious acceptance criteria. --- ### Summary The PR adds a structurally sound Robot Framework integration test for Workflow Example 11 that covers graph actor registration, topology validation (5 nodes, 6 edges), compilation, action creation, and plan lifecycle. The tests follow the established `robot/helper_*.py` pattern with good isolation and cleanup. However, the PR needs significant work before merge: 1. **Process compliance** (C1, C2, M7): Empty PR description, merge commits, and missing CHANGELOG update must be fixed. 2. 
**Code quality** (C3, M4, M5): The `# type: ignore` suppression, scattered imports, and 701-line file exceed project standards. 3. **Test assertion quality** (M1, M2, M3): The three functions that test read-only behavior all have assertion weaknesses that would allow tests to pass even if the read-only guard were broken. 4. **Spec fidelity** (M6, M8, M9): Node IDs diverge from spec, the "trusted profile" from the ticket title is not exercised, and a phantom `provider` field is silently discarded. The second review pass confirmed the first pass's findings and uncovered 3 additional major issues (missing CHANGELOG, missing trusted profile, phantom schema field) plus 4 new minor/nit items. No security issues were found — the code uses secure temp file creation, list-based subprocess invocation, and proper database isolation.
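To make the assertion weaknesses in M1 and M2 concrete, here is a minimal sketch of the difference between the substring check and a check on the flag itself. The sample output and the function names `naive_is_read_only`, `strict_is_read_only`, and `guard_rejected` are hypothetical illustrations, not code from the PR; the real CLI output format may differ.

```python
import re

# Hypothetical plain-text output: the description mentions "Read-only",
# but the actual flag is false.
SAMPLE_OUTPUT = """\
name: local/wf11-code-review
description: Read-only multi-stage code review using a graph actor
read_only: false
"""

# Matches a whole line of the form `read_only: true`.
_READ_ONLY_TRUE = re.compile(
    r"^\s*read_only:\s*true\s*$", re.IGNORECASE | re.MULTILINE
)


def naive_is_read_only(output: str) -> bool:
    """Substring check in the style flagged by M2 (prone to false positives)."""
    out = output.lower()
    return "read" in out and "only" in out


def strict_is_read_only(output: str) -> bool:
    """Check the flag itself rather than any text that mentions it."""
    return _READ_ONLY_TRUE.search(output) is not None


def guard_rejected(returncode: int, combined_output: str) -> bool:
    """M1-style check: non-zero exit AND an explicit read-only message."""
    text = combined_output.lower()
    return returncode != 0 and ("read-only" in text or "read_only" in text)
```

With this sample, the naive check reports read-only because of the description text, while the strict check does not. `guard_rejected` similarly refuses to count an unrelated failure (such as "plan not ready") as proof that the guard fired.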
brent.edwards force-pushed test/int-wf11-graph-actor from 0afbe2d14a
All checks were successful
CI / lint (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 29s
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 49s
CI / build (pull_request) Successful in 17s
CI / security (pull_request) Successful in 1m2s
CI / unit_tests (pull_request) Successful in 3m5s
CI / integration_tests (pull_request) Successful in 3m39s
CI / e2e_tests (pull_request) Successful in 4m19s
CI / docker (pull_request) Successful in 58s
CI / coverage (pull_request) Successful in 6m28s
CI / benchmark-regression (pull_request) Successful in 40m3s
to 7ab07303f8
Some checks are pending
CI / lint (pull_request) Successful in 16s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 22s
CI / quality (pull_request) Successful in 28s
CI / security (pull_request) Successful in 1m4s
CI / typecheck (pull_request) Successful in 1m21s
CI / e2e_tests (pull_request) Successful in 3m56s
CI / benchmark-regression (pull_request) Has started running
CI / unit_tests (pull_request) Successful in 7m18s
CI / integration_tests (pull_request) Successful in 7m22s
CI / docker (pull_request) Successful in 1m57s
CI / coverage (pull_request) Successful in 6m59s
2026-03-18 22:14:25 +00:00
brent.edwards force-pushed test/int-wf11-graph-actor from 7ab07303f8
to b59677927f
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 30s
CI / build (pull_request) Successful in 17s
CI / security (pull_request) Successful in 53s
CI / typecheck (pull_request) Successful in 56s
CI / unit_tests (pull_request) Successful in 3m25s
CI / docker (pull_request) Successful in 18s
CI / e2e_tests (pull_request) Successful in 4m2s
CI / integration_tests (pull_request) Failing after 4m47s
CI / coverage (pull_request) Successful in 6m15s
CI / benchmark-regression (pull_request) Successful in 38m33s
2026-03-18 23:14:27 +00:00
Author
Member

## Review Fixes Applied: Commit `b5967792`

Addressed Rui Hu's 26 findings. Merge commits and empty PR body resolved by rebase/user.

### Critical

| Finding | Fix |
|---------|-----|
| **C3** | Removed `# type: ignore[operator]`; changed `_COMMANDS` type to `dict[str, Callable[[], None]]` |

### Major

| Finding | Fix |
|---------|-----|
| **M1** | `verify_read_only_guard()`: asserts `returncode != 0` and output contains `read-only`/`read_only` |
| **M2** | `verify_no_file_modifications()`: checks `"read_only: true"` in YAML output directly |
| **M3** | `create_review_action()`: `except json.JSONDecodeError` now calls `fail()` |
| **M4** | All `from cleveragents...` imports moved to top of file |
| **M5** | Extracted `_setup_actor_and_action()` context manager: **466 lines** (down from 701) |
| **M7** | CHANGELOG entry for #775 |
| **M8** | Added `--automation-profile trusted` to `plan use` |
| **M9** | Removed phantom `provider: openai` from YAML fixture |

### Minor

- `timeout=120s on_timeout=kill` on all 8 Robot Run Process calls
- `Force Tags wf11 graph trusted integration`
- `nox -s lint`: **PASS**
- `nox -s typecheck`: **PASS** (0 errors)
- Helper: **466 lines** (under 500, down from 701)
freemo approved these changes 2026-03-19 04:57:28 +00:00
Dismissed
freemo left a comment

## Code Review: PR #810

Well-structured integration test for WF11. Proper labels, milestone, and issue linkage. **Approved.**
brent.edwards force-pushed test/int-wf11-graph-actor from b59677927f
to 39afb0b9b3
Some checks failed
CI / lint (pull_request) Successful in 31s
CI / quality (pull_request) Successful in 33s
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 56s
CI / security (pull_request) Successful in 59s
CI / build (pull_request) Successful in 24s
CI / unit_tests (pull_request) Successful in 3m54s
CI / docker (pull_request) Successful in 1m12s
CI / integration_tests (pull_request) Failing after 5m26s
CI / e2e_tests (pull_request) Successful in 5m21s
CI / coverage (pull_request) Successful in 8m16s
CI / benchmark-regression (pull_request) Failing after 30m24s
2026-03-20 00:13:52 +00:00
brent.edwards dismissed freemo's review 2026-03-20 00:13:52 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

Author
Member

Rebased onto `origin/master` (`79b0a2c5`). CHANGELOG conflict resolved (kept master, re-added PR entry). `nox -s lint` PASS, `nox -s typecheck` PASS (0 errors). Commit `39afb0b9`.
Merge remote-tracking branch 'origin/master' into test/int-wf11-graph-actor
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 17s
CI / lint (pull_request) Successful in 3m51s
CI / integration_tests (pull_request) Failing after 4m13s
CI / quality (pull_request) Successful in 4m14s
CI / typecheck (pull_request) Successful in 4m28s
CI / security (pull_request) Successful in 4m34s
CI / unit_tests (pull_request) Successful in 7m15s
CI / docker (pull_request) Successful in 1m18s
CI / e2e_tests (pull_request) Successful in 9m50s
CI / coverage (pull_request) Successful in 10m57s
CI / status-check (pull_request) Successful in 3s
CI / benchmark-regression (pull_request) Successful in 1h4m12s
6bafa9d7be
# Conflicts:
#	CHANGELOG.md
Merge remote-tracking branch 'origin/master' into test/int-wf11-graph-actor
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 27s
CI / lint (pull_request) Successful in 3m40s
CI / typecheck (pull_request) Successful in 3m49s
CI / quality (pull_request) Successful in 3m59s
CI / integration_tests (pull_request) Failing after 4m26s
CI / security (pull_request) Successful in 4m34s
CI / unit_tests (pull_request) Successful in 7m36s
CI / docker (pull_request) Successful in 1m26s
CI / coverage (pull_request) Failing after 21m0s
CI / e2e_tests (pull_request) Failing after 24m37s
CI / benchmark-regression (pull_request) Successful in 1h11m5s
CI / status-check (pull_request) Failing after 2s
38cbd8f9ce
# Conflicts:
#	CHANGELOG.md
fix(test): add missing provider field to graph actor YAML config
Some checks failed
CI / benchmark-publish (pull_request) Waiting to run
CI / build (pull_request) Successful in 22s
CI / lint (pull_request) Successful in 3m17s
CI / quality (pull_request) Successful in 3m43s
CI / typecheck (pull_request) Successful in 3m49s
CI / benchmark-regression (pull_request) Waiting to run
CI / security (pull_request) Successful in 4m4s
CI / integration_tests (pull_request) Failing after 7m57s
CI / e2e_tests (pull_request) Successful in 11m14s
CI / unit_tests (pull_request) Successful in 11m25s
CI / docker (pull_request) Successful in 8s
CI / coverage (pull_request) Successful in 10m12s
CI / status-check (pull_request) Successful in 1s
2a060f4a4c
ActorConfiguration.from_blob requires a provider field. The
wf11 graph actor YAML was missing provider: openai, causing
'BadParameter: provider is required' on actor add.
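One plausible reading of this fix, which the thread does not confirm, is that the YAML passes through two validation layers with different strictness: a schema that drops undeclared keys (the behavior M9 described) and a later loader that requires `provider`. The sketch below only illustrates that interaction. `LenientSchema`, `strict_from_blob`, and this `BadParameter` class are hypothetical stand-ins, not the project's `ActorConfigSchema` or `ActorConfiguration.from_blob`.

```python
from dataclasses import dataclass, fields


class BadParameter(ValueError):
    """Stand-in for the CLI error type named in the commit message."""


@dataclass
class LenientSchema:
    """Hypothetical schema layer that silently drops undeclared keys."""

    name: str
    type: str

    @classmethod
    def load(cls, blob: dict) -> "LenientSchema":
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in blob.items() if k in known})


def strict_from_blob(blob: dict) -> dict:
    """Hypothetical runtime layer that refuses a blob without a provider."""
    if not blob.get("provider"):
        raise BadParameter("provider is required")
    return dict(blob)


def accepted_by_both(blob: dict) -> bool:
    """Return True only if both layers accept the same blob."""
    LenientSchema.load(blob)
    try:
        strict_from_blob(blob)
    except BadParameter:
        return False
    return True
```

Under this reading, a field can look like dead configuration to one layer and still be mandatory for another, which would explain why removing `provider: openai` passed schema validation but failed on `actor add`.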
fix(test): increase container_resolve_crash timeouts from 30s to 120s
Some checks failed
CI / build (pull_request) Successful in 17s
CI / lint (pull_request) Successful in 3m36s
CI / quality (pull_request) Successful in 3m41s
CI / security (pull_request) Successful in 4m0s
CI / typecheck (pull_request) Successful in 4m5s
CI / integration_tests (pull_request) Successful in 7m16s
CI / unit_tests (pull_request) Successful in 7m22s
CI / e2e_tests (pull_request) Successful in 9m46s
CI / coverage (pull_request) Successful in 12m31s
CI / docker (pull_request) Failing after 15m7s
CI / benchmark-publish (pull_request) Has been skipped
CI / status-check (pull_request) Successful in 2s
CI / benchmark-regression (pull_request) Successful in 58m19s
581491411f
The plan-tree-crash helper takes longer than 30s in CI with real
DI wiring, causing SIGTERM (rc=-15). Increase to 120s consistent
with other robot test timeouts.
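The `rc=-15` in this message matches how Python reports a child process ended by signal 15 (SIGTERM) on POSIX: the return code is the negated signal number. A minimal sketch of that behavior follows, using only the standard library; `run_with_timeout` and `SLOW_CHILD` are hypothetical names, and this does not reproduce how Robot Framework itself enforces timeouts.

```python
import signal
import subprocess
import sys


def run_with_timeout(args: list[str], timeout_s: float) -> int:
    """Run a child; on timeout, send SIGTERM and return its exit code."""
    proc = subprocess.Popen(args)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # sends SIGTERM on POSIX
        return proc.wait()


# A child that sleeps far longer than the timeout we will allow.
SLOW_CHILD = [sys.executable, "-c", "import time; time.sleep(30)"]
```

On a POSIX system, calling `run_with_timeout(SLOW_CHILD, 0.5)` returns `-signal.SIGTERM`, while a child that finishes in time returns its normal exit code. Raising the timeout, as this commit does, keeps a slow but healthy helper from being reported as a crash.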
Merge remote-tracking branch 'origin/master' into test/int-wf11-graph-actor
All checks were successful
CI / build (pull_request) Successful in 24s
CI / typecheck (pull_request) Successful in 43s
CI / lint (pull_request) Successful in 3m25s
CI / quality (pull_request) Successful in 4m46s
CI / security (pull_request) Successful in 4m57s
CI / integration_tests (pull_request) Successful in 6m59s
CI / unit_tests (pull_request) Successful in 7m14s
CI / docker (pull_request) Successful in 1m9s
CI / e2e_tests (pull_request) Successful in 12m49s
CI / coverage (pull_request) Successful in 11m19s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-publish (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Successful in 57m35s
b1c345c3a0
# Conflicts:
#	CHANGELOG.md
brent.edwards force-pushed test/int-wf11-graph-actor from b1c345c3a0
to e2fc9c96c3
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 31s
CI / lint (pull_request) Successful in 3m54s
CI / quality (pull_request) Successful in 4m28s
CI / typecheck (pull_request) Successful in 4m33s
CI / security (pull_request) Successful in 4m49s
CI / integration_tests (pull_request) Successful in 7m38s
CI / unit_tests (pull_request) Successful in 8m11s
CI / docker (pull_request) Successful in 1m57s
CI / e2e_tests (pull_request) Successful in 13m16s
CI / benchmark-regression (pull_request) Failing after 15m3s
CI / coverage (pull_request) Successful in 18m11s
CI / status-check (pull_request) Successful in 2s
2026-03-26 20:02:56 +00:00
freemo self-assigned this 2026-04-02 06:15:22 +00:00
Owner

🤖 **Backlog Groomer (groomer-1):** Closing as duplicate of #775.

Issue #775 (`test(integration): workflow example 11 — complex graph actor for multi-stage code generation`) is the canonical version with full labels (`MoSCoW/Must have`, `Priority/Medium`, `State/In Review`, `Type/Testing`) and milestone `v3.2.0`. This issue is an exact title duplicate.
freemo closed this pull request 2026-04-02 17:32:31 +00:00

Pull request closed

Reference
cleveragents/cleveragents-core!810