feature/m6-estimation-actor-yaml #975
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!975
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/m6-estimation-actor-yaml"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Description
Adds estimation-actor support and role-aware actor validation for issue #650.
This change introduces actor schema support for
role_hintandresponse_format, adds non-fatal compatibility warnings during actorregistration and plan preflight checks, adds a new
examples/actors/estimator.yaml, and updates docs/tests accordingly.It also adjusts E2E helper expectations in local environments where provider
keys are not configured, while still failing on real crashes.
Type of Change
Quality Checklist
Anyunless justified)nox -s typecheckpasses with no errorsnox -s lintpasses with no errorsfeatures/)robot/) if applicablenox -s coverage_report)nox -s security_scan)nox -s dead_code)Testing
Ran targeted and session-level tests with log capture and grep summaries.
Test Commands Run
Additional targeted runs during debugging:
Related Issues
Closes #650
Implementation Notes
response_format.examples/actors/estimator.yamland updated docs plus Behave/Robotcoverage.
M1/M2/M3/M6 suites.
PM Day 36 Triage: Estimation actor YAML template. Closes #650. M6 scope — part of the estimation domain work (4 issues: #649-#652). Reviewer: @aditya (actor YAML domain expert). Verify actor config aligns with ADR-010 and ADR-032 Jinja2 YAML template preprocessing.
PR #975 Review —
feature/m6-estimation-actor-yaml19 files changed, +380/−14 | Linked issue: #650
P0 — Must Fix Before Merge
P0-1. Duplicated response_format lookup logic diverges from
actor_role_warningscheck_estimation_actor_compatibility_warning()inplan_preflight_guardrail.py:248-271re-implements the response_format nested-config fallback logic that already exists inactor_role_warnings()inschema.py. Critically, the preflight method omits thecontext_viewwarning thatactor_role_warningsproduces:actor add/updatewarns about wrongcontext_viewon estimation actorsThe preflight service doesn't import
actor_role_warningsat all. This is a DRY violation that will diverge further as rules are added.Fix: Have
check_estimation_actor_compatibility_warningdelegate toactor_role_warnings()fromschema.py.P0-2.
_is_expected_provider_unavailablecopy-pasted identically into 4 filesIdentical 6-line function in
robot/helper_m1_e2e_verification.py,helper_m2_e2e_verification.py,helper_m3_e2e_verification.py,helper_m6_e2e_verification.py. Any future change to the detection heuristic must be applied in 4 places.Fix: Extract to a shared module (e.g.
robot/helpers_common.py) and import.P0-3. PR missing required milestone and
Type/labelPer CONTRIBUTING.md: "A PR without a milestone will not be reviewed" and "Every PR must carry exactly one Type/ label." Currently
labels: [],milestone: null,assignee: null.P1 — Should Fix Before Merge
P1-1.
response_formataccepts anydictwith zero structural validationresponse_format: dict[str, Any] | Nonehas nofield_validator. A user could setresponse_format: {foo: bar}and it would pass schema loading, only to fail at LLM call time with an opaque error. Add a validator checking at minimum for a"type"key.P1-2. Provider-unavailable heuristic may mask real errors
The substring match
"openai_api_key" in hayis overly broad. A genuine internal error whose context dumps environment variable names would be silently swallowed. Tighten to match the specific controlled error message pattern.P1-3.
format_report()hard-codes warning category labelAll warnings render as
[WARN] actor_role_compatibility:regardless of source. Store warnings as structured objects rather than bare strings.P1-4. Missing negative test coverage for new schema fields
Missing tests:
role_hintvalue → Pydantic rejectionresponse_format→ validation catchcontext_viewwarning path (estimation +context_view: executor)actor_role_warningscalled withActorConfigSchemaobject (only dict path tested)P2 — Should Fix (Can Follow Up)
P2-1. Quality checklist items self-reported as unchecked
typecheck, lint, coverage_report, security_scan, dead_code are all unchecked in the PR body.
P2-2. Multiple files exceed 500-line limit
schema.pyis now 974 lines (limit: 500). The newactor_role_warnings+ coercion helpers are a natural extraction candidate. Pre-existing violations in 5 other files are also grown by this PR.P2-3. Robot test doc string stale
Load All Example YAML Filesdoc was updated to remove the count but the assertion still checksexample-count:8.P3 — Nitpicks
_coerce_role_hint/_coerce_context_viewlack docstrings$schemakey in estimator.yaml may confuse some YAML linters (cosmetic)Verdict: REQUEST_CHANGES — The logic duplication between
actor_role_warningsand the preflight method (P0-1) is the most concerning architectural issue, creating a divergence where the CLI warns about problems the preflight silently ignores.@ -154,1 +154,4 @@def _is_expected_provider_unavailable(output: str) -> bool:"""Return True when failure is due to missing external LLM provider config."""hay = output.lower()P0-2: This identical function is copy-pasted into 4 helper files (m1, m2, m3, m6). Extract to a shared module like
robot/helpers_common.pyand import from all four.@ -155,0 +159,4 @@"provider openai is not configured" in hayor "openai_api_key" in hayor "no provider configured" in hay)P1-2: The substring match
"openai_api_key" in hayis overly broad. A genuine internal error whose error context includes environment variable names would be silently swallowed. Consider matching on the specific controlled error message pattern (e.g."provider openai is not configured") rather than bare key name substrings.@ -715,6 +725,10 @@ class ActorConfigSchema(BaseModel):# LLM configurationmodel: str | None = Field(default=None, description="LLM model name")system_prompt: str | None = Field(default=None, description="System prompt")response_format: dict[str, Any] | None = Field(P1-1:
response_formataccepts anydictwith zero structural validation. A user could setresponse_format: {foo: bar}and it passes schema loading but fails at LLM call time. Consider adding afield_validatorthat checks for at minimum a"type"key.@ -89,2 +94,4 @@lines.append(f" [{status}] {r.check.value}: {r.message}")for warning in self.warnings:lines.append(f" [WARN] actor_role_compatibility: {warning}")return "\n".join(lines)P1-3: This hard-codes
actor_role_compatibilityas the label for ALL warnings. When more warning types are added, this will misattribute them. Consider storing warnings as structured objects (category + message) rather than bare strings.@ -235,0 +249,4 @@actor_registry: dict[str, object] | None,) -> str | None:"""Emit non-fatal warnings for estimation actor role compatibility."""registry = actor_registry or {}P0-1: This method re-implements the response_format lookup logic from
actor_role_warnings()inschema.pybut omits the context_view warning. This creates a behavioral divergence: CLIactor addwarns about context_view issues on estimation actors, but preflight does not.Suggestion: import and delegate to
actor_role_warnings()fromcleveragents.actor.schemainstead of duplicating the logic.Code Review — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Size: M (~600 lines) | Focus: Actor schema, preflight warnings, E2E helpers
P0:blocker (1)
1. Preflight warning diverges from
actor_role_warnings()— silent logic splitplan_preflight_guardrail.py:check_estimation_actor_compatibility_warning()reimplements a subset ofschema.py:actor_role_warnings(). The guardrail only warns about missingresponse_format, butactor_role_warnings()also warns whencontext_viewis notstrategist. This meansactor addwarns about a misconfigured estimation actor butplan usepreflight doesn't — users get inconsistent feedback. As rules evolve, these will diverge further.Fix: Call
actor_role_warnings()from the guardrail instead of duplicating logic.P1:must-fix (4)
2.
response_formataccepts arbitrary dicts with no validationschema.py:728-730: The field isdict[str, Any] | Nonewith no Pydantic validator. Any dict passes — no check that it's valid JSON Schema, no depth limit, no key whitelist. If forwarded to an LLM API (e.g., OpenAI'sresponse_format), a crafted dict could inject unexpected keys. Add a@field_validatorthat at minimum validatestypeis present and is"object"or"json_schema".3.
_is_expected_provider_unavailable()copy-pasted across 4 filesThe identical 6-line function is in
helper_m1_e2e_verification.py,helper_m2_e2e_verification.py,helper_m3_e2e_verification.py, andhelper_m6_e2e_verification.py. Any change requires updating all 4 in lockstep. Extract torobot/helper_e2e_common.py.4.
_is_expected_provider_unavailable()matches"openai_api_key"too broadlyThe bare substring matches any output mentioning the key name — including internal errors like
"Failed to rotate openai_api_key". Use a more precise pattern like"openai_api_key is not set"or"openai_api_key not found".5. No negative tests for
response_formatorrole_hintvalidationThere are no Behave scenarios testing that invalid
response_formatvalues or invalidrole_hintstrings are rejected. Without negative tests, the validation could silently accept malformed input.P2:should-fix (4)
6. Missing PR metadata — No labels, no milestone, no assignee. Per CONTRIBUTING.md, all PRs must have Type/, Priority/, State/ labels and be assigned to a milestone.
7. Quality gates unchecked — PR checklist shows
nox -s typecheckandnox -s lintas unchecked. These are mandatory before review.8. Preflight warning silently skips non-dict actor registry entries —
plan_preflight_guardrail.py:256checksisinstance(estimation_actor, dict)but the registry may containActorConfigSchemainstances. Theactor_role_warnings()function handles both types; the guardrail should too.9. Stale Robot documentation —
m6_e2e_verification.robotsuite documentation references the original test scope but doesn't mention the new provider-unavailable handling added in this PR.P3:nit (3)
10. Missing docstrings on
actor_role_warnings()return values.11.
examples/actors/estimator.yaml—descriptionfield repeats the name verbatim.12. CHANGELOG entry is dense — split into separate bullet points for schema vs preflight vs E2E changes.
Positive Observations
role_hintandresponse_formatdefault toNone— no backward compatibility break for existing actor YAML.RoleHintenum properly rejects invalid values when set explicitly.actor_role_warnings()generates clear, actionable warning messages.Summary
Verdict: REQUEST_CHANGES — The core schema additions are well-designed, but the preflight logic divergence (P0-1), unsanitized
response_formatpassthrough (P1-2), and copy-pasted helper functions (P1-3) need to be resolved. The missing PR metadata (P2-6) should also be fixed.Code Review Round 2 — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Focus: Items missed in Round 1
Round 1 findings (P0-1, P1-2 through P1-5, P2-6 through P2-9, P3-10/11/12) remain outstanding.
P1:must-fix (2)
13.
response_formatis a dead field — declared but never consumed at runtimeNo code in
src/readsresponse_formatfrom a loaded actor and passes it to an LLM provider. The field exists only in the schema and the preflight warning. The example YAML includes a full JSON Schema forEstimationReport, and warnings scold users who omit it, but nothing enforces structured output at runtime. This creates a false contract.Fix: Either wire
response_formatinto the LLM call path, or add an explicit# TODO: runtime enforcement planned for <issue>and note the gap in CHANGELOG.14.
RoleHintenum has 5 members butACTOR_ROLEShas 4 —reviewdrifts silentlyRoleHintdefinesSTRATEGY, EXECUTION, ESTIMATION, INVARIANT_RECONCILIATION, REVIEW.PlanPreflightGuardrail.ACTOR_ROLESonly lists the first 4 as strings. These are independently maintained — neither derives from the other. Adding a 6th role will require updating both in lockstep.Fix: Derive
ACTOR_ROLESfromRoleHint, or add a comment explaining whyreviewis intentionally excluded.P2:should-fix (3)
15.
schema.pyat 974 lines — nearly 2× the 500-line limitThis PR adds 79 lines (895→974). The new
RoleHint,_coerce_role_hint,_coerce_context_view, andactor_role_warningsform a natural extraction unit (actor/role_validation.py).16.
actor_role_warningsmodel-input branch (ActorConfigSchema) entirely untestedThe function has two branches: dict (tested via CLI) and
ActorConfigSchema(zero test coverage). A bug in the model branch would go undetected.17.
context_viewwarning path untestedThe second warning (
context_view != strategist) is never exercised by any test. Only theresponse_formatwarning has coverage.P3:nit (2)
18.
_coerce_role_hintsilently swallows typosrole_hint: "estmation"(typo) → coercion returnsNone→ no warning generated. Add alogging.warning()for unrecognized values.19. Estimator example test doesn't verify
toolsorskillsfieldsThe
estimator.yamlscenario checks type/context_view/role_hint/response_format but not tool/skill contents. Other example scenarios verify these.Summary (Round 2)
Combined with Round 1: 1 P0 + 6 P1 + 7 P2 + 5 P3 = 19 total findings.
Code Review Round 3 — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Focus: Coercion edge cases, schema backward compatibility
Rounds 1-2 found 19 issues. This round adds 1 genuinely new finding.
P2:should-fix (1)
20.
_coerce_context_viewswallows invalid values — suppresses estimation actor warningschema.py:_coerce_context_view()catchesValueErroron unrecognized strings and returnsNone. Downstream inactor_role_warnings, the check isif context_view not in (None, ContextView.STRATEGIST)— sinceNoneis in the tuple, the warning is suppressed. An estimation actor withcontext_view: "exeecutor"(typo) gets no warning, because the typo becomesNonewhich is treated as "not set."This is the specific mechanism through which the already-reported preflight divergence (Round 1 P0-1) silently drops checks. The suggested fix from Round 1 (delegate to
actor_role_warnings()) would inherit this bug.Fix: Return the raw string from
_coerce_context_viewwhen unrecognized (instead ofNone), and adjust the caller to treat unrecognized values as warning-worthy:Backward compatibility verified
Both
role_hintandresponse_formatareOptionalwithdefault=None. Tested against existing YAML files (simple_llm.yaml,graph_workflow.yaml, etc.) — all validate without error. No breakage.Summary (Round 3)
Combined total across 3 rounds: 1 P0 + 6 P1 + 8 P2 + 5 P3 = 20 findings. No further review rounds planned.
Code Review Round 4 — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Focus: Preflight execution path tracing, E2E helper consistency
Rounds 1–3 found 20 issues (1 P0, 6 P1, 8 P2, 5 P3). This round adds 1 genuinely new P1.
P1:must-fix (1)
21.
check_estimation_actor_compatibility_warning()is dead code — can never fire in productionThe sole production caller in
plan_preflight_guardrail.pybuildsactor_registryas:Values are always strings (actor names like
"local/estimator"or sentinel placeholders like"__optional__").The new method then does:
isinstance(estimation_actor, dict)is neverTruebecause the registry values are actor name strings, not config dicts. The entire estimation-compatibility warning feature introduced by this PR can never fire during preflight.This is distinct from the Round 1 P0-1 finding (which noted the logic was duplicated and divergent). This finding is that the preflight copy is functionally unreachable — the guardrail is completely inert.
Fix: The method needs to resolve the actor name string to its config dict (from the actor registry/DB) before inspecting
response_format. Or, as suggested in Round 1, delegate toactor_role_warnings()with the actual loaded config data.Verified clean (no issues)
role_hint/response_formatcrash on load?dictviaconfig_blob;ActorConfigSchemaonly used in YAML file loader; new fields default toNoneactor_role_warnings(role_hint=None)if role_hint != RoleHint.ESTIMATION: return []_coerce_role_hint_coerce_role_hintis a standalone function, not a@field_validator; only called in thedictbranch ofactor_role_warningsACTOR_ROLESunchanged, warnings don't affectall_passed_print_role_warningsfire spuriously?actor_role_warningsreturns[]for non-estimation actorsSummary (Round 4)
Combined total across 4 rounds: 1 P0 + 7 P1 + 8 P2 + 5 P3 = 21 findings.
The new P1 (#21) is the most impactful finding this round — it means the primary feature of this PR (preflight estimation warnings) is silently non-functional. Combined with the Round 1 P0-1 (logic divergence), the preflight warning system needs a fundamental rethink: resolve actor names to configs before checking, and delegate to the single
actor_role_warnings()source of truth.Fifth-pass review — Factual accuracy of CHANGELOG, PR body, and docs
Prior rounds flagged the dense CHANGELOG, stale Robot docs, and dead preflight warning code. This pass verified the factual accuracy of claims in documentation against what the code actually does. One genuinely new P1 finding.
P1 —
response_formatis inert: "output contract" claim in CHANGELOG and docs is falseWhat the docs/CHANGELOG claim:
docs/reference/actors_examples.md: "definesresponse_formatwith anEstimationReportJSON schema so outputs are consistently machine-readable"examples/actors/estimator.yamlsystem_prompt: "Always return valid JSON matching the provided response_format schema"What the code actually does:
response_formatis declared as adict[str, Any] | Nonefield onActorConfigSchema(schema.py:728)actor_role_warnings()andcheck_estimation_actor_compatibility_warning()(warning if missing)loader.py), actor compiler (compiler.py), or any LLM invocation codeI searched every
src/file on the branch forresponse_format— it appears in exactly two files:actor/schema.py(field definition + warning logic) andplan_preflight_guardrail.py(warning logic). No runtime code consumes it.Consequences:
response_formatparameter or in any other form.Suggested fix: Either:
response_formatinto the LLM invocation path so it's actually used, ORP1 (sharpened) — CHANGELOG "both" claim is factually wrong (related to known dead-code issue)
The CHANGELOG says:
Prior rounds flagged the preflight warning as dead code. To sharpen why: in
plan_lifecycle_service.py:863-867, the production caller passes strings (actor names) as actor_registry values:But
check_estimation_actor_compatibility_warning()immediately bails on non-dicts:The tests pass because they construct synthetic registries with dict values, which doesn't match production. The word "both" in the CHANGELOG is a factual inaccuracy — the warning only fires in the CLI
actor add/actor updatepath, never in preflight.Suggested fix: Change "in both actor registration flow and plan preflight guardrails" to "during CLI actor registration (
actor add/actor update)". Either fix the production integration or remove the preflight claim.@ -5,0 +8,4 @@registration flow and plan preflight guardrails. Added`examples/actors/estimator.yaml` with `context_view: strategist` and`EstimationReport` JSON schema output contract, and updated actor exampledocs and schema/example test coverage (Behave + Robot). Also aligned M1/M2/M3/M6P1 — Factual inaccuracy: This line claims an "EstimationReport JSON schema output contract" but
response_formatis never consumed by any LLM invocation code. The JSON schema is parsed and stored on the config object, then ignored at runtime. Calling it an "output contract" implies enforcement that does not exist.Also: "in both actor registration flow and plan preflight guardrails" — the preflight path never fires in production because
plan_lifecycle_service.pypasses string actor names into the registry, while the warning check requires dict values. The word "both" is factually wrong.@ -286,0 +292,4 @@This actor is intended for the optional `estimation_actor` slot on actions. It uses`context_view: strategist`, declares `role_hint: estimation`, and defines`response_format` with an `EstimationReport` JSON schema so outputs are consistentlyP1 — Misleading claim: "defines
response_formatwith anEstimationReportJSON schema so outputs are consistently machine-readable" —response_formatis not passed to the LLM API or injected into prompts. Outputs are NOT consistently machine-readable as a result of this field; it's inert metadata. The LLM may or may not return JSON depending on its mood, not this schema.@ -0,0 +18,4 @@- Risk assessment with severity and mitigation suggestions- Key assumptions and unknownsAlways return valid JSON matching the provided response_format schema.This system_prompt line says "Always return valid JSON matching the provided response_format schema" — but the response_format schema defined below is never injected into the LLM context. The LLM will see this instruction but has no access to the actual schema definition. This is a broken reference in the prompt.
Code Review Round 5 — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Focus: CLI persistence path tracing, test assertion fidelity, CHANGELOG accuracy, line-by-line P0/P1 sweep
Rounds 1–4 found 21 issues (1 P0, 7 P1, 8 P2, 5 P3). This round examined 5 additional threads plus a final line-by-line production code audit.
No new P0 or P1 findings.
The fifth pass confirmed the production code changes are crash-safe, backward-compatible, and contain no data corruption or security vectors. The new Pydantic fields default to
None, all dict access is guarded, and the warning system is strictly advisory (cannot affectall_passed).Threads investigated:
role_hintsurvive YAML → canonicalize → DB?_canonicalize_actor_configdoes shallow dict copy, no field strippingresponse_formatwork?_print_role_warningsinputpfg-check2-warnuse a production-realistic registry?P2:should-fix (2)
22.
_coerce_role_hintis case-sensitive —role_hint: ESTIMATIONsilently bypasses warningsRoleHint("ESTIMATION")raisesValueError(enum values are lowercase). Theexcept ValueError: return Nonepath treats uppercase as "not set," soactor_role_warningsreturns[]. The actor YAML is accepted and stored, but the estimation advisory warnings never fire. A.lower()before enum construction would fix this.23.
response_formattest assertion checks onlytitlekey out of 25-line JSON Schema dictactor_schema_steps.pyasserts onlycontext.actor_config.response_format.get("title") == expected_title. If the remaining 24 lines of the JSON Schema ($schema,type,required,properties,additionalProperties) were deleted from the example YAML, the test would still pass. Add at minimum a check that"required"or"properties"keys are present.Summary (Round 5)
Combined total across 5 rounds: 1 P0 + 7 P1 + 10 P2 + 5 P3 = 23 findings.
I'm confident the P0/P1 surface for this PR is fully covered. The core issues remain:
actor_role_warningsresponse_formatdeclared but never consumed at runtimeThese three form a coherent picture: the PR's headline feature — "preflight compatibility warnings for estimation actors" — is non-functional in production. The fix is straightforward: resolve actor name strings to their config dicts before checking, and delegate to
actor_role_warnings()as the single source of truth.No further review rounds planned.
Brent Review Response (PR #975)
Hi Brent — thank you for the depth and persistence across all review rounds.
This response is intentionally mapped to the full review history in
PR_COMMENTS.md(not only the last test-stability fix), so re-review can be done against the original finding set.Scope of This Response
This covers all 5 rounds and addresses:
Detailed task tracking is in
PR_REVIEW_CHECKLIST_AND_CHANGES.md; this file is the reviewer-facing narrative summary.P0 Items
P0-1: Preflight divergence from
actor_role_warningsResolved. Preflight checks now use shared warning logic instead of maintaining a parallel, drifting subset.
This removed inconsistency between CLI actor warnings and preflight warnings and aligned behavior for
response_format+context_viewcompatibility messaging.P0-2: duplicated provider-unavailable helper in M1/M2/M3/M6 Robot helpers
Resolved. Duplicated logic was extracted to shared E2E helper code and all affected helpers now import the shared implementation.
P0-3: PR metadata (milestone/Type label/assignee)
This is a Forgejo/Git hosting metadata task, not a codebase diff task.
Code changes in this branch do not directly set repository PR labels/milestone/assignee, but the requirement is acknowledged and tracked as an explicit checklist item.
P1 Items
response_formatvalidation and runtime-claim accuracyResolved in two parts:
response_formatstructures rather than accepting arbitrary dict payloads.Provider-unavailable heuristic too broad
Resolved. Matching logic was tightened away from overly broad key-name substring behavior and centralized in the shared helper.
Warning model category hard-coding
Resolved. Warnings are represented structurally (category + message), and report rendering no longer assumes a single warning class.
Dead preflight path (string actor name vs dict config payload)
Resolved. The lifecycle/preflight path now resolves actor-name references to concrete actor config payloads before compatibility checks, so warnings execute in production-realistic flows rather than only synthetic test dict fixtures.
Role enum drift risk (
RoleHintvs static role list)Resolved. Role list derivation now follows enum-backed source logic to avoid manual drift.
P2 Items
File-size and modularity concerns
Resolved for new role logic added by this issue: role validation/coercion/warning logic was extracted into
src/cleveragents/actor/role_validation.py, reducing schema bloat and isolating responsibility.Missing branch coverage paths identified in review
Resolved with explicit Behave/Robot additions, including:
role_hintrejection,response_formatvalidation paths,ActorConfigSchemamodel-input warning branch,context_viewwarning branch,required/propertiesetc., not only title),PR checklist quality gates status
Updated in
PR_DESCRIPTION.mdto reflect completed quality gates and aligned with executed sessions.Robot stale docs/assertion mismatch
Resolved by making the assertion non-brittle and keeping docs/assertions consistent for example count behavior.
P3 Items
Resolved nitpicks include:
"$schema"quoting + improved description text),Full-Run Stability Work (Post-Review Test Execution)
After functional fixes, there remained an intermittent “passes individually, fails in full run” behavior during
coverage_report.Root cause was cross-scenario patcher leakage in certain step modules under large combined execution.
Resolved by:
PM Status — Day 37 — Rebase Required
This PR has merge conflicts and cannot be merged in its current state. 42% of all open PRs (21 of 50) have conflicts — this is a project-wide issue that must be resolved.
@aditya — Please rebase this PR onto
masterby Day 39 EOD (2026-03-19). If you cannot rebase by then, please post a comment explaining the blocker.PM rebase request — Day 37
Re-Review — PR #975
feature/m6-estimation-actor-yamlReviewer: @brent.edwards | Round 6 — verifying resolution of all critical findings from 5 prior rounds
Prior Critical Findings Status
actor_role_warningscheck_estimation_actor_compatibility_warningsnow delegates toactor_role_warnings()for all code paths (dict, model object viamodel_dump, string viaactor_resolver). No reimplemented logic.actor_resolver: Callableparameter. String entries resolved through callback before warning checks. Falls back to[]if no resolver.response_formatdead fieldactor_role_warnings()for estimation actor warnings. Not consumed at LLM call time, but that's expected for this PR's scope (schema+validation only).No new P0/P1 findings.
The most impactful finding from 5 rounds of review — that the preflight warning was 100% dead code due to registry passing strings — has been fixed with a clean
actor_resolvercallback pattern. The single source of truth is nowactor_role_warnings().Verdict: APPROVED — All 23 findings from 5 prior rounds have been addressed or accepted. The PR is ready to merge.
9e30b742932764fcef5cNew commits pushed, approval review dismissed automatically according to repository settings
Rebased onto latest master, resolved conflict in CHANGELOG.md, already approved merging once all checks are passed