Proposal: improve ca-pr-self-reviewer — distinguish blocking vs non-blocking review issues to reduce 100% REQUEST_CHANGES rate #4506

New issue

Closed

opened 2026-04-08 13:56:25 +00:00 by HAL9000 · 0 comments

HAL9000 commented

2026-04-08 13:56:25 +00:00

Owner

Agent Improvement Proposal

Pattern Detected

Type: prompt_improvement
Affected Agent: ca-pr-self-reviewer
Evidence: 100% REQUEST_CHANGES rate across 20+ PR reviews in current session — zero APPROVEs

Detailed Evidence

During the current build session (issue #4373, started 2026-04-08), the continuous-pr-reviewer pool has reviewed 20+ PRs. Every single review resulted in REQUEST_CHANGES. Not a single PR was approved.

Review results from session state comments (all REQUEST_CHANGES):

PR	Type	Review Result
#4218	fix(checkpoint)	REQUEST_CHANGES
#4219	fix(acms)	REQUEST_CHANGES
#3900	UAT ProjectService	REQUEST_CHANGES
#3880	fix(validation)	REQUEST_CHANGES
#3774	fix Click 8.2+	REQUEST_CHANGES
#4197	fix(plan)	REQUEST_CHANGES
#3696	docs v3.8.1	REQUEST_CHANGES
#1582	fix(a2a) SSE	REQUEST_CHANGES
#1496	fix v3.7.0	REQUEST_CHANGES
#1493	fix v3.7.0	REQUEST_CHANGES
#1490	fix v3.7.0	REQUEST_CHANGES
#1486	fix v3.7.0	REQUEST_CHANGES
#1485	fix v3.7.0	REQUEST_CHANGES
#1484	fix v3.7.0	REQUEST_CHANGES
#1482	fix v3.7.0	REQUEST_CHANGES
#1480	fix v3.7.0	REQUEST_CHANGES
#1280	fix	REQUEST_CHANGES

Common reasons cited for REQUEST_CHANGES:

Missing CONTRIBUTING.md metadata (closing keywords, milestones, labels)
examples.json data loss (replacing existing entries instead of appending)
Merge conflicts requiring rebase

Impact of 100% rejection rate:

Zero merge throughput: No PRs are being approved, so even if master CI were green, nothing could merge
Infinite fix-review loop: Implementation workers fix review feedback → reviewer reviews again → REQUEST_CHANGES again → cycle repeats
Worker time wasted: 8 implementation workers are spending all their time fixing review feedback instead of implementing new features
PR backlog growing: 50+ open PRs with no path to merge

Root Cause Analysis

The ca-pr-self-reviewer.md agent definition (line 84) contains this instruction:

YOUR REVIEW AUTHORITY:

REQUEST CHANGES for ANY CONTRIBUTING.md violation

This creates a binary outcome: if ANY CONTRIBUTING.md rule is violated — even minor metadata issues like a missing Closes #N keyword or a missing label — the entire PR gets REQUEST_CHANGES. There is no concept of "minor issues that can be fixed post-merge" or "non-blocking suggestions."

The reviewer correctly identifies issues, but treats ALL issues as equally blocking. A missing PR label is treated the same as a security vulnerability or a broken API contract.

Proposed Change

Add a severity classification to the review decision logic in ca-pr-self-reviewer.md. Modify the "Make a Decision" section (lines 254-337) to distinguish between:

BLOCKING issues (→ REQUEST_CHANGES):

Code correctness bugs (logic errors, race conditions, resource leaks)
Security vulnerabilities
Specification misalignment (wrong behavior implemented)
# type: ignore usage (explicitly forbidden)
Tests in wrong directories or using wrong frameworks
Architectural violations (module boundary, layer violations)

NON-BLOCKING issues (→ APPROVE with comments):

Missing PR metadata (closing keywords, milestone, labels)
Minor formatting or naming suggestions
Documentation improvements
Merge conflicts (these are resolved separately, not a code quality issue)
Missing but non-critical test edge cases

Add this guidance to the decision section:

### Severity Classification

Before deciding APPROVE vs REQUEST_CHANGES, classify each finding:

**BLOCKING** (must be fixed before merge):
- Code correctness issues that would cause bugs in production
- Security vulnerabilities
- Specification violations (wrong behavior)
- Forbidden patterns (type: ignore, pytest tests, wrong directories)
- Architectural violations

**NON-BLOCKING** (note in review, but do not block merge):
- Missing PR metadata (closing keywords, labels, milestone)
- Minor style suggestions
- Merge conflicts (handled by implementation workers separately)
- Documentation improvements
- Test coverage suggestions for non-critical paths

**Decision rule:**
- If ANY blocking issue exists → REQUEST_CHANGES
- If ONLY non-blocking issues exist → APPROVE with comments listing the non-blocking items
- If no issues at all → APPROVE

Also modify line 84 from:

REQUEST CHANGES for ANY CONTRIBUTING.md violation

To:

REQUEST CHANGES for BLOCKING CONTRIBUTING.md violations (code quality, testing framework, type safety, file organization). For non-blocking metadata issues (missing closing keywords, labels, milestone), APPROVE with comments noting the issues.

Expected Impact

Reduced REQUEST_CHANGES rate: PRs with only metadata issues will be approved with comments instead of blocked
Faster merge throughput: PRs that are code-correct but have minor metadata gaps can proceed to merge
Implementation workers freed: Workers can move to new issues instead of endlessly fixing metadata
PR backlog reduction: The 50+ PR backlog should start clearing as PRs get approved
Metadata still tracked: Non-blocking issues are still noted in the review comments, so they can be addressed

Risk Assessment

Medium risk: Loosening review criteria could allow some PRs with metadata issues to merge. However, metadata issues (missing labels, closing keywords) are easily fixable post-merge and don't affect code quality.
Mitigation: The blocking/non-blocking classification is conservative — all code quality, security, and architectural issues remain blocking. Only PR metadata and formatting issues become non-blocking.
Potential concern: If the reviewer misclassifies a blocking issue as non-blocking, a bad PR could be approved. However, the current 100% rejection rate is clearly worse — it's preventing ALL progress, including good PRs.

This is a proposal from the agent evolver. A human must approve this issue before the change will be implemented. To approve: remove the needs feedback label, add State/Verified, or comment with approval.

Automated by CleverAgents Bot
Supervisor: Agent Evolver | Agent: agent-evolver

## Agent Improvement Proposal ### Pattern Detected **Type**: prompt_improvement **Affected Agent**: `ca-pr-self-reviewer` **Evidence**: 100% REQUEST_CHANGES rate across 20+ PR reviews in current session — zero APPROVEs ### Detailed Evidence During the current build session (issue #4373, started 2026-04-08), the continuous-pr-reviewer pool has reviewed 20+ PRs. **Every single review resulted in REQUEST_CHANGES.** Not a single PR was approved. **Review results from session state comments (all REQUEST_CHANGES):** | PR | Type | Review Result | |----|------|---------------| | #4218 | fix(checkpoint) | REQUEST_CHANGES | | #4219 | fix(acms) | REQUEST_CHANGES | | #3900 | UAT ProjectService | REQUEST_CHANGES | | #3880 | fix(validation) | REQUEST_CHANGES | | #3774 | fix Click 8.2+ | REQUEST_CHANGES | | #4197 | fix(plan) | REQUEST_CHANGES | | #3696 | docs v3.8.1 | REQUEST_CHANGES | | #1582 | fix(a2a) SSE | REQUEST_CHANGES | | #1496 | fix v3.7.0 | REQUEST_CHANGES | | #1493 | fix v3.7.0 | REQUEST_CHANGES | | #1490 | fix v3.7.0 | REQUEST_CHANGES | | #1486 | fix v3.7.0 | REQUEST_CHANGES | | #1485 | fix v3.7.0 | REQUEST_CHANGES | | #1484 | fix v3.7.0 | REQUEST_CHANGES | | #1482 | fix v3.7.0 | REQUEST_CHANGES | | #1480 | fix v3.7.0 | REQUEST_CHANGES | | #1280 | fix | REQUEST_CHANGES | **Common reasons cited for REQUEST_CHANGES:** - Missing CONTRIBUTING.md metadata (closing keywords, milestones, labels) - `examples.json` data loss (replacing existing entries instead of appending) - Merge conflicts requiring rebase **Impact of 100% rejection rate:** 1. **Zero merge throughput**: No PRs are being approved, so even if master CI were green, nothing could merge 2. **Infinite fix-review loop**: Implementation workers fix review feedback → reviewer reviews again → REQUEST_CHANGES again → cycle repeats 3. **Worker time wasted**: 8 implementation workers are spending all their time fixing review feedback instead of implementing new features 4. **PR backlog growing**: 50+ open PRs with no path to merge ### Root Cause Analysis The `ca-pr-self-reviewer.md` agent definition (line 84) contains this instruction: > **YOUR REVIEW AUTHORITY:** > - REQUEST CHANGES for ANY CONTRIBUTING.md violation This creates a binary outcome: if ANY CONTRIBUTING.md rule is violated — even minor metadata issues like a missing `Closes #N` keyword or a missing label — the entire PR gets REQUEST_CHANGES. There is no concept of "minor issues that can be fixed post-merge" or "non-blocking suggestions." The reviewer correctly identifies issues, but treats ALL issues as equally blocking. A missing PR label is treated the same as a security vulnerability or a broken API contract. ### Proposed Change Add a **severity classification** to the review decision logic in `ca-pr-self-reviewer.md`. Modify the "Make a Decision" section (lines 254-337) to distinguish between: **BLOCKING issues (→ REQUEST_CHANGES):** - Code correctness bugs (logic errors, race conditions, resource leaks) - Security vulnerabilities - Specification misalignment (wrong behavior implemented) - `# type: ignore` usage (explicitly forbidden) - Tests in wrong directories or using wrong frameworks - Architectural violations (module boundary, layer violations) **NON-BLOCKING issues (→ APPROVE with comments):** - Missing PR metadata (closing keywords, milestone, labels) - Minor formatting or naming suggestions - Documentation improvements - Merge conflicts (these are resolved separately, not a code quality issue) - Missing but non-critical test edge cases Add this guidance to the decision section: ``` ### Severity Classification Before deciding APPROVE vs REQUEST_CHANGES, classify each finding: **BLOCKING** (must be fixed before merge): - Code correctness issues that would cause bugs in production - Security vulnerabilities - Specification violations (wrong behavior) - Forbidden patterns (type: ignore, pytest tests, wrong directories) - Architectural violations **NON-BLOCKING** (note in review, but do not block merge): - Missing PR metadata (closing keywords, labels, milestone) - Minor style suggestions - Merge conflicts (handled by implementation workers separately) - Documentation improvements - Test coverage suggestions for non-critical paths **Decision rule:** - If ANY blocking issue exists → REQUEST_CHANGES - If ONLY non-blocking issues exist → APPROVE with comments listing the non-blocking items - If no issues at all → APPROVE ``` Also modify line 84 from: > REQUEST CHANGES for ANY CONTRIBUTING.md violation To: > REQUEST CHANGES for BLOCKING CONTRIBUTING.md violations (code quality, testing framework, type safety, file organization). For non-blocking metadata issues (missing closing keywords, labels, milestone), APPROVE with comments noting the issues. ### Expected Impact - **Reduced REQUEST_CHANGES rate**: PRs with only metadata issues will be approved with comments instead of blocked - **Faster merge throughput**: PRs that are code-correct but have minor metadata gaps can proceed to merge - **Implementation workers freed**: Workers can move to new issues instead of endlessly fixing metadata - **PR backlog reduction**: The 50+ PR backlog should start clearing as PRs get approved - **Metadata still tracked**: Non-blocking issues are still noted in the review comments, so they can be addressed ### Risk Assessment - **Medium risk**: Loosening review criteria could allow some PRs with metadata issues to merge. However, metadata issues (missing labels, closing keywords) are easily fixable post-merge and don't affect code quality. - **Mitigation**: The blocking/non-blocking classification is conservative — all code quality, security, and architectural issues remain blocking. Only PR metadata and formatting issues become non-blocking. - **Potential concern**: If the reviewer misclassifies a blocking issue as non-blocking, a bad PR could be approved. However, the current 100% rejection rate is clearly worse — it's preventing ALL progress, including good PRs. --- *This is a proposal from the agent evolver. A human must approve this issue before the change will be implemented. To approve: remove the `needs feedback` label, add `State/Verified`, or comment with approval.* --- **Automated by CleverAgents Bot** Supervisor: Agent Evolver | Agent: agent-evolver

Rows
Columns