Proposal: improve ca-test-infra-improver — prevent false positive TLS issues and duplicate filing #2029

Open
opened 2026-04-03 02:43:04 +00:00 by freemo · 1 comment
Owner

Agent Improvement Proposal

Pattern Detected

Type: Prompt improvement — false positive infrastructure issues and massive duplicate filing
Affected Agent: ca-test-infra-improver (Worker Mode)
Evidence: During the v3.7.0 session, the test-infra-improver workers filed at least 12 duplicate issues about TLS/SSL clone failures on git.cleveragents.com:

| Issue | Title | Status |
|---|---|---|
| #1592 | TEST-INFRA: [ci-pipeline-design] Git clone fails with gnutls_handshake error | Open |
| #1596 | TEST-INFRA: [ci-pipeline-design] Repository is empty | Open |
| #1598 | TEST-INFRA: [ci-pipeline-design] Unable to clone repository | Open |
| #1608 | INFRA: Git clone fails with TLS handshake error for git.cleveragents.com | Open |
| #1614 | TEST-INFRA: [ci-pipeline-design] Implement explicit dependency caching for uv | Open |
| #1615 | TEST-INFRA: [ci-execution-time] Git clone fails with TLS error | Open |
| #1616 | TEST-INFRA: [ci-pipeline-design] Implement CI dependency caching | Open |
| #1622 | TEST-INFRA: [ci-pipeline-design] Optimize job dependencies and sequencing | Open |
| #1634 | TEST-INFRA: [test-data-quality] Improve test data management | Open |
| #1641 | TEST-INFRA: [ci-execution-time] High execution time for CI quality check | Open |
| #1645 | TEST-INFRA: [ci-execution-time] CRITICAL: Git repository is inaccessible due to TLS/SNI misconfiguration | Open |

Root Cause: The test-infra-improver workers construct the git clone URL with git.cleveragents.com (derived from the organization name "cleveragents") instead of the actual Forgejo host, git.cleverthis.com. When the clone fails, each worker files a separate issue about the failure. Because each worker has a different focus_area, the titles differ enough that the duplicate avoidance logic (which searches for the "TEST-INFRA:" prefix plus similar titles) does not catch them.
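
To illustrate why matching on the prefix plus similar titles misses these duplicates, here is a minimal sketch of a keyword-based check. The keyword set and both function names are hypothetical and are not part of the existing agent; only the example keywords ("TLS", "clone", "gnutls") come from this proposal.

```python
import re

# Hypothetical keyword set describing the clone failure (assumption, not agent config).
FAILURE_KEYWORDS = {"tls", "clone", "gnutls"}


def failure_keywords(title: str) -> set[str]:
    """Return the failure keywords that appear as whole tokens in an issue title."""
    # Split on anything that is not a letter or digit, so "gnutls_handshake"
    # yields the tokens "gnutls" and "handshake".
    tokens = set(re.findall(r"[a-z0-9]+", title.lower()))
    return tokens & FAILURE_KEYWORDS


def is_probable_duplicate(new_title: str, open_titles: list[str]) -> bool:
    """True if any open issue shares at least one failure keyword with the new title."""
    new_hits = failure_keywords(new_title)
    if not new_hits:
        return False  # the new title does not describe a clone/TLS failure
    return any(new_hits & failure_keywords(title) for title in open_titles)
```

With the titles from the table, a new "[ci-execution-time] Git clone fails with TLS error" issue would match #1592 on the token "clone" even though the focus area and wording differ. A real check would likely also search issue bodies, which this sketch does not attempt.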

Impact:

  • 12+ false positive issues polluting the issue tracker
  • Every worker session is wasted because none can actually clone the repo
  • Human review time wasted triaging duplicate infrastructure issues
  • The test-infra-improver has produced ZERO useful analysis across all 8 focus areas because every worker fails at the clone step

Related: This is the same root cause as the bug-hunter proposal #1595, but affects a different agent with a different manifestation (duplicate filing pattern).

Proposed Change

Modify the Worker Mode section in ca-test-infra-improver.md to:

  1. Add explicit hostname resolution guidance — instruct the agent to derive the git host from the Forgejo PAT URL or the FORGEJO_URL environment variable, NOT from the organization name. Add a warning: "The Forgejo host is NOT necessarily git.<org-name>.com. Always use the host from the PAT URL provided in your prompt (e.g., if PAT URL contains git.cleverthis.com, use that as the host)."

  2. Add scope restriction for infrastructure issues — instruct the agent that server infrastructure issues (TLS certificates, DNS, server configuration) are OUT OF SCOPE. The test-infra-improver should only file issues about the testing infrastructure within the repository (CI configs, nox sessions, test organization). If the clone fails, the agent should exit with an error, not file an issue.

  3. Add clone failure handling — if git clone fails, the agent should: (a) verify it's using the correct hostname from the PAT URL, (b) retry once, (c) if still failing, exit with a clear error message rather than filing a false positive issue.

  4. Strengthen duplicate detection — add guidance that before filing ANY issue, the agent should search for existing open issues containing keywords from the error message (e.g., "TLS", "clone", "gnutls"), not just the "TEST-INFRA:" prefix.
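
The hostname resolution and clone failure handling described in items 1 and 3 could be sketched roughly as follows. This is an illustration under assumptions, not the agent's actual implementation: the function names are hypothetical, and it assumes `FORGEJO_URL` holds a full URL including the scheme.

```python
import subprocess
import sys
from urllib.parse import urlparse


def resolve_git_host(forgejo_url: str) -> str:
    """Take the host from the Forgejo URL instead of guessing it from the org name."""
    host = urlparse(forgejo_url).hostname
    if not host:
        # A URL without a scheme (e.g. "git.example.com") has no parsed hostname.
        raise ValueError(f"cannot determine host from {forgejo_url!r}")
    return host


def clone_with_retry(clone_url: str, dest: str, attempts: int = 2) -> None:
    """Run `git clone` up to `attempts` times, then exit rather than file an issue."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(
            ["git", "clone", clone_url, dest],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return
        print(
            f"clone attempt {attempt} failed: {result.stderr.strip()}",
            file=sys.stderr,
        )
    # Exit with a non-zero status and a clear message; no issue is filed.
    sys.exit(f"git clone failed after {attempts} attempts; verify the host in {clone_url}")
```

A worker might call `resolve_git_host(os.environ["FORGEJO_URL"])`, build the clone URL from the returned host, and pass it to `clone_with_retry`. The default of two attempts corresponds to "retry once" in item 3.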

Expected Impact

  • Eliminates 12+ false positive infrastructure issues per session
  • Ensures test-infra-improver workers can actually clone and analyze the repo
  • Reduces noise in the issue tracker significantly
  • Allows the 8 analysis areas to actually be covered

Risk Assessment

  • Very low risk: These changes only add guardrails and guidance. No analysis logic is modified.
  • Potential concern: If a genuine CI pipeline issue involves TLS (e.g., CI runner can't fetch dependencies due to certificate issues), the scope restriction might cause the agent to miss it. However, the restriction is specifically about the git server infrastructure, not about CI pipeline TLS issues within the repository's CI configuration.

This is a proposal from the agent evolver. A human must approve this issue before the change will be implemented. To approve: remove the needs feedback label, add State/Verified, or comment with approval.


Automated by CleverAgents Bot
Supervisor: Agent Evolver | Agent: ca-agent-evolver

Author
Owner

approved

Reference
cleveragents/cleveragents-core#2029