CI Observability and Agent-Accessible Diagnostics #2749

Open
opened 2026-04-04 17:16:27 +00:00 by freemo · 3 comments
Owner

Background and Context

The Forgejo REST API does not expose CI job logs directly. When autonomous agents need to diagnose CI failures on pull requests, they have no way to access the actual output from failed nox sessions in CI. This forces agents to re-run nox locally in their cloned repos, which is:

  1. Slow — re-running the full nox session duplicates work already done in CI.
  2. Inaccurate — the local environment may differ from CI (different Python version, missing system dependencies, different OS).
  3. Wasteful — agent time and LLM tokens are spent reproducing failures instead of fixing them.

This epic covers making CI outputs accessible to agents as downloadable artifacts, and updating agent definitions so they know how to consume these artifacts.

Acceptance Criteria

  1. All nox-running CI jobs in .forgejo/workflows/ci.yml capture their complete output (stdout + stderr) to a text file.
  2. These log files are uploaded as Forgejo artifacts with a consistent naming convention.
  3. Artifacts are uploaded even when the job fails (using if: always()).
  4. All relevant agent definitions (ca-pr-checker, ca-lint-fixer, ca-typecheck-fixer, ca-unit-test-runner, ca-integration-test-runner, ca-coverage-checker, ca-pr-self-reviewer) are updated with instructions on how to download and read these artifacts.
  5. Agents prefer artifact-based diagnosis over local nox re-runs when CI logs are available.
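
Criteria 1 and 3 depend on two things: stdout and stderr end up in one file, and the job's original exit status survives so the failure is still reported. The sketch below shows one possible way to do that in Python. The function name `run_and_capture` and the log path are hypothetical and are not taken from the actual `ci.yml`.

```python
import subprocess
import sys
from pathlib import Path
from typing import Sequence


def run_and_capture(cmd: Sequence[str], log_path: Path) -> int:
    """Run cmd, mirror its combined stdout+stderr to log_path, return its exit code.

    Hypothetical helper for illustration; the real workflow may use shell
    redirection instead.
    """
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log:
        proc = subprocess.Popen(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
        )
        assert proc.stdout is not None
        for line in proc.stdout:
            sys.stdout.write(line)  # keep the output visible in the live job console
            log.write(line)         # and persist it for the artifact upload step
        return proc.wait()          # preserve the original exit status
```

A CI step could then call something like `sys.exit(run_and_capture(["nox", "-s", "lint"], Path("ci-logs/lint.log")))`, so the job still fails on a nox error while the upload step (guarded by `if: always()`) picks up the log file.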

Scope

  • CI workflow modifications to capture nox output as text log files.
  • Artifact upload configuration for all nox-based CI jobs.
  • Agent definition updates for artifact awareness and consumption.
  • Future extensions: structured error parsing from CI logs, CI log summaries, artifact-based auto-diagnosis.

Demonstrable Outcome

When this epic is complete, an agent diagnosing a CI failure on a PR can:

  1. Query the Forgejo API for CI run artifacts.
  2. Download the relevant log file (e.g., ci-logs-lint for a lint failure).
  3. Read the exact nox output from CI to identify the failure.
  4. Fix the issue based on the actual CI output, without re-running nox locally.
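
Steps 2 and 3 above could look roughly like the following on the agent side. This is a sketch under stated assumptions: it takes an already-fetched list of artifact records (assumed to be dictionaries with a `name` key, which is a guess about the response shape, not a documented Forgejo schema), and it assumes nox reports failures with lines of the form `nox > Session <name> failed.`. Only the `ci-logs-<job>` naming convention comes from this epic; the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

ARTIFACT_PREFIX = "ci-logs-"  # naming convention from this epic: ci-logs-<job>

# Assumed nox log line format, e.g. "nox > Session lint failed."
_FAILED_SESSION = re.compile(r"^nox > Session (.+?) failed\.\s*$", re.MULTILINE)


def pick_log_artifact(artifacts: List[Dict], job: str) -> Optional[Dict]:
    """Return the artifact record named ci-logs-<job>, or None if it is absent."""
    wanted = ARTIFACT_PREFIX + job
    for artifact in artifacts:
        if artifact.get("name") == wanted:
            return artifact
    return None


def failed_sessions(log_text: str) -> List[str]:
    """List the nox session names that the log reports as failed, in order."""
    return _FAILED_SESSION.findall(log_text)
```

If `pick_log_artifact` returns `None`, the agent has no CI log to read and would fall back to re-running nox locally, which matches criterion 5 (prefer artifacts only when they are available).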

Child Issues

(Linked via Forgejo dependencies)

Author
Owner

Label compliance fix applied:

  • Added missing label: Priority/Backlog
  • Reason: Epic was missing required Priority/* label per CONTRIBUTING.md. Applied Priority/Backlog as default.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Author
Owner

Epic completeness check:

The child issue #2750 ("chore(ci): capture nox output as CI artifacts and update agent definitions to consume them") was closed on 2026-04-04 after PR #2782 was merged.

Acceptance Criteria Review:

  1. All nox-running CI jobs capture complete output (stdout + stderr) to text files — implemented in PR #2782
  2. Log files uploaded as Forgejo artifacts with consistent naming convention (ci-logs-<job>) — implemented in PR #2782
  3. Artifacts uploaded even when job fails (if: always()) — implemented in PR #2782
  4. All relevant agent definitions updated with artifact download instructions — implemented in PR #2782
  5. Agents prefer artifact-based diagnosis over local nox re-runs — implemented in PR #2782

All acceptance criteria appear to be met. This epic may be ready to close. Please review and close if the implementation is satisfactory.


Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Owner

MoSCoW classification: MoSCoW/Should Have

Rationale: This Epic addresses a real operational inefficiency — autonomous agents currently cannot access CI logs directly, forcing them to re-run nox locally (slow, inaccurate, wasteful). Making CI artifacts accessible would significantly improve agent effectiveness.

Why Should Have (not Must Have):

  • The system currently functions without this — agents can still diagnose failures by re-running nox locally
  • This is an optimization/quality-of-life improvement for the autonomous agent system, not core product functionality
  • No active milestone explicitly requires this for delivery

Why Should Have (not Could Have):

  • The impact is significant: agents waste substantial time and tokens re-running CI locally
  • With 32 parallel workers and frequent CI failures, this overhead compounds rapidly
  • The implementation is well-scoped and the acceptance criteria are clear
  • This directly enables faster PR fixing cycles, which is critical for overall velocity

Milestone Recommendation: This should be assigned to v3.7.0 or v3.8.0 as a quality-of-life improvement for the autonomous agent system.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner

Reference
cleveragents/cleveragents-core#2749