Proposal: update specification — sync docs/development/ci-cd.md with actual CI pipeline (11-job status-check, artifact uploads, e2e_tests, helm) #2805

Open
opened 2026-04-04 20:31:39 +00:00 by freemo · 3 comments
Owner

Proposal: Sync docs/development/ci-cd.md with Actual CI Pipeline

What Changed in the Implementation

The following merged PRs triggered this proposal:

  • PR #2782 (chore(ci): capture nox output as CI artifacts and teach agents to read them) — Added artifact upload steps to all 8 nox-running CI jobs, capturing stdout+stderr as named Forgejo artifacts (ci-logs-<job>). Also added e2e_tests and helm jobs to the status-check consolidation gate, making the total 11 required jobs.
  • Direct pushes to master (commits 77427bd7d32f, 8c13e63c750a, dd17d0f8e698) — Added e2e_tests job and helm job to the CI pipeline, both of which are now required by status-check.

The actual .forgejo/workflows/ci.yml now has 11 jobs in the status-check consolidation gate:
lint, typecheck, security, quality, unit_tests, integration_tests, e2e_tests, coverage, build, docker, helm

What Spec Sections Need Updating

1. Required Status Checks Table (lines 20–29)

Current text:

| CI Job | What It Checks | Failure Means |
|--------|---------------|---------------|
| `lint` | Ruff format and lint (nox) | Code style violations or lint errors |
| `typecheck` | Pyright strict type checking (nox) | Type errors in `src/` |
| `security` | Security scan + dead code (nox) | Security vulnerabilities or dead code |
| `quality` | Radon complexity (nox) | Extremely complex methods (31+ cyclomatic complexity) |
| `unit_tests` | BDD unit tests via `nox -s unit_tests` | Failing unit test scenarios |
| `integration_tests` | Robot integration tests via `nox -s integration_tests` | Failing integration tests |
| `coverage` | Test coverage measurement (nox) | Coverage dropped below 97% |
| `build` | Wheel build | Build failure |

Proposed text:

| CI Job | What It Checks | Failure Means |
|--------|---------------|---------------|
| `lint` | Ruff format and lint (nox) | Code style violations or lint errors |
| `typecheck` | Pyright strict type checking (nox) | Type errors in `src/` |
| `security` | Security scan + dead code (nox) | Security vulnerabilities or dead code |
| `quality` | Radon complexity (nox) | Extremely complex methods (31+ cyclomatic complexity) |
| `unit_tests` | BDD unit tests via `nox -s unit_tests` | Failing unit test scenarios |
| `integration_tests` | Robot integration tests via `nox -s integration_tests` | Failing integration tests |
| `e2e_tests` | End-to-end Robot tests with real LLM APIs via `nox -s e2e_tests` | Failing E2E tests |
| `coverage` | Test coverage measurement (nox) | Coverage dropped below 97% |
| `build` | Wheel build | Build failure |
| `docker` | Docker image build and smoke test | Docker build failure |
| `helm` | Helm chart lint, template render, kubeconform validation | Helm chart invalid |

Rationale: e2e_tests, docker, and helm are all required by status-check in ci.yml but were missing from the Required Status Checks table.
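
The gap described in this rationale can be checked mechanically. Below is a minimal Python sketch, assuming the two job lists are copied by hand from this proposal rather than parsed from ci.yml or ci-cd.md; the function name `undocumented_jobs` is hypothetical.

```python
# Jobs required by status-check, as listed in this proposal.
REQUIRED_BY_STATUS_CHECK = [
    "lint", "typecheck", "security", "quality", "unit_tests",
    "integration_tests", "e2e_tests", "coverage", "build", "docker", "helm",
]

# Jobs in the current Required Status Checks table, as quoted above.
DOCUMENTED_JOBS = [
    "lint", "typecheck", "security", "quality", "unit_tests",
    "integration_tests", "coverage", "build",
]


def undocumented_jobs(required, documented):
    """Return required jobs that the docs do not mention, in pipeline order."""
    seen = set(documented)
    return [job for job in required if job not in seen]


if __name__ == "__main__":
    # prints ['e2e_tests', 'docker', 'helm']
    print(undocumented_jobs(REQUIRED_BY_STATUS_CHECK, DOCUMENTED_JOBS))
```

A check of this kind could run as part of a docs lint step, but wiring it into the pipeline is outside the scope of this proposal.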

2. Deployment-gating jobs section (lines 31–35)

Current text:

**Deployment-gating jobs** (run after core checks pass):

| CI Job | Depends On | What It Checks |
|--------|-----------|----------------|
| `docker` | `lint`, `typecheck`, `unit_tests`, `security` | Docker image builds and runs |

Proposed text: Remove this section entirely — docker and helm are now full required jobs in status-check, not optional deployment-gating jobs. The docker job depends on [lint, typecheck, security, quality, unit_tests] per ci.yml.

Rationale: The distinction between "required" and "deployment-gating" no longer exists — all 11 jobs are required by status-check.

3. Branch Protection Setup section (lines 52–78)

Current text (required checks list):

[x] Require status checks to pass before merging
    - Required checks:
      - lint
      - typecheck
      - security
      - quality
      - behave
      - coverage
      - build

Proposed text:

[x] Require status checks to pass before merging
    - Required checks:
      - status-check

Rationale: The status-check consolidation job is the single required check because it depends on all 11 jobs. Listing individual jobs is redundant, and the existing list was already outdated: it uses behave instead of unit_tests and omits integration_tests, e2e_tests, docker, and helm.
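
The consolidation-gate pattern can be illustrated as follows. This is a sketch of the idea only, not the logic in ci.yml: the `status_check` function and the result strings ("success", "failure") are assumptions chosen for illustration.

```python
# The 11 jobs that status-check depends on, per this proposal.
REQUIRED_JOBS = (
    "lint", "typecheck", "security", "quality", "unit_tests",
    "integration_tests", "e2e_tests", "coverage", "build", "docker", "helm",
)


def status_check(results):
    """Return (passed, blocking_jobs) for a mapping of job name -> result.

    A job blocks the gate if it is missing from `results` or if its
    result is anything other than "success" (hypothetical result value).
    """
    blocking = [job for job in REQUIRED_JOBS if results.get(job) != "success"]
    return (not blocking, blocking)


if __name__ == "__main__":
    all_green = {job: "success" for job in REQUIRED_JOBS}
    print(status_check(all_green))
    print(status_check({**all_green, "helm": "failure"}))
```

Because branch protection only needs to know about the gate, adding or renaming a job changes the gate's dependency list and leaves the branch protection settings untouched.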

4. CI Job Dependency Graph (lines 153–165)

Current text:

lint ──────────────────┐
typecheck ─────────────┤
                       ├── coverage (needs lint + typecheck)
security ──────────────┤
                       ├── docker (needs lint + typecheck + unit_tests + security)
unit_tests ────────────┘
integration_tests ──────── (independent)
quality ────────────────── (independent)
build ──────────────────── (independent)

Proposed text:

lint ──────────────────┐
typecheck ─────────────┤
security ──────────────┤
quality ───────────────┴── coverage (needs lint + typecheck + security + quality)
                       └── benchmark-regression (needs lint + typecheck + security + quality, PR only)

lint ──────────────────┐
typecheck ─────────────┤
security ──────────────┤
quality ───────────────┤
unit_tests ────────────┴── docker (needs lint + typecheck + security + quality + unit_tests)

unit_tests ─────────────── (independent)
integration_tests ──────── (independent)
e2e_tests ──────────────── (independent, 45-minute timeout)
quality ────────────────── (independent)
build ──────────────────── (independent)
helm ───────────────────── (independent)

11 required jobs ───────── status-check (needs all 11 jobs; benchmark-regression is not one of them)

Rationale: The actual ci.yml shows coverage depends on [lint, typecheck, security, quality] (not just lint + typecheck), and docker depends on [lint, typecheck, security, quality, unit_tests]. The graph was missing e2e_tests, helm, and status-check.
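
The dependencies stated in this section can also be written as data and ordered with Python's standard-library `graphlib`. This is a minimal sketch, assuming the `needs` lists below are transcribed from this proposal and not read from ci.yml. It omits the PR-only benchmark-regression job, and the `waves` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# job -> jobs it needs, transcribed from the proposed dependency graph.
NEEDS = {
    "lint": [], "typecheck": [], "security": [], "quality": [],
    "unit_tests": [], "integration_tests": [], "e2e_tests": [],
    "build": [], "helm": [],
    "coverage": ["lint", "typecheck", "security", "quality"],
    "docker": ["lint", "typecheck", "security", "quality", "unit_tests"],
}
# The gate needs all 11 jobs defined above.
NEEDS["status-check"] = sorted(NEEDS)


def waves(needs):
    """Group jobs into waves; each wave only needs jobs from earlier waves."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


if __name__ == "__main__":
    for number, jobs in enumerate(waves(NEEDS), start=1):
        print(number, jobs)
```

Under these assumptions, the nine jobs without dependencies form the first wave, coverage and docker the second, and status-check the last. The waves describe ordering constraints only; actual scheduling depends on runner availability.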

5. Quality Gates Summary table (lines 174–188)

Current text:

| Unit Tests | All pass (3.11-3.13) | `behave` job |

Proposed text:

| Unit Tests | All pass | `unit_tests` job |
| E2E Tests | All pass | `e2e_tests` job |
| Helm | Chart valid | `helm` job |

Rationale: The job was renamed from behave to unit_tests. E2E tests and Helm are now required quality gates. The multi-Python-version claim (3.11-3.13) is not reflected in ci.yml, which only uses Python 3.13.

6. CI Secrets table (lines 252–260)

Current text: Missing GOOGLE_API_KEY.

Proposed addition:

| `GOOGLE_API_KEY` | `e2e_tests` | Google API key for E2E tests using Google LLM providers |

Rationale: ci.yml line 320 passes GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }} to e2e_tests but the secrets table doesn't document it.

7. New section: CI Artifact Capture (after line 198)

Proposed new section:

### CI Artifact Capture

All 8 nox-running CI jobs capture their stdout+stderr output as named Forgejo
artifacts, uploaded even when the job fails (using `if: always()`). This
provides immediate access to structured CI logs for diagnosis.

| Artifact Name | Job | Contents |
|---------------|-----|----------|
| `ci-logs-lint` | `lint` | Ruff lint + format check output |
| `ci-logs-typecheck` | `typecheck` | Pyright output |
| `ci-logs-security` | `security` | Bandit + Vulture output |
| `ci-logs-quality` | `quality` | Radon complexity output |
| `ci-logs-unit-tests` | `unit_tests` | Behave BDD test output |
| `ci-logs-integration-tests` | `integration_tests` | Robot Framework output |
| `ci-logs-e2e-tests` | `e2e_tests` | E2E Robot Framework output |
| `ci-logs-coverage` | `coverage` | Slipcover coverage report |

Artifacts are retained for **30 days** and can be downloaded from the Forgejo
Actions UI under the relevant workflow run. The `coverage` job additionally
uploads `coverage-reports` (XML, JSON, and HTML) as a separate artifact.

To access artifacts: **Actions** > select the workflow run > scroll to the
**Artifacts** section at the bottom.

Rationale: PR #2782 added this feature but it was never documented in ci-cd.md.
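
The artifact names in the proposed table follow a regular pattern: the prefix `ci-logs-` plus the job name with underscores replaced by hyphens. The sketch below derives the names that way. The rule is inferred from the table, not taken from ci.yml, and the `artifact_name` helper is hypothetical.

```python
# The 8 nox-running jobs listed in the proposed artifact table.
NOX_JOBS = [
    "lint", "typecheck", "security", "quality",
    "unit_tests", "integration_tests", "e2e_tests", "coverage",
]


def artifact_name(job):
    """Derive the log artifact name for a job, e.g. unit_tests -> ci-logs-unit-tests."""
    return "ci-logs-" + job.replace("_", "-")


if __name__ == "__main__":
    for job in NOX_JOBS:
        print(f"{job:20} {artifact_name(job)}")
```

A tool or agent that needs the logs for a failed job could use a mapping like this to locate the right artifact, provided the naming in ci.yml matches the table.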

Scope

All changes are to docs/development/ci-cd.md only. No changes to docs/specification.md or any source code.

Classification

This is a documentation update to bring docs/development/ci-cd.md into alignment with the actual CI pipeline as implemented in .forgejo/workflows/ci.yml. The implementation is correct; the documentation is stale.


Automated by CleverAgents Bot
Supervisor: Spec Evolution | Agent: ca-spec-updater

Author
Owner

Label compliance fix applied:

  • Added missing label: Priority/Backlog (862)
  • Reason: Issue was missing a Priority/* label. Per CONTRIBUTING.md, every issue must have exactly one Priority/* label. Priority/Backlog assigned as default for a proposal/feedback issue.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Author
Owner

approved

freemo added this to the v3.6.0 milestone 2026-04-05 07:08:58 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Backlog
  • Milestone: v3.6.0 (assigned — documentation alignment fits the Advanced Concepts scope)
  • MoSCoW: Should Have — keeping CI documentation in sync with the actual pipeline is important for developer onboarding and agent accuracy; stale docs cause confusion and incorrect assumptions

The proposal is thorough and well-documented with specific line references and proposed text. All 7 sections need updating. This is a documentation-only change with no code risk.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Reference
cleveragents/cleveragents-core#2805