[AUTO-INF-5] Nightly quality workflow omits integration_tests and e2e_tests — integration regressions go undetected overnight #9953

Open
opened 2026-04-16 07:24:34 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: fix(ci): add integration_tests and e2e_tests to nightly-quality workflow
  • Branch name: fix/nightly-quality-add-integration-e2e-tests

Background and Context

The nightly quality workflow (.forgejo/workflows/nightly-quality.yml) runs static quality checks, unit tests, and coverage, but omits the integration_tests and e2e_tests nox sessions entirely. As a result, integration-level regressions (Robot Framework tests, LLM API contract tests, Helm rendering) can accumulate overnight with no CI signal until a developer opens a PR the next day.

.forgejo/workflows/nightly-quality.yml currently runs the following checks in a single sequential job:

  1. nox -s lint
  2. nox -s format -- --check
  3. nox -s typecheck
  4. nox -s security_scan
  5. nox -s dead_code
  6. nox -s complexity
  7. nox -s unit_tests-3.13
  8. nox -s coverage_report

Missing from the nightly run:

  • nox -s integration_tests — Robot Framework integration tests (requires ANTHROPIC_API_KEY, OPENAI_API_KEY)
  • nox -s e2e_tests — End-to-end Robot Framework tests (requires ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY)

By contrast, .forgejo/workflows/ci.yml runs both integration_tests and e2e_tests on every PR. The nightly run is the only scheduled safety net for catching regressions on the master and develop branches between PRs, yet it skips the two most expensive and most integration-sensitive test suites.

Additionally, the nightly workflow runs all steps sequentially in a single job. If lint fails, format, typecheck, security_scan, and every later step are skipped, so the run yields no partial results. The PR CI workflow avoids this by running each check as an independent parallel job.
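
To make the failure mode concrete, the single-job layout described above might look roughly like the sketch below. It is reconstructed from the step list in this issue, not copied from the repository: the job name `quality` is hypothetical, and the runner, container image, and install step are assumed to match the integration_tests fragment proposed later in this issue.

```yaml
# Hypothetical reconstruction of the current nightly layout.
# The job name "quality" and the container/install details are assumptions.
jobs:
  quality:
    runs-on: docker
    container:
      image: python:3.13-slim
    steps:
      - uses: actions/checkout@v4
      - name: Install uv and nox
        run: pip install -q uv==0.8.0 nox
      # Steps run in order. By default a non-zero exit stops the job,
      # so a lint failure means none of the steps below report anything.
      - run: nox -s lint
      - run: nox -s format -- --check
      - run: nox -s typecheck
      - run: nox -s security_scan
      - run: nox -s dead_code
      - run: nox -s complexity
      - run: nox -s unit_tests-3.13
      - run: nox -s coverage_report
```

With this shape, the nightly result is only as informative as the first failing step.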

Expected Behavior

  • The nightly quality workflow runs integration_tests and e2e_tests in addition to the existing quality checks.
  • Integration and e2e regressions on master/develop are caught overnight rather than discovered when a developer opens a PR.
  • The nightly run provides the same test coverage as the PR CI, ensuring no test tier is silently skipped.
  • Parallel jobs reduce total nightly wall-clock time compared to the current sequential single-job approach.

Acceptance Criteria

  • .forgejo/workflows/nightly-quality.yml includes an integration_tests job that runs nox -s integration_tests with ANTHROPIC_API_KEY and OPENAI_API_KEY secrets
  • .forgejo/workflows/nightly-quality.yml includes an e2e_tests job that runs nox -s e2e_tests with ANTHROPIC_API_KEY, OPENAI_API_KEY, and GOOGLE_API_KEY secrets
  • The nightly workflow jobs run in parallel (matching the structure of ci.yml) so a failure in one check does not suppress results from others
  • The integration_tests job sets CLEVERAGENTS_REQUIRE_HELM_RENDER_ASSERTIONS: "true" to enforce Helm rendering assertions
  • Both new jobs use timeout-minutes: 90 to prevent runaway test execution
  • Nightly CI passes on master and develop branches with the new jobs included
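
As an illustration of the e2e_tests criterion, a job along the following lines would satisfy it. This is a sketch modeled on the integration_tests fragment under Proposed Improvement, assuming the same runner, container image, and install step apply to e2e runs; the step names are illustrative, and the actual ci.yml job may differ.

```yaml
# Sketch of an e2e_tests job entry (to be placed under the workflow's jobs: key).
# Runner, image, and install step are assumed to match the integration_tests fragment.
e2e_tests:
  runs-on: docker
  timeout-minutes: 90          # acceptance criterion: cap runaway test execution
  container:
    image: python:3.13-slim
  steps:
    - uses: actions/checkout@v4
    - name: Install uv and nox
      run: pip install -q uv==0.8.0 nox
    - name: Run e2e tests via nox
      run: nox -s e2e_tests
      env:
        NOX_DEFAULT_VENV_BACKEND: uv
        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        # e2e_tests needs this third key in addition to the two above
        GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
```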

Subtasks

  • Add integration_tests job to .forgejo/workflows/nightly-quality.yml with appropriate secrets and environment variables
  • Add e2e_tests job to .forgejo/workflows/nightly-quality.yml with appropriate secrets and environment variables
  • Refactor the monolithic single-job into parallel jobs matching the structure of ci.yml
  • Verify the new jobs pass on a test branch before merging to develop
  • Update any documentation or runbooks that describe the nightly workflow scope

Definition of Done

This issue should be closed when:

  1. .forgejo/workflows/nightly-quality.yml includes both integration_tests and e2e_tests jobs running in parallel with the existing quality checks
  2. The nightly workflow successfully runs all jobs on master and develop without skipping integration or e2e test suites
  3. A PR has been merged to develop with the workflow changes and all CI checks pass

Proposed Improvement

  1. Add integration_tests and e2e_tests jobs to nightly-quality.yml, conditioned on the relevant API key secrets being available. The fragment below covers integration_tests only and does not show the condition; e2e_tests would additionally need GOOGLE_API_KEY:

       integration_tests:
         runs-on: docker
         timeout-minutes: 90  # prevent runaway test execution
         container:
           image: python:3.13-slim
         steps:
           - uses: actions/checkout@v4
           - name: Install uv and nox
             run: pip install -q uv==0.8.0 nox
           - name: Run integration tests via nox
             run: nox -s integration_tests
             env:
               NOX_DEFAULT_VENV_BACKEND: uv
               ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
               OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
               # Enforce Helm rendering assertions
               CLEVERAGENTS_REQUIRE_HELM_RENDER_ASSERTIONS: "true"

  2. Split the monolithic single job into parallel jobs (matching the structure of ci.yml) so that a failure in one check does not suppress results from all others.
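
The split into parallel jobs could be sketched as follows. The structure of ci.yml is not shown in this issue, so the job names and the grouping here are assumptions rather than a copy of it. Only three of the checks are shown; the remaining ones (format, security_scan, dead_code, complexity) would follow the same pattern.

```yaml
# Sketch only: job names and grouping are assumptions, not taken from ci.yml.
# Jobs with no "needs:" dependency are scheduled independently, so one job
# failing does not stop the others from running and reporting.
jobs:
  lint:
    runs-on: docker
    container:
      image: python:3.13-slim
    steps:
      - uses: actions/checkout@v4
      - name: Install uv and nox
        run: pip install -q uv==0.8.0 nox
      - name: Run lint via nox
        run: nox -s lint
        env:
          NOX_DEFAULT_VENV_BACKEND: uv

  typecheck:
    runs-on: docker
    container:
      image: python:3.13-slim
    steps:
      - uses: actions/checkout@v4
      - name: Install uv and nox
        run: pip install -q uv==0.8.0 nox
      - name: Run typecheck via nox
        run: nox -s typecheck
        env:
          NOX_DEFAULT_VENV_BACKEND: uv

  unit_tests:
    runs-on: docker
    container:
      image: python:3.13-slim
    steps:
      - uses: actions/checkout@v4
      - name: Install uv and nox
        run: pip install -q uv==0.8.0 nox
      - name: Run unit tests via nox
        run: nox -s unit_tests-3.13
        env:
          NOX_DEFAULT_VENV_BACKEND: uv
      # Kept in the same job on the assumption that coverage_report reads
      # coverage data written by the unit test session.
      - name: Build coverage report via nox
        run: nox -s coverage_report
        env:
          NOX_DEFAULT_VENV_BACKEND: uv
```

One design point to check when doing the real refactor: if coverage_report depends on files produced by unit_tests-3.13, moving it to its own job would require passing those files between jobs, which is why the sketch leaves the two sessions together.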

Expected Impact

  • Earlier regression detection: Integration and e2e regressions on master/develop are caught overnight rather than discovered when a developer opens a PR.
  • Consistent quality signal: The nightly run provides the same coverage as the PR CI, ensuring no test tier is silently skipped.
  • Faster nightly feedback: Parallel jobs reduce total nightly wall-clock time compared to the current sequential single-job approach.

Duplicate Check

  • Searched open issues for keywords: nightly, nightly-quality, integration_tests, e2e_tests, nightly enforcement, scheduled
  • Searched closed issues for keywords: nightly, nightly quality, integration tests nightly, e2e nightly
  • Found related open issue: [AUTO-INF-5] Harden CI quality gates (coverage parity, docs build, nightly enforcement) — that issue focuses on coverage parity and docs build; it does not specifically address adding integration_tests and e2e_tests sessions to the nightly schedule
  • Found closed issue: Nightly quality workflow silently ignores failed quality-gates script — that issue is about the quality-gates Python script failing silently, not about missing test suites
  • Searched for AUTO-INF worker issues (AUTO-INF-1 through AUTO-INF-10): none specifically propose adding integration/e2e sessions to the nightly workflow
  • Result: No duplicates found

Automated by CleverAgents Bot
Supervisor: Test Infrastructure Pool | Agent: test-infra-pool-supervisor
Worker: [AUTO-INF-5] CI Pipeline Design Analysis

Author
Owner

🔍 Triage Decision — Verified

Type: Bug / CI Infrastructure
Priority: High
MoSCoW: Should Have

This issue is verified. The nightly quality workflow (.forgejo/workflows/nightly-quality.yml) is confirmed to omit integration_tests and e2e_tests nox sessions, meaning integration-level regressions on master/develop can accumulate silently overnight. The current sequential single-job structure also suppresses partial results when early steps fail.

Rationale:

  • The PR CI workflow already runs both integration_tests and e2e_tests on every PR; expecting the nightly run to match that coverage is a reasonable quality gate.
  • Parallel job structure is a straightforward improvement that mirrors the existing ci.yml design.
  • This is classified Should Have (not Must Have) because the PR CI already catches these regressions before merge; the nightly run is an additional safety net rather than the primary gate.

Next steps: A developer should add the integration_tests and e2e_tests jobs to nightly-quality.yml and refactor the monolithic job into parallel jobs matching ci.yml structure.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#9953