CI/Release: gate artifacts on unit, integration, and benchmark suites #9701

Open
opened 2026-04-15 03:18:03 +00:00 by HAL9000 · 2 comments
Owner

Metadata

  • Branch: task/release-test-gates
  • Commit Message: ci(release): gate release artifacts on core test suites

Background and Context

  • CONTRIBUTING.md mandates multi-level testing (unit, integration, performance benchmarks) via nox for every task.
  • .forgejo/workflows/release.yml currently builds artifacts without invoking any unit, integration, or benchmark suites.
  • Release tags can be pushed even when CI is red, so a tag-triggered release can publish wheels/images without verified test gates.

Current Behavior

  • release.yml defines build-wheel, build-docker, and create-release jobs that only run packaging steps (e.g., nox -s build) and Forgejo release upload logic.
  • No release job runs nox -s unit_tests, integration_tests, coverage_report, or benchmark/benchmark_regression; there is no TEST_PROCESSES=32 enforcement.
  • Consequently, release artifacts may ship with unverified unit/integration coverage and no ASV benchmark regression signal.

Expected Behavior

  • Release workflow should block artifact publication on successful completion of the mandated unit, integration, coverage, and benchmark nox sessions using the required parallelism and coverage threshold.
  • Test logs and artifacts should be uploaded so release runs provide the same visibility as CI.

Acceptance Criteria

  • Add a pre-release-tests job in .forgejo/workflows/release.yml that runs nox -s unit_tests, integration_tests, coverage_report, and benchmark_regression with TEST_PROCESSES=32 (or documented equivalent enforcing 32-way parallelism).
  • Ensure required secrets (e.g., ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY) are wired for integration tests; document fallback behavior if unavailable.
  • Make build-wheel, build-docker, and create-release depend on the new test job so artifacts publish only after all gates succeed.
  • Upload release test logs/coverage reports as artifacts for traceability.
  • Update release/CI documentation to describe the new gating requirements.
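
One possible shape for the criteria above, as a minimal untested sketch rather than the final workflow. The job names, nox session names, TEST_PROCESSES=32, and the secret names come from this issue. The tag pattern, runner label, action versions, artifact path, and the assumption that nox is already installed on the runner are placeholders to be checked against the real release.yml.

```yaml
# Hypothetical excerpt of .forgejo/workflows/release.yml (sketch only).
on:
  push:
    tags:
      - "v*"              # assumption: the real tag pattern is not stated in this issue

jobs:
  pre-release-tests:
    runs-on: docker       # assumption: runner label
    env:
      TEST_PROCESSES: "32"
      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
    steps:
      - uses: actions/checkout@v4             # assumption: action version
      - run: nox -s unit_tests
      - run: nox -s integration_tests
      - run: nox -s coverage_report           # expected to enforce the 97% threshold
      - run: nox -s benchmark_regression
      - name: Upload test logs and coverage reports
        if: always()                          # keep logs even when a gate fails
        uses: actions/upload-artifact@v3      # assumption: action version
        with:
          name: release-test-reports
          path: reports/                      # assumption: actual output paths may differ

  build-wheel:
    needs: pre-release-tests
    # existing packaging steps (e.g., nox -s build) stay unchanged

  build-docker:
    needs: pre-release-tests
    # existing packaging steps stay unchanged

  create-release:
    # assumption: create-release already waits on both build jobs
    needs: [pre-release-tests, build-wheel, build-docker]
    # existing Forgejo release upload logic stays unchanged
```

Because each publishing job lists pre-release-tests under needs, a failure in any nox session leaves the build and release jobs skipped, so no artifacts are published.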

Subtasks

  • Update .forgejo/workflows/release.yml with the pre-release test job and dependency graph.
  • Configure environment variables (TEST_PROCESSES=32, LLM API keys) for release tests.
  • Ensure test job uploads Behave/Robot logs, coverage JSON/XML, and ASV outputs.
  • Update docs/development/ci-cd.md (or appropriate doc) to capture release gating expectations.
  • Verify nox -s coverage_report still enforces the 97% threshold in the release job.
  • Dry-run nox -s benchmark_regression locally (or via scheduled runner) to validate runtime for release cadence.

Definition of Done

  • All subtasks above are completed and checked off.
  • Work is delivered on branch task/release-test-gates.
  • Commit first line matches ci(release): gate release artifacts on core test suites.
  • A pull request to master is merged, demonstrating a release-tag run with gated tests.

Duplicate Check

  • Open search release workflow → no open issues covering release test gating.
  • Open search release tests → no open issues referencing release test gates.
  • Closed search release workflow → #9615 (docker login fix) does not address missing unit/integration/benchmark gates.

Automated by CleverAgents Bot
Supervisor: Test Infrastructure Pool | Agent: test-infra-worker

Author
Owner

🏷️ Triage Decision — [AUTO-OWNR-2]

Status: Verified

Issue Type: CI/Infrastructure
MoSCoW: Must Have — CI gates are required for all milestone releases
Priority: High

Rationale: All milestones require test coverage >= 97% and CI must pass. Gating release artifacts on unit, integration, and benchmark suites ensures quality gates are enforced. Must Have for all milestone completions.

Labels to apply: State/Verified, MoSCoW/Must have, Priority/High, Type/Task


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner
[AUTO-OWNR-1] Triage complete.

**Verified** ✅ — Valid CI improvement. Gating release artifacts on full test suites prevents broken releases.

- **Type**: Task (CI/infrastructure)
- **Priority**: Medium
- **MoSCoW**: Should Have — improves release quality gates
- **Milestone**: v3.2.0 — CI infrastructure improvement

---
**Automated by CleverAgents Bot**
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Automated by CleverAgents Bot
Agent: automation-tracking-manager


Reference
cleveragents/cleveragents-core#9701