feat(ci): enforce tdd_issue tag consistency via commit-history and Forgejo issue-status checks #966

Open
opened 2026-03-16 02:36:13 +00:00 by freemo · 1 comment
Owner

Metadata

  • Commit Message: feat(ci): enforce tdd_issue tag consistency via commit-history and Forgejo issue-status checks
  • Branch: feature/m3-tdd-issue-consistency-gate

Background and Context

The TDD workflow (see CONTRIBUTING.md > Bug Fix Workflow) uses three tags — @tdd_issue, @tdd_issue_<N>, and @tdd_expected_fail — to allow tests to be written before their associated issue is implemented. The @tdd_expected_fail tag inverts the test result so the test passes CI while the issue is unresolved, and must be removed when the issue is resolved.

Currently, CONTRIBUTING.md describes two enforcement rules that are not yet implemented:

  1. A PR that closes issue #N must remove @tdd_expected_fail from all tests tagged @tdd_issue_N. If the tag is still present, CI should block the PR.
  2. A PR that closes issue #N where no @tdd_issue_N test exists in the codebase should be flagged (the TDD step was skipped).

Additionally, there is no mechanism to detect broader inconsistencies between tag state and issue status. This issue implements automated enforcement via two complementary checks.

Check 1 — Commit-Based (Branch-Local)

This check runs on every CI pipeline execution and examines the commits in the current branch:

  1. Collect issue references from commits. Search every commit message in the current branch for integers preceded by # (e.g., #123, #95). This captures all common keywords (Closes, Fixes, Resolves, Refs, etc.) without needing to enumerate them — the pattern is simply #(\d+).
  2. Compile the set of referenced issue numbers.
  3. Search all test files across all three test suites:
    • Behave unit tests (features/**/*.feature)
    • Robot Framework integration tests (robot/*.robot, excluding robot/e2e/)
    • Robot Framework E2E tests (robot/e2e/*.robot)
  4. For each test with @tdd_issue_N where N is in the set of referenced issue numbers:
    • If @tdd_expected_fail is still present → ERROR. The commit references the issue (likely closing it) but the expected-fail tag was not removed. The test should now run normally.
  5. If any errors are found, fail the CI check and block the PR with a clear error message listing each violation (test file, line, tag, referenced issue number).
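
The steps of Check 1 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the script's actual implementation: the function names, the `master..HEAD` commit range, and the idea that all tags for a test sit on one line are hypothetical choices. The leading `@` is treated as optional on the assumption that Robot Framework tag lines omit it.

```python
import re
import subprocess

ISSUE_REF = re.compile(r"#(\d+)")
# Assumption: the "@" prefix is optional so Robot-style tags also match.
TDD_ISSUE_TAG = re.compile(r"@?tdd_issue_(\d+)\b")


def referenced_issues(messages):
    """Return the set of issue numbers mentioned as #N in commit messages."""
    return {int(n) for msg in messages for n in ISSUE_REF.findall(msg)}


def branch_commit_messages(base="master"):
    """Read messages of commits on HEAD that are not on `base` (assumed range)."""
    out = subprocess.run(
        ["git", "log", "--format=%B%x00", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [m.strip() for m in out.split("\x00") if m.strip()]


def find_expected_fail_violations(path, text, issues):
    """Return (path, line_no, issue) for tag lines that still carry tdd_expected_fail."""
    violations = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if "tdd_expected_fail" not in line:
            continue
        for n in TDD_ISSUE_TAG.findall(line):
            if int(n) in issues:
                violations.append((path, line_no, int(n)))
    return violations
```

Only `branch_commit_messages` touches git, and it reads local history, so the check needs no network access.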

Check 2 — Forgejo API-Based (Global Consistency)

This check provides a broader safety net by cross-referencing tag state against actual issue status in Forgejo:

  1. Collect all @tdd_issue_<N> tags across all three test suites (Behave unit, Robot integration, Robot E2E).
  2. Query the Forgejo API for each referenced issue number to determine whether the issue is open or closed.
  3. Validate consistency:
    • @tdd_expected_fail present + issue N is closed → ERROR. The issue was resolved but the expected-fail tag was not removed. The test is still inverting results for a fixed issue.
    • @tdd_expected_fail absent + issue N is open → ERROR. The expected-fail tag was removed prematurely: the issue is still unresolved, so the test should still be in expected-fail mode.
  4. If any errors are found, fail the CI check and block the PR with a clear error message listing each violation.
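
The validation rules of Check 2 might look like the following sketch. The function names and the `(location, issue_number, has_expected_fail)` tuple shape are hypothetical. The request path assumes Forgejo's Gitea-compatible `GET /api/v1/repos/{owner}/{repo}/issues/{index}` endpoint, whose JSON response carries a `state` field; confirm this against the instance's API documentation before relying on it.

```python
import json
import urllib.request


def fetch_issue_state(base_url, repo, number, token=None):
    """Return the issue's state ('open' or 'closed'); `repo` is 'owner/name'."""
    req = urllib.request.Request(f"{base_url}/api/v1/repos/{repo}/issues/{number}")
    if token:
        req.add_header("Authorization", f"token {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["state"]


def check_consistency(tagged_tests, issue_states):
    """Compare tag state with issue state and return a list of error strings.

    tagged_tests: iterable of (location, issue_number, has_expected_fail).
    issue_states: dict mapping issue number to 'open' or 'closed'.
    Issues with unknown state are skipped rather than reported.
    """
    errors = []
    for location, issue, expected_fail in tagged_tests:
        state = issue_states.get(issue)
        if state == "closed" and expected_fail:
            errors.append(
                f"{location}: issue #{issue} is closed but @tdd_expected_fail is still present"
            )
        elif state == "open" and not expected_fail:
            errors.append(
                f"{location}: issue #{issue} is open but @tdd_expected_fail is absent"
            )
    return errors
```

Keeping `check_consistency` free of network calls lets the rules be tested with mock issue states.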

Implementation Notes

  • Both checks should be implemented as a single script (e.g., scripts/check-tdd-tag-consistency.py) that can be invoked by nox and/or the CI pipeline.
  • The script should accept --forgejo-url and --forgejo-token parameters (or read the values from environment variables) for the Forgejo API check.
  • The commit-based check (Check 1) should work without network access by examining the local git history.
  • The Forgejo API check (Check 2) requires network access and should gracefully degrade (warning, not error) if the Forgejo API is unreachable.
  • The script should be registered as a new nox session (e.g., nox -s tdd_consistency) and integrated into the CI pipeline.
  • The three test suites should be scanned independently so that errors are reported with clear context about which suite they belong to.
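
One way to scan the three suites independently is to keep a separate file list per suite, using the glob patterns given under Check 1. The function name and the suite labels below are hypothetical. The sketch relies on `robot/*.robot` being non-recursive, which is what excludes `robot/e2e/`.

```python
from pathlib import Path


def discover_suites(root):
    """Map each suite label to its sorted list of test files under `root`."""
    root = Path(root)
    return {
        # "**" also matches zero directories, so features/*.feature is included.
        "behave-unit": sorted(root.glob("features/**/*.feature")),
        # Non-recursive pattern: files under robot/e2e/ are not matched here.
        "robot-integration": sorted(root.glob("robot/*.robot")),
        "robot-e2e": sorted(root.glob("robot/e2e/*.robot")),
    }
```

Each violation can then be reported with the label of the suite whose file list it came from.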

Acceptance Criteria

  • A script scripts/check-tdd-tag-consistency.py (or equivalent) implements both checks.
  • Check 1: Commits in the current branch are scanned for #N references. Tests with matching @tdd_issue_N that still have @tdd_expected_fail cause the check to fail.
  • Check 2: Forgejo API is queried for issue status. Tests where tag state (@tdd_expected_fail present/absent) contradicts issue state (open/closed) cause the check to fail.
  • All three test suites are scanned: Behave unit tests, Robot integration tests, Robot E2E tests.
  • A new nox session tdd_consistency runs the checks.
  • The CI pipeline (.forgejo/workflows/ci.yml) includes the new check as a required job.
  • Error messages are clear: they list the test file path, line number, tag, issue number, and the specific violation.
  • The check passes on the current master branch (no pre-existing violations).
  • nox -s unit_tests passes.
  • nox -s integration_tests passes.
  • ruff check and ruff format pass.
  • Coverage >= 97% via nox -s coverage_report.
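
As an illustration of the error-message criterion, a violation record could carry exactly the listed fields and render them on one line. The `Violation` class, its field names, and the output layout are assumptions for this sketch; the issue does not prescribe a format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    suite: str    # e.g. "behave-unit" (hypothetical label)
    path: str     # test file path
    line: int     # line number of the offending tag
    tag: str      # tag involved, e.g. "@tdd_expected_fail"
    issue: int    # referenced issue number
    reason: str   # the specific violation

    def format(self):
        return (
            f"[{self.suite}] {self.path}:{self.line}: "
            f"{self.tag} (issue #{self.issue}): {self.reason}"
        )


def report(violations):
    """Print every violation and return a process exit code (0 = clean)."""
    for v in violations:
        print(v.format())
    return 1 if violations else 0
```

Returning the exit code from `report` would let the Robot tests assert on both the exit status and the printed output.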

Definition of Done

This issue is complete when:

  • All subtasks below are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.

Subtasks

  • Code: Create scripts/check-tdd-tag-consistency.py with CLI argument parsing (--forgejo-url, --forgejo-token, --base-branch).
  • Code: Implement Check 1 — git log parsing to extract #N references from commit messages, then scan test files for @tdd_issue_N + @tdd_expected_fail violations.
  • Code: Implement Check 2 — Forgejo API client to fetch issue open/closed status, then validate tag state consistency for all @tdd_issue_N tags.
  • Code: Implement test file scanning for all three suites: Behave .feature files (unit), Robot .robot files excluding e2e/ (integration), Robot e2e/*.robot files (E2E).
  • Code: Register a new nox session tdd_consistency that invokes the script.
  • Code: Add the tdd_consistency check as a required CI job in .forgejo/workflows/ci.yml.
  • Code: Ensure graceful degradation when Forgejo API is unreachable (warn, don't error).
  • Tests (Behave): Add Behave test scenarios that verify Check 1 and Check 2 logic with mock data.
  • Tests (Robot): Add Robot test cases that verify the script's exit codes and error output format.
  • Docs: Update CONTRIBUTING.md to reference the new tdd_consistency nox session and CI job.
  • Quality: Verify nox passes (all default sessions including benchmark), fix any errors.
  • Quality: Verify coverage >= 97% via nox -s coverage_report.
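
The CLI and the graceful-degradation subtasks could be sketched as follows. The three flags come from the subtask list; the environment variable names (`FORGEJO_URL`, `FORGEJO_TOKEN`), the `master` default, and the helper names are assumptions. Note that `urllib.error.URLError` also covers HTTP error responses, so a real implementation may want to treat those separately.

```python
import argparse
import os
import sys
import urllib.error


def parse_args(argv=None, env=None):
    """Parse CLI flags, falling back to (assumed) environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Check TDD tag consistency.")
    parser.add_argument("--forgejo-url", default=env.get("FORGEJO_URL"))
    parser.add_argument("--forgejo-token", default=env.get("FORGEJO_TOKEN"))
    parser.add_argument("--base-branch", default="master")
    return parser.parse_args(argv)


def _warn_stderr(msg):
    print(msg, file=sys.stderr)


def run_api_check(check, warn=_warn_stderr):
    """Run `check()`; if the API is unreachable, warn and report no errors."""
    try:
        return check()
    except (urllib.error.URLError, TimeoutError) as exc:
        warn(f"WARNING: Forgejo API unreachable, skipping Check 2: {exc}")
        return []
```

With this shape, a network failure during Check 2 yields an empty error list, so only Check 1 results decide the exit code.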
freemo added this to the v3.2.0 milestone 2026-03-16 02:36:20 +00:00
Author
Owner

PM Triage — Day 36

Actions taken:

  • Assignee: → @CoreRasurae (natural extension of #629, the TDD quality gate Luis is already working on)
  • State: Verified

Dependencies:

  • Blocked by: #965 (tag rename must complete first — this CI script validates the renamed tdd_issue tags, not the old tdd_bug tags)
  • Blocked by: #629 (TDD quality gate — the base CI infrastructure)
  • Related: #628 (Robot Framework TDD tag handling)

Priority note: This is important for CI enforcement integrity but should not start until #965 merges. Focus on your other critical-path items first (PR #711 fix commits, PR #804 and #806 rebases).

@CoreRasurae — Do not start this until #965 is merged. Your immediate priorities are: (1) push fix commits for PR #711, (2) rebase PRs #804 and #806, (3) complete #629.


PM triage comment — Day 36

freemo self-assigned this 2026-04-02 06:13:48 +00:00
Reference
cleveragents/cleveragents-core#966