test: add TDD bug-capture test for #620 — skill add cross-process persistence #1110

Closed
brent.edwards wants to merge 2 commits from tdd/m3-skill-add-regression into master

Summary

Write cross-process Behave and Robot tests capturing the skill persistence regression for bug #620. Tests execute the real agents CLI via subprocess.run so each command runs in a separate OS process with its own interpreter, DI container, and DB connection.

Approach

  • Use subprocess-based CLI invocations for init, skill add, skill list, and skill show.
  • Ensure environment/bootstrap mirrors real usage (agents init --yes, on-disk SQLite DB).
  • Keep expected-failure workflow via @tdd_expected_fail while bug remains open.
  • Add explicit returncode == 0 guards for skill list assertions in both Behave and Robot paths (matching existing skill show assertion strictness).
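
The approach above can be sketched as a small helper. This is a minimal illustration, not the PR's actual step code: `run_cli` and `assert_skill_listed` are hypothetical names, and the demo uses a stand-in Python one-liner in place of the real `agents` executable so that the snippet runs anywhere.

```python
import subprocess
import sys


def run_cli(args, env=None, timeout=60):
    """Run one CLI command in its own OS process and capture its output."""
    return subprocess.run(
        args,
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # inspect returncode ourselves instead of raising
    )


def assert_skill_listed(result, skill_name):
    """Guard on the exit code first, then on the listed skill name."""
    assert result.returncode == 0, f"rc={result.returncode}: {result.stderr}"
    assert skill_name in result.stdout, f"{skill_name!r} not in output"


# Stand-in for `agents skill list` (hypothetical); real tests pass the CLI's argv.
fake_list = [sys.executable, "-c", "print('demo-skill')"]
result = run_cli(fake_list)
assert_skill_listed(result, "demo-skill")
```

Checking the exit code before the output means a command that crashes after printing partial output cannot satisfy the assertion by accident.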

Ticket Alignment Updates (Cycle 2)

  • Aligned bug/ticket references to bug #620 and ticket #1091 across:
    • Behave tags and feature text (@tdd_bug_620)
    • Behave/Robot helper docstrings/comments
    • Changelog entry text
    • Commit message + issue footer
  • Updated close keyword to match the ticket tracked by this PR.

Files Updated

  • CHANGELOG.md
  • features/tdd_skill_add_regression.feature
  • features/steps/tdd_skill_add_regression_steps.py
  • robot/tdd_skill_add_regression.robot
  • robot/helper_tdd_skill_add_regression.py

Quality Gates

  • ✅ nox -e lint
  • ✅ nox -e typecheck
  • ✅ nox -e unit_tests
  • ⚠️ nox -e integration_tests currently fails in unrelated existing suites (timeouts/rc=-9 in Robot.Resource Cli, Robot.M3 E2E Verification, and intermittently Robot.Rxpy Route Validation) during full parallel run; not in touched files
  • ✅ nox -e e2e_tests
  • ✅ nox -e coverage_report (summary coverage: 98%)

Closes #1091

test: add TDD bug-capture test for #620 — skill add regression
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Failing after 18s
CI / build (pull_request) Successful in 25s
CI / quality (pull_request) Successful in 3m42s
CI / typecheck (pull_request) Successful in 3m56s
CI / coverage (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped
CI / security (pull_request) Successful in 4m24s
CI / integration_tests (pull_request) Successful in 9m0s
CI / unit_tests (pull_request) Successful in 9m11s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 1s
CI / e2e_tests (pull_request) Failing after 20m52s
2b401b06dd
Add a Behave BDD scenario that captures bug #620 — skill add does not
persist across CLI invocations.  The test mirrors the DI container
construction pattern (standard sessionmaker creating separate sessions
per call) and verifies that a skill added via SkillService survives
service destruction and recreation from the same database.

Root cause identified during test construction: SkillRepository.create()
flushes on Session A while SkillService._commit() commits on Session B
(a different session from the factory).  Data is never committed.  The
existing skill_add_persist.feature tests mask this by sharing a single
session instance.

Tagged with @tdd_bug, @tdd_bug_620, and @tdd_expected_fail per
CONTRIBUTING.md Bug Fix Workflow.  The underlying assertion fails
(proving the bug exists) and the tag inversion causes CI to pass.
When the fix for #620 is implemented, the fixer removes
@tdd_expected_fail and the test runs as a permanent regression guard.
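
One way to picture the tag inversion is as a small function over a scenario's tags and its raw result. This is a hypothetical sketch, not the project's actual Behave hook or Robot listener: `effective_outcome` and its return strings are invented here, and treating an unexpected pass as a failure is an assumption that the commit message does not state.

```python
def effective_outcome(tags, assertion_failed):
    """Map a raw test result to the outcome CI would report.

    With the expected-fail tag, a failing assertion shows the bug still
    exists and is reported as a pass. Assumption: an unexpected pass is
    reported as a failure, prompting removal of the tag after the fix.
    """
    if "tdd_expected_fail" in tags:
        return "passed" if assertion_failed else "failed"
    return "failed" if assertion_failed else "passed"


# While the bug is open: the assertion fails, CI reports a pass.
bug_open = effective_outcome({"tdd_bug", "tdd_bug_620", "tdd_expected_fail"}, True)

# After the fix, with the tag removed: the test acts as a normal regression guard.
bug_fixed = effective_outcome({"tdd_bug", "tdd_bug_620"}, False)
```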

ISSUES CLOSED: #1091
brent.edwards added this to the v3.2.0 milestone 2026-03-23 00:15:45 +00:00
freemo force-pushed tdd/m3-skill-add-regression from 2b401b06dd
to 7fdbf5d4c6
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 3m20s
CI / build (pull_request) Successful in 34s
CI / typecheck (pull_request) Successful in 4m2s
CI / quality (pull_request) Successful in 4m1s
CI / integration_tests (pull_request) Successful in 9m7s
CI / unit_tests (pull_request) Successful in 9m17s
CI / e2e_tests (pull_request) Failing after 11m42s
CI / security (pull_request) Failing after 13m26s
CI / coverage (pull_request) Successful in 12m26s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 1s
CI / benchmark-regression (pull_request) Successful in 56m8s
2026-03-23 01:06:48 +00:00
freemo changed title from test: add TDD bug-capture test for #620 — skill add regression to test: add TDD bug-capture test for #980 — skill add cross-process persistence 2026-03-23 01:07:12 +00:00
freemo approved these changes 2026-03-23 02:44:57 +00:00
Dismissed
freemo left a comment

TDD Bug-Capture Review: PR #1110 — Bug #980 (skill add cross-process persistence)

Overall Assessment: APPROVE

All seven review criteria pass cleanly.


1. Tag Compliance

  • Feature file: @tdd_expected_fail @tdd_bug @tdd_bug_980 — all three required tags present at feature level.
  • Robot file: [Tags] tdd_bug tdd_bug_980 tdd_expected_fail — all three present on both test cases.

2. Branch Naming

  • Branch: tdd/m3-skill-add-regression
  • Milestone: v3.2.0 (M3: Decisions + Validations + Invariants) — tdd/m3- prefix is correct.

3. File Organization

  • features/tdd_skill_add_regression.feature
  • features/steps/tdd_skill_add_regression_steps.py
  • robot/tdd_skill_add_regression.robot
  • robot/helper_tdd_skill_add_regression.py

All files in the correct directories.

4. Step File Naming

  • Feature tdd_skill_add_regression.feature → steps in tdd_skill_add_regression_steps.py — naming convention followed.

5. No Production Code Changes

  • 4 new files added, all in features/ and robot/. Zero deletions, zero modifications to src/.

6. Issue References

  • PR body: Closes #981 (the TDD tracking issue for bug #980).
  • Bug #980 referenced extensively in docstrings and feature description.

7. PR Description Quality

  • Clearly explains why existing tests pass (shared in-memory SQLAlchemy session) and how the new test exposes the real cross-process bug (container reset between CLI invocations). Quality gates all reported passing.

No issues found. Well-structured TDD bug-capture PR.

freemo approved these changes 2026-03-23 02:46:51 +00:00
Dismissed
freemo left a comment

Review: APPROVED

Clean TDD bug-capture PR for skill-add cross-process persistence regression (#980).

  • Tag compliance: @tdd_expected_fail @tdd_bug @tdd_bug_980 all correct
  • Branch naming: tdd/m3-skill-add-regression correctly matches M3 (v3.2.0)
  • File organization: Proper placement in features/ and robot/
  • Step file naming: Follows convention
  • No production code changes: Test files only
  • Issue reference: Closing keyword present
  • PR description: Clear explanation of the cross-process persistence bug being captured

The reset_container() pattern for simulating independent process boundaries is a sound approach for testing DI container state leaks.

freemo approved these changes 2026-03-23 03:41:04 +00:00
Dismissed
freemo left a comment

Day 43 Review — PR #1110 test: TDD for #980 — skill add cross-process persistence

Verdict: APPROVED

TDD Tag Verification

| Tag | Behave | Robot |
|---|---|---|
| `@tdd_bug` | Present | Present |
| `@tdd_bug_980` | Present | Present |
| `@tdd_expected_fail` | Present | Present |
| Tag validation rules | PASS | PASS |

Test Quality Assessment

The tests correctly capture the cross-process persistence regression by calling reset_container() between CLI invocations to simulate independent process lifecycles. This is the exact code path that fails in bug #980 — existing tests pass because they share the same in-memory session factory.

Two scenarios per framework (Behave + Robot) covering skill list and skill show after cross-process skill add. Test cleanup is thorough with add_cleanup() and try/finally patterns.

Checklist

| Criterion | Status |
|---|---|
| Single commit | PASS (1 commit: `7fdbf5d4`) |
| Clean diff (test files only) | PASS (4 new files, 0 modified) |
| Commit message format | PASS (`test:` prefix, references #980, lists tags) |
| Closing keyword | PASS (`Closes #981`) |
| No unrelated changes | PASS |

This is a well-structured TDD PR that accurately captures the bug behavior described in #980. Ready for merge.

brent.edwards force-pushed tdd/m3-skill-add-regression from 7fdbf5d4c6
to 720eeea49e
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 23s
CI / lint (pull_request) Successful in 3m33s
CI / typecheck (pull_request) Successful in 4m14s
CI / security (pull_request) Successful in 4m17s
CI / integration_tests (pull_request) Successful in 7m9s
CI / e2e_tests (pull_request) Successful in 8m40s
CI / unit_tests (pull_request) Failing after 10m54s
CI / quality (pull_request) Failing after 11m28s
CI / coverage (pull_request) Successful in 12m0s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 2s
CI / benchmark-regression (pull_request) Successful in 52m0s
2026-03-23 05:38:30 +00:00
brent.edwards dismissed freemo's review 2026-03-23 05:38:30 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

freemo force-pushed tdd/m3-skill-add-regression from 720eeea49e
to 6f94183359
Some checks failed
CI / coverage (pull_request) Blocked by required conditions
CI / benchmark-regression (pull_request) Blocked by required conditions
CI / docker (pull_request) Blocked by required conditions
CI / status-check (pull_request) Blocked by required conditions
CI / build (pull_request) Successful in 21s
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 3m18s
CI / security (pull_request) Successful in 4m0s
CI / quality (pull_request) Successful in 4m4s
CI / integration_tests (pull_request) Successful in 5m56s
CI / unit_tests (pull_request) Successful in 7m9s
CI / e2e_tests (pull_request) Successful in 8m54s
CI / typecheck (pull_request) Failing after 16m2s
2026-03-23 12:00:40 +00:00
brent.edwards changed title from test: add TDD bug-capture test for #980 — skill add cross-process persistence to test: add TDD bug-capture test for #620 — skill add cross-process persistence 2026-03-23 21:24:03 +00:00
brent.edwards force-pushed tdd/m3-skill-add-regression from 6f94183359
to 5bf0baa1db
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 41s
CI / integration_tests (pull_request) Successful in 2m41s
CI / lint (pull_request) Successful in 3m20s
CI / unit_tests (pull_request) Successful in 3m33s
CI / quality (pull_request) Successful in 3m43s
CI / typecheck (pull_request) Successful in 3m56s
CI / security (pull_request) Successful in 4m0s
CI / docker (pull_request) Successful in 56s
CI / e2e_tests (pull_request) Successful in 8m29s
CI / coverage (pull_request) Failing after 21m1s
CI / benchmark-regression (pull_request) Successful in 52m59s
CI / status-check (pull_request) Failing after 2s
2026-03-23 21:24:19 +00:00
freemo force-pushed tdd/m3-skill-add-regression from 5bf0baa1db
to 49c4473218
Some checks failed
CI / quality (pull_request) Successful in 34s
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 3m17s
CI / build (pull_request) Successful in 11s
CI / typecheck (pull_request) Successful in 3m38s
CI / integration_tests (pull_request) Successful in 5m25s
CI / e2e_tests (pull_request) Failing after 13m53s
CI / unit_tests (pull_request) Failing after 14m38s
CI / security (pull_request) Failing after 14m48s
CI / benchmark-regression (pull_request) Failing after 16m7s
CI / coverage (pull_request) Successful in 8m23s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 1s
2026-03-24 00:19:38 +00:00
freemo force-pushed tdd/m3-skill-add-regression from 49c4473218
to e709e3bd95
Some checks failed
CI / docker (pull_request) Blocked by required conditions
CI / status-check (pull_request) Blocked by required conditions
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Failing after 21m2s
CI / security (pull_request) Failing after 21m2s
CI / lint (pull_request) Failing after 25m23s
CI / typecheck (pull_request) Successful in 21m15s
CI / coverage (pull_request) Has been skipped
CI / benchmark-regression (pull_request) Has been skipped
CI / build (pull_request) Failing after 28m0s
CI / e2e_tests (pull_request) Failing after 28m4s
CI / integration_tests (pull_request) Failing after 28m4s
CI / unit_tests (pull_request) Failing after 28m4s
2026-03-24 01:39:56 +00:00
freemo force-pushed tdd/m3-skill-add-regression from e709e3bd95
to a5ced58289
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 26s
CI / lint (pull_request) Successful in 3m20s
CI / quality (pull_request) Successful in 3m48s
CI / typecheck (pull_request) Successful in 4m2s
CI / security (pull_request) Successful in 4m6s
CI / e2e_tests (pull_request) Successful in 6m14s
CI / integration_tests (pull_request) Successful in 7m1s
CI / unit_tests (pull_request) Successful in 7m33s
CI / docker (pull_request) Successful in 1m35s
CI / coverage (pull_request) Successful in 11m53s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Failing after 28m26s
2026-03-24 15:22:38 +00:00
freemo approved these changes 2026-03-24 15:28:03 +00:00
freemo left a comment

Review: APPROVED

The most thorough TDD test in the batch. Uses real subprocess.run calls across process boundaries (not just in-process CliRunner) — sets up temp CLEVERAGENTS_HOME, on-disk SQLite, runs agents init, skill add, and skill list/skill show as independent OS processes. Tags correct. Both Behave and Robot tests present with proper timeout handling.
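
A rough sketch of the isolation described in this review: each command gets a throwaway home directory through the `CLEVERAGENTS_HOME` environment variable and a hard timeout. The helper name `run_isolated`, the 30-second default, and the stand-in child command are assumptions for illustration; the real tests invoke the `agents` CLI.

```python
import os
import subprocess
import sys
import tempfile


def run_isolated(args, home, timeout=30):
    """Run a command in a separate process with CLEVERAGENTS_HOME set to `home`."""
    env = dict(os.environ, CLEVERAGENTS_HOME=home)
    try:
        return subprocess.run(
            args, env=env, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        # Surface a hang as a clear test failure rather than a stuck suite.
        raise AssertionError(f"{args!r} timed out after {timeout}s") from exc


with tempfile.TemporaryDirectory() as home:
    # Stand-in child process that echoes the variable it inherited.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CLEVERAGENTS_HOME'])",
    ]
    result = run_isolated(child, home)
    same_home = result.stdout.strip() == home
```

Because nothing but the temporary directory is shared between invocations, any state that survives from one command to the next must have gone through the on-disk database.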

Minor Note

Unrelated change in noxfile.py adds session.install("setuptools<81") to the security_scan session. This is a drive-by fix for pkg_resources/semgrep compatibility. Per CONTRIBUTING.md §Atomic Commits: "Do not mix concerns." Ideally this should be a separate commit, though it's understandable as a pragmatic fix to unblock CI.

freemo closed this pull request 2026-03-24 20:27:34 +00:00
Author
Member

Self-QA Closeout Status (Current Work Posted)

Current PR state

  • PR status observed during closeout: closed (not merged)
  • Branch: tdd/m3-skill-add-regression

Work completed in the latest self-QA pass

  • Aligned references to ticket/bug scope (#1091 / #620) across tags, commit/PR metadata, and docs.
  • Added missing returncode == 0 guard in skill list assertions (Behave + Robot), matching skill show strictness.
  • Updated PR description with closeout details.

Quality-gate snapshot from latest fix pass

  • lint ✅
  • typecheck ✅
  • unit_tests ✅
  • e2e_tests ✅
  • coverage_report ✅
  • integration_tests ⚠️ (reported as unrelated/pre-existing suite instability)

Closeout note

This comment records the latest implemented work from the internal self-QA loop. Since the PR is currently closed, merge follow-up requires maintainer direction (reopen this PR or open a successor PR from the updated branch state).

Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 26s (Required)
CI / lint (pull_request) Successful in 3m20s (Required)
CI / quality (pull_request) Successful in 3m48s (Required)
CI / typecheck (pull_request) Successful in 4m2s (Required)
CI / security (pull_request) Successful in 4m6s (Required)
CI / e2e_tests (pull_request) Successful in 6m14s
CI / integration_tests (pull_request) Successful in 7m1s (Required)
CI / unit_tests (pull_request) Successful in 7m33s (Required)
CI / docker (pull_request) Successful in 1m35s (Required)
CI / coverage (pull_request) Successful in 11m53s (Required)
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Failing after 28m26s

Pull request closed

Reference
cleveragents/cleveragents-core!1110