test: add TDD bug-capture test for #620 — skill add cross-process persistence #1110


brent.edwards commented

2026-03-23 00:15:38 +00:00

Summary

Write cross-process Behave and Robot tests capturing the skill persistence regression for bug #620. Tests execute the real agents CLI via subprocess.run so each command runs in a separate OS process with its own interpreter, DI container, and DB connection.

Approach

Use subprocess-based CLI invocations for init, skill add, skill list, and skill show.
Ensure environment/bootstrap mirrors real usage (agents init --yes, on-disk SQLite DB).
Keep expected-failure workflow via @tdd_expected_fail while bug remains open.
Add explicit returncode == 0 guards for skill list assertions in both Behave and Robot paths (matching existing skill show assertion strictness).
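
The approach above can be sketched roughly as follows. This is a minimal illustration, not the step code from this PR: a tiny stand-in script (`FAKE_CLI`, hypothetical) plays the role of the real `agents` CLI so the example runs anywhere, and the `DEMO_HOME` variable, the `skills.db` filename, and the helper names are all assumptions. What it shows is the shape of the test: every command is its own `subprocess.run` call against an on-disk SQLite file, and `returncode == 0` is asserted before stdout is inspected.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the real CLI: "add NAME" stores a row, "list" prints rows.
FAKE_CLI = """
import os, sqlite3, sys
db = sqlite3.connect(os.path.join(os.environ["DEMO_HOME"], "skills.db"))
db.execute("CREATE TABLE IF NOT EXISTS skills (name TEXT PRIMARY KEY)")
if sys.argv[1] == "add":
    db.execute("INSERT INTO skills VALUES (?)", (sys.argv[2],))
    db.commit()
elif sys.argv[1] == "list":
    for (name,) in db.execute("SELECT name FROM skills ORDER BY name"):
        print(name)
db.close()
"""


def run_cli(args, home, timeout=30):
    """Run one command in a fresh interpreter with its own DB connection."""
    env = {**os.environ, "DEMO_HOME": home}
    return subprocess.run(
        [sys.executable, "-c", FAKE_CLI, *args],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def add_then_list(skill_name):
    """Add a skill in one process, then list skills from a second process."""
    with tempfile.TemporaryDirectory() as home:
        added = run_cli(["add", skill_name], home)
        assert added.returncode == 0, added.stderr
        listed = run_cli(["list"], home)
        # Guard on the exit code before trusting stdout.
        assert listed.returncode == 0, listed.stderr
        return listed.stdout.split()
```

Because nothing is shared between the two processes except the file on disk, a skill that is visible to the second call must have been persisted by the first.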

Ticket Alignment Updates (Cycle 2)

Aligned bug/ticket references to bug #620 and ticket #1091 across:
- Behave tags and feature text (@tdd_bug_620)
- Behave/Robot helper docstrings/comments
- Changelog entry text
- Commit message + issue footer
Updated close keyword to match the ticket tracked by this PR.

Files Updated

CHANGELOG.md
features/tdd_skill_add_regression.feature
features/steps/tdd_skill_add_regression_steps.py
robot/tdd_skill_add_regression.robot
robot/helper_tdd_skill_add_regression.py

Quality Gates

✅ nox -e lint
✅ nox -e typecheck
✅ nox -e unit_tests
⚠️ nox -e integration_tests currently fails during the full parallel run, in unrelated pre-existing suites: timeouts/rc=-9 in Robot.Resource Cli and Robot.M3 E2E Verification, and intermittently in Robot.Rxpy Route Validation. None of the failures are in files touched by this PR.
✅ nox -e e2e_tests
✅ nox -e coverage_report (summary coverage: 98%)
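
On reading the `rc=-9` values above: if they are reported the way Python's `subprocess` module reports them (an assumption, the source does not say where the number comes from), a negative return code `-N` on POSIX means the child was terminated by signal `N`, so `-9` would be SIGKILL. The helper below is a hypothetical convenience for making such codes readable in test output.

```python
import signal


def describe_returncode(rc):
    """Render a subprocess-style return code as a readable string."""
    if rc < 0:
        # Negative values mean "terminated by signal -rc" on POSIX.
        try:
            return f"killed by {signal.Signals(-rc).name}"
        except ValueError:
            return f"killed by signal {-rc}"
    return "ok" if rc == 0 else f"exited with status {rc}"
```

A SIGKILL is consistent with a runner killing a suite that exceeded its timeout, though it does not by itself identify who sent the signal.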

Closes #1091


brent.edwards added this to the v3.2.0 milestone 2026-03-23 00:15:45 +00:00

brent.edwards added the Type/Testing label 2026-03-23 00:15:46 +00:00

brent.edwards added a new dependency 2026-03-23 00:17:15 +00:00

#1091 TDD: Write failing test for #620 — skill add does not persist across CLI invocations

freemo force-pushed tdd/m3-skill-add-regression from 2b401b06dd to 7fdbf5d4c6

2026-03-23 01:06:48 +00:00


freemo changed title from ~~test: add TDD bug-capture test for #620 — skill add regression~~ to test: add TDD bug-capture test for #980 — skill add cross-process persistence

2026-03-23 01:07:12 +00:00

freemo referenced this pull request

2026-03-23 01:08:21 +00:00

TDD: Write failing test for #980 — skill add regression, cross-process persistence fails #981

freemo referenced this pull request

2026-03-23 02:39:42 +00:00

fix(skill): resolve skill add persistence regression after PR #640 #1120

freemo approved these changes 2026-03-23 02:44:57 +00:00

Dismissed

freemo left a comment

TDD Bug-Capture Review: PR #1110 — Bug #980 (skill add cross-process persistence)

Overall Assessment: APPROVE

All seven review criteria pass cleanly.

1. Tag Compliance ✅

Feature file: @tdd_expected_fail @tdd_bug @tdd_bug_980 — all three required tags present at feature level.
Robot file: [Tags] tdd_bug tdd_bug_980 tdd_expected_fail — all three present on both test cases.
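
For illustration, the three-tag requirement checked above could be expressed as a small function. This is a sketch only: the project's actual tag validation rules are not shown in this thread, and `REQUIRED_TAGS` and `missing_tdd_tags` are hypothetical names.

```python
REQUIRED_TAGS = {"tdd_expected_fail", "tdd_bug"}


def missing_tdd_tags(tags, bug_number):
    """Return the required TDD tags absent from `tags` (a leading '@' is ignored)."""
    normalized = {t.lstrip("@") for t in tags}
    required = REQUIRED_TAGS | {f"tdd_bug_{bug_number}"}
    return sorted(required - normalized)
```

Stripping the `@` lets the same check accept Behave-style tags (`@tdd_bug`) and Robot-style tags (`tdd_bug`).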

2. Branch Naming ✅

Branch: tdd/m3-skill-add-regression
Milestone: v3.2.0 (M3: Decisions + Validations + Invariants) — tdd/m3- prefix is correct.

3. File Organization ✅

features/tdd_skill_add_regression.feature
features/steps/tdd_skill_add_regression_steps.py
robot/tdd_skill_add_regression.robot
robot/helper_tdd_skill_add_regression.py

All files in the correct directories.

4. Step File Naming ✅

Feature tdd_skill_add_regression.feature → steps in tdd_skill_add_regression_steps.py — naming convention followed.

5. No Production Code Changes ✅

4 new files added, all in features/ and robot/. Zero deletions, zero modifications to src/.

6. Issue References ✅

PR body: Closes #981 (the TDD tracking issue for bug #980).
Bug #980 referenced extensively in docstrings and feature description.

7. PR Description Quality ✅

Clearly explains why existing tests pass (shared in-memory SQLAlchemy session) and how the new test exposes the real cross-process bug (container reset between CLI invocations). Quality gates all reported passing.

No issues found. Well-structured TDD bug-capture PR.


freemo approved these changes 2026-03-23 02:46:51 +00:00

Dismissed

freemo left a comment

Review: APPROVED

Clean TDD bug-capture PR for skill-add cross-process persistence regression (#980).

Tag compliance: @tdd_expected_fail @tdd_bug @tdd_bug_980 all correct
Branch naming: tdd/m3-skill-add-regression correctly matches M3 (v3.2.0)
File organization: Proper placement in features/ and robot/
Step file naming: Follows convention
No production code changes: Test files only
Issue reference: Closing keyword present
PR description: Clear explanation of the cross-process persistence bug being captured

The reset_container() pattern for simulating independent process boundaries is a sound approach for testing DI container state leaks.


freemo added the

labels 2026-03-23 03:33:53 +00:00

freemo requested review from hamza.khyari 2026-03-23 03:38:39 +00:00

freemo approved these changes 2026-03-23 03:41:04 +00:00

Dismissed

freemo left a comment

Day 43 Review — PR #1110 `test: TDD for #980 — skill add cross-process persistence`

Verdict: APPROVED

TDD Tag Verification

| Tag | Behave | Robot |
|---|---|---|
| `@tdd_bug` | Present | Present |
| `@tdd_bug_980` | Present | Present |
| `@tdd_expected_fail` | Present | Present |
| Tag validation rules | PASS | PASS |

Test Quality Assessment

The tests correctly capture the cross-process persistence regression by calling reset_container() between CLI invocations to simulate independent process lifecycles. This is the exact code path that fails in bug #980 — existing tests pass because they share the same in-memory session factory.

Two scenarios per framework (Behave + Robot) covering skill list and skill show after cross-process skill add. Test cleanup is thorough with add_cleanup() and try/finally patterns.
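
To make the point about shared in-memory state concrete, here is a toy model. It is not the project's container and makes no claim about the root cause of the bug: `ToyContainer`, its `reset()` method, and the `skills` table are hypothetical stand-ins for the real DI container and `reset_container()`. It only shows why a test that keeps one in-memory connection alive can pass while data is lost across a simulated process boundary.

```python
import os
import sqlite3
import tempfile


class ToyContainer:
    """Hypothetical DI container that caches one DB connection per 'process'."""

    def __init__(self, db_url):
        self.db_url = db_url
        self._conn = None

    def connection(self):
        if self._conn is None:
            self._conn = sqlite3.connect(self.db_url)
            self._conn.execute("CREATE TABLE IF NOT EXISTS skills (name TEXT)")
        return self._conn

    def reset(self):
        """Drop cached state, roughly what a new process would start with."""
        if self._conn is not None:
            self._conn.close()
        self._conn = None


def skills_after_reset(db_url):
    """Insert a skill, cross a simulated process boundary, then read it back."""
    container = ToyContainer(db_url)
    conn = container.connection()
    conn.execute("INSERT INTO skills VALUES ('demo')")
    conn.commit()
    container.reset()  # simulated process boundary
    names = [row[0] for row in container.connection().execute("SELECT name FROM skills")]
    container.reset()
    return names


def compare_backends():
    """Return (in-memory result, on-disk result) for the same add-then-read flow."""
    with tempfile.TemporaryDirectory() as tmp:
        on_disk = skills_after_reset(os.path.join(tmp, "skills.db"))
    return skills_after_reset(":memory:"), on_disk
```

In this toy the in-memory database comes back empty after the reset while the on-disk one still holds the row, which is the kind of difference an in-process test with a shared session cannot observe.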

Checklist

| Criterion | Status |
|---|---|
| Single commit | PASS (1 commit: `7fdbf5d4`) |
| Clean diff (test files only) | PASS (4 new files, 0 modified) |
| Commit message format | PASS (`test:` prefix, references #980, lists tags) |
| Closing keyword | PASS (`Closes #981`) |
| No unrelated changes | PASS |

This is a well-structured TDD PR that accurately captures the bug behavior described in #980. Ready for merge.


freemo referenced this pull request

2026-03-23 03:41:46 +00:00

`agents skill` does not register when skills are added. #620

brent.edwards force-pushed tdd/m3-skill-add-regression from 7fdbf5d4c6 to 720eeea49e

2026-03-23 05:38:30 +00:00


brent.edwards dismissed freemo's review 2026-03-23 05:38:30 +00:00

Reason:

New commits pushed, approval review dismissed automatically according to repository settings

freemo force-pushed tdd/m3-skill-add-regression from 720eeea49e to 6f94183359

2026-03-23 12:00:40 +00:00


brent.edwards changed title from ~~test: add TDD bug-capture test for #980 — skill add cross-process persistence~~ to test: add TDD bug-capture test for #620 — skill add cross-process persistence

2026-03-23 21:24:03 +00:00

brent.edwards force-pushed tdd/m3-skill-add-regression from 6f94183359 to 5bf0baa1db

2026-03-23 21:24:19 +00:00


freemo force-pushed tdd/m3-skill-add-regression from 5bf0baa1db to 49c4473218

2026-03-24 00:19:38 +00:00


freemo force-pushed tdd/m3-skill-add-regression from 49c4473218 to e709e3bd95

2026-03-24 01:39:56 +00:00


freemo force-pushed tdd/m3-skill-add-regression from e709e3bd95 to a5ced58289

2026-03-24 15:22:38 +00:00


freemo approved these changes 2026-03-24 15:28:03 +00:00

freemo left a comment

Review: APPROVED

The most thorough TDD test in the batch. Uses real subprocess.run calls across process boundaries (not just in-process CliRunner) — sets up temp CLEVERAGENTS_HOME, on-disk SQLite, runs agents init, skill add, and skill list/skill show as independent OS processes. Tags correct. Both Behave and Robot tests present with proper timeout handling.
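
As a rough idea of what timeout handling around `subprocess.run` can look like (a sketch under assumptions, not the helper from this PR; `run_with_timeout` is a hypothetical name):

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run a command; return (returncode, stdout), with returncode None on timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising; exc.stdout may hold partial output.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial
    return done.returncode, done.stdout
```

Returning `None` for the return code keeps a hung command distinguishable from one that exited non-zero, so a test can report "timed out" instead of a misleading assertion failure on stdout.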

Minor Note

Unrelated change in noxfile.py adds session.install("setuptools<81") to the security_scan session. This is a drive-by fix for pkg_resources/semgrep compatibility. Per CONTRIBUTING.md §Atomic Commits: "Do not mix concerns." Ideally this should be a separate commit, though it's understandable as a pragmatic fix to unblock CI.


freemo closed this pull request

2026-03-24 20:27:34 +00:00

brent.edwards commented

2026-03-24 23:40:24 +00:00

Self-QA Closeout Status (Current Work Posted)

Current PR state

PR status observed during closeout: closed (not merged)
Branch: tdd/m3-skill-add-regression

Work completed in the latest self-QA pass

Aligned references to ticket/bug scope (#1091 / #620) across tags, commit/PR metadata, and docs.
Added missing returncode == 0 guard in skill list assertions (Behave + Robot), matching skill show strictness.
Updated PR description with closeout details.

Quality-gate snapshot from latest fix pass

lint ✅
typecheck ✅
unit_tests ✅
e2e_tests ✅
coverage_report ✅
integration_tests ⚠️ (reported as unrelated/pre-existing suite instability)

Closeout note

This comment records the latest implemented work from the internal self-QA loop. Since the PR is currently closed, merge follow-up requires maintainer direction (reopen this PR or open a successor PR from the updated branch state).


CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 26s (Required)
CI / lint (pull_request) Successful in 3m20s (Required)
CI / quality (pull_request) Successful in 3m48s (Required)
CI / typecheck (pull_request) Successful in 4m2s (Required)
CI / security (pull_request) Successful in 4m6s (Required)
CI / e2e_tests (pull_request) Successful in 6m14s
CI / integration_tests (pull_request) Successful in 7m1s (Required)
CI / unit_tests (pull_request) Successful in 7m33s (Required)
CI / docker (pull_request) Successful in 1m35s (Required)
CI / coverage (pull_request) Successful in 11m53s (Required)
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Failing after 28m26s


Blocks

#1091 TDD: Write failing test for #620 — skill add does not persist across CLI invocations

cleveragents/cleveragents-core

Reference: cleveragents/cleveragents-core#1110
