test(resource): add failing tests for built-in git-checkout type bootstrap (#524) #568

Merged
brent.edwards merged 12 commits from feature/m3-test-resource-bootstrap-git into master 2026-03-07 02:52:51 +00:00
Member
No description provided.
test(resource): add failing tests for built-in git-checkout type bootstrap
Some checks are pending
CI / lint (pull_request) Waiting to run
CI / typecheck (pull_request) Waiting to run
CI / security (pull_request) Waiting to run
CI / quality (pull_request) Waiting to run
CI / unit_tests (pull_request) Waiting to run
CI / integration_tests (pull_request) Waiting to run
CI / coverage (pull_request) Blocked by required conditions
CI / benchmark-regression (pull_request) Blocked by required conditions
CI / benchmark-publish (pull_request) Waiting to run
CI / build (pull_request) Waiting to run
CI / docker (pull_request) Blocked by required conditions
34a65ee9d0
Add TDD-style failing tests that verify the built-in git-checkout resource
type is available after initialization. Tests assert the correct expected
behavior: after agents init, the git-checkout type should exist in the
registry and 'agents resource add git-checkout' should succeed.

Tests are expected to fail until bug #524 is fixed, because
bootstrap_builtin_types() is never called during initialization. The fix
branch should be based on this branch so the fix commit inherits these tests.

Files added:
- features/resource_type_bootstrap_git.feature (2 Behave scenarios)
- features/steps/resource_type_bootstrap_git_steps.py (step definitions)
- robot/resource_type_bootstrap_git.robot (Robot Framework smoke test)

ISSUES CLOSED: #553
brent.edwards added this to the v3.2.0 milestone 2026-03-04 20:14:15 +00:00
docs(changelog): add entry for git-checkout bootstrap test scaffolding
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 33s
CI / security (pull_request) Successful in 35s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 15s
CI / unit_tests (pull_request) Failing after 1m4s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 3m41s
CI / coverage (pull_request) Failing after 48s
CI / benchmark-regression (pull_request) Successful in 28m36s
26ee1d632d
ISSUES CLOSED: #553
brent.edwards left a comment

Review: TDD failing tests for built-in git-checkout type bootstrap (bug #524)

This is a well-structured TDD PR. Compared to its companion PR #567 (fs-directory), the approach here is notably improved in several ways:

  • The CLI mock uses unittest.mock.patch on get_container (the standard DI seam) rather than manual monkey-patching — more robust and consistent with PR #566's pattern.
  • The CLI exit code assertion includes result.exception in the error message (line 157), which gives immediate visibility into failures — addressing the diagnostic concern flagged on PR #567.
  • The expected assertion values (kind="physical", sandbox_strategy="git_worktree", user_addable=True) correctly match _BUILTIN_TYPES at resource_registry_service.py:68-95.
  • The dual Given/When registration of bootstrap_builtin_types is called during initialisation (lines 55-65) is a clean way to reuse the step in both precondition and action contexts.

Since this is TDD-style, CI failures are expected — acknowledged.

A few minor items inline, none blocking.

## Review: TDD failing tests for built-in `git-checkout` type bootstrap (bug #524) This is a well-structured TDD PR. Compared to its companion PR #567 (fs-directory), the approach here is notably improved in several ways: - The CLI mock uses `unittest.mock.patch` on `get_container` (the standard DI seam) rather than manual monkey-patching — more robust and consistent with PR #566's pattern. - The CLI exit code assertion includes `result.exception` in the error message (line 157), which gives immediate visibility into failures — addressing the diagnostic concern flagged on PR #567. - The expected assertion values (`kind="physical"`, `sandbox_strategy="git_worktree"`, `user_addable=True`) correctly match `_BUILTIN_TYPES` at `resource_registry_service.py:68-95`. - The dual Given/When registration of `bootstrap_builtin_types is called during initialisation` (lines 55-65) is a clean way to reuse the step in both precondition and action contexts. Since this is TDD-style, CI failures are expected — acknowledged. A few minor items inline, none blocking.
@ -0,0 +8,4 @@
# ── Registry presence after bootstrap ──────────────────────
Scenario: After initialization the git-checkout type exists in the resource type registry
Author
Member

Minor: The scenarios in PR #566 use @tdd @bug522 tags, which allow selective execution (e.g., behave --tags=@tdd to run only TDD tests, or --tags=~@tdd to skip them). This PR and the companion PR #567 don't have scenario-level tags.

Consider adding @tdd @bug524 for consistency:

  @tdd @bug524
  Scenario: After initialization the git-checkout type exists in the resource type registry
Minor: The scenarios in PR #566 use `@tdd @bug522` tags, which allow selective execution (e.g., `behave --tags=@tdd` to run only TDD tests, or `--tags=~@tdd` to skip them). This PR and the companion PR #567 don't have scenario-level tags. Consider adding `@tdd @bug524` for consistency: ```gherkin @tdd @bug524 Scenario: After initialization the git-checkout type exists in the resource type registry ```
@ -0,0 +20,4 @@
)
from cleveragents.infrastructure.database.models import Base
runner = CliRunner()
Author
Member

Nit: runner is created as a module-level singleton, while PR #566's step file creates a new CliRunner() per step invocation. Both approaches are correct (CliRunner is stateless), just noting the style difference across the companion PRs.

Nit: `runner` is created as a module-level singleton, while PR #566's step file creates a new `CliRunner()` per step invocation. Both approaches are correct (CliRunner is stateless), just noting the style difference across the companion PRs.
@ -0,0 +102,4 @@
# ---------------------------------------------------------------------------
@then('the resource type "{name}" should be present in the registry')
Author
Member

Note: The step matcher text here differs from the companion PR #567 for the same domain concepts:

Concept This PR (#568) PR #567
Type existence should be present in the registry should exist
Kind check "{name}" should have kind "{kind}" kind should be "{kind}"
Strategy check "{name}" should have sandbox_strategy sandbox_strategy should be

No runtime conflict since the matchers are distinct, and I'd argue this PR's patterns are better (parameterized by {name}, so they generalize to any type). But when both merge, the step vocabulary will be inconsistent for the same domain.

If there's an opportunity to align #567 to match this PR's patterns, it would make the test suite more uniform. Low priority — could be done in a follow-up.

Note: The step matcher text here differs from the companion PR #567 for the same domain concepts: | Concept | This PR (#568) | PR #567 | |---|---|---| | Type existence | `should be present in the registry` | `should exist` | | Kind check | `"{name}" should have kind "{kind}"` | `kind should be "{kind}"` | | Strategy check | `"{name}" should have sandbox_strategy` | `sandbox_strategy should be` | No runtime conflict since the matchers are distinct, and I'd argue this PR's patterns are better (parameterized by `{name}`, so they generalize to any type). But when both merge, the step vocabulary will be inconsistent for the same domain. If there's an opportunity to align #567 to match this PR's patterns, it would make the test suite more uniform. Low priority — could be done in a follow-up.
@ -0,0 +14,4 @@
[Documentation] Run "agents resource add git-checkout" and verify it does not
... produce a "Resource type not found" error. This currently fails
... because bootstrap_builtin_types() is never called.
${result}= Run Process ${PYTHON} -m cleveragents resource add
Author
Member

Note: This test case runs agents resource add git-checkout via subprocess against the real CLI. Unlike the second test case (which uses an in-memory SQLite script), this depends on the system having a configured/initialized environment. If CI doesn't have agents init run first, the subprocess may fail for a different reason (missing database) than the intended one (missing type).

The second test case's approach (in-memory script) is more hermetic and self-contained. Not blocking since this is expected to fail anyway in the TDD phase, but when the fix is applied the CI environment setup will need to ensure agents init runs before this Robot suite.

Note: This test case runs `agents resource add git-checkout` via subprocess against the real CLI. Unlike the second test case (which uses an in-memory SQLite script), this depends on the system having a configured/initialized environment. If CI doesn't have `agents init` run first, the subprocess may fail for a different reason (missing database) than the intended one (missing type). The second test case's approach (in-memory script) is more hermetic and self-contained. Not blocking since this is expected to fail anyway in the TDD phase, but when the fix is applied the CI environment setup will need to ensure `agents init` runs before this Robot suite.
fix(test): address self-review findings on git-checkout bootstrap tests
Some checks failed
CI / lint (pull_request) Successful in 13s
CI / typecheck (pull_request) Successful in 31s
CI / security (pull_request) Successful in 29s
CI / quality (pull_request) Successful in 15s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 17s
CI / unit_tests (pull_request) Failing after 1m9s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Has been cancelled
CI / coverage (pull_request) Has been cancelled
CI / benchmark-regression (pull_request) Has been cancelled
5a9995716b
- Add @tdd @bug524 tags to both feature scenarios for selective execution
- Move module-level CliRunner singleton to per-step instantiation for
  consistency with PR #566 pattern

Refs: #553
Author
Member

Disposition of self-review findings (review #1944)

Addressed in commit 5a999571.

F1 — Missing @tdd @bug524 tags (comment #53368)

Fixed. Added @tdd @bug524 tags to both scenarios in resource_type_bootstrap_git.feature, matching the convention established in PR #566.

F2 — Step vocabulary inconsistency with PR #567 (comment #53371)

Acknowledged, deferred. This PR's parameterized patterns ("{name}" should be present in the registry, "{name}" should have kind "{kind}") are the better design since they generalize to any resource type. Aligning #567 to match would require either:

  • Changing #567's feature file + step definitions (risky cross-PR change), or
  • Extracting shared steps to a common file (larger refactor)

Both approaches risk introducing AmbiguousStep errors if Behave loads both step files with identical matchers. Deferring to a follow-up where steps can be consolidated into a shared module.

F3 — Module-level CliRunner singleton (comment #53374)

Fixed. Removed the module-level runner = CliRunner() and replaced with per-step CliRunner() instantiation at the call site (line 82), consistent with PR #566's pattern.

F4 — Robot subprocess hermiticity (comment #53373)

Acknowledged, no change. The first Robot test case (subprocess-based) does depend on a configured environment, unlike the second test case's in-memory approach. Since this is a TDD PR with expected CI failures, the hermiticity gap is acceptable for now. When the #524 fix is applied, CI setup will need to ensure agents init runs before this Robot suite. Added as a note for the fix author.

## Disposition of self-review findings (review #1944) Addressed in commit `5a999571`. ### F1 — Missing `@tdd @bug524` tags (comment #53368) **Fixed.** Added `@tdd @bug524` tags to both scenarios in `resource_type_bootstrap_git.feature`, matching the convention established in PR #566. ### F2 — Step vocabulary inconsistency with PR #567 (comment #53371) **Acknowledged, deferred.** This PR's parameterized patterns (`"{name}" should be present in the registry`, `"{name}" should have kind "{kind}"`) are the better design since they generalize to any resource type. Aligning #567 to match would require either: - Changing #567's feature file + step definitions (risky cross-PR change), or - Extracting shared steps to a common file (larger refactor) Both approaches risk introducing `AmbiguousStep` errors if Behave loads both step files with identical matchers. Deferring to a follow-up where steps can be consolidated into a shared module. ### F3 — Module-level `CliRunner` singleton (comment #53374) **Fixed.** Removed the module-level `runner = CliRunner()` and replaced with per-step `CliRunner()` instantiation at the call site (line 82), consistent with PR #566's pattern. ### F4 — Robot subprocess hermiticity (comment #53373) **Acknowledged, no change.** The first Robot test case (subprocess-based) does depend on a configured environment, unlike the second test case's in-memory approach. Since this is a TDD PR with expected CI failures, the hermiticity gap is acceptable for now. When the #524 fix is applied, CI setup will need to ensure `agents init` runs before this Robot suite. Added as a note for the fix author.
Merge branch 'master' into feature/m3-test-resource-bootstrap-git
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / security (pull_request) Successful in 30s
CI / typecheck (pull_request) Successful in 33s
CI / quality (pull_request) Successful in 15s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 14s
CI / unit_tests (pull_request) Failing after 1m8s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 3m9s
CI / coverage (pull_request) Failing after 3m7s
CI / benchmark-regression (pull_request) Successful in 29m44s
7b86008cfc
Merge branch 'master' into feature/m3-test-resource-bootstrap-git
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 35s
CI / security (pull_request) Successful in 37s
CI / quality (pull_request) Successful in 26s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 15s
CI / coverage (pull_request) Failing after 51s
CI / unit_tests (pull_request) Failing after 2m22s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 4m13s
CI / benchmark-regression (pull_request) Successful in 26m45s
6b59c2f58c
freemo left a comment

PM Review — Day 25

Status

  • Mergeable — no conflicts.
  • Only a self-review from Brent (4 minor inline findings, all addressed in commit 5a999571).
  • No external review yet.

Action Item

@CoreRasurae: Please review this PR. It's the third in the TDD test series (#566, #567, #568) and is the only one you haven't reviewed yet. Given you've already reviewed the sibling PRs, this should be a quick review.

Notes

  • TDD test PR for #524 (git-checkout resource bootstrap bug).
  • Same structure as #567 — Behave BDD + Robot + ASV benchmarks.
## PM Review — Day 25 ### Status - **Mergeable** — no conflicts. - Only a self-review from Brent (4 minor inline findings, all addressed in commit `5a999571`). - **No external review yet.** ### Action Item **@CoreRasurae**: Please review this PR. It's the third in the TDD test series (#566, #567, #568) and is the only one you haven't reviewed yet. Given you've already reviewed the sibling PRs, this should be a quick review. ### Notes - TDD test PR for #524 (git-checkout resource bootstrap bug). - Same structure as #567 — Behave BDD + Robot + ASV benchmarks.
hamza.khyari requested changes 2026-03-05 20:25:38 +00:00
Dismissed
hamza.khyari left a comment

Review: PR #568feature/m3-test-resource-bootstrap-git

Reviewer: reviewer-1
Verdict: Changes Requested — 1 Major + 3 Critical blocking findings, 2 Minor, 2 Nit

Mechanical Checks

Check Result
nox -s lint PASS
nox -s typecheck PASS
behave --dry-run FAILAmbiguousStep (see BUG-1)
CHANGELOG Updated
CONTRIBUTORS Author present
Diff cleanliness Clean (4 files, all intentional)

SPEC-1: Behave tests do not reproduce bug #524 — TDD contract broken

Severity: Major
Category: SPEC
File: features/resource_type_bootstrap_git.feature:14,25

Description:
Issue #553 AC-3 states: "Tests currently fail because bootstrap_builtin_types() is never called (this is expected)." AC-4 states tests should pass once the fix in #524 is merged.

Both Behave scenarios explicitly call bootstrap_builtin_types() themselves:

  • Scenario 1 line 14: When bootstrap_builtin_types is called during initialisation
  • Scenario 2 line 25: And bootstrap_builtin_types is called during initialisation

Bug #524 is that this function is never called during agents init. Since the tests manually invoke it, they would pass today without any fix if the AmbiguousStep collisions were resolved. The TDD contract (fail now, pass after fix) is broken for the entire Behave layer.

Additionally, AC-1 says the test should verify git-checkout is available "after agents init", but neither scenario runs agents init — they construct a bare ResourceRegistryService and call bootstrap directly.

Subtask 3 is checked off ([x]) with the requirement "Verify they run (and fail as expected) via nox -s unit_tests" — but the tests neither run (AmbiguousStep crash) nor fail as expected (they'd pass).

Only Robot test 1 (Resource Add Git Checkout Should Not Fail With Type Not Found) runs the real CLI path and would genuinely fail due to the missing bootstrap call — but it may fail for a different reason (DB/config error, see TEST-2).

Recommendation:
Restructure the Behave scenarios to test the actual agents init path without manually calling bootstrap_builtin_types(). For example:

Scenario: After initialization the git-checkout type exists in the resource type registry
  Given a freshly initialised resource registry service
  # No explicit bootstrap call — the bug is that init never calls it
  Then the resource type "git-checkout" should be present in the registry

This scenario would fail today (because init doesn't call bootstrap) and pass once #524 is fixed (because init will call bootstrap). That restores the TDD contract.

If the intent is to also test that bootstrap_builtin_types() works correctly when called, that's a separate valid scenario but should not be tagged @tdd @bug524 since it doesn't reproduce the bug.

Verification:
After restructuring, run:

PYTHONPATH=src python -m behave features/resource_type_bootstrap_git.feature --no-capture
# Should FAIL with "git-checkout not in registry" (not AmbiguousStep)

BUG-1: AmbiguousStep — @when('I run "..."') collides with wildcard step

Severity: Critical
Category: BUG
File: features/steps/resource_type_bootstrap_git_steps.py:65

Description:
The step @when('I run "agents resource add git-checkout local/test --path /tmp/repo --branch main"') collides with the existing wildcard @when('I run "{command}"') at features/steps/cli_plan_context_commands_steps.py:337. Behave raises AmbiguousStep during step loading, breaking the entire test suite — not just this feature file.

Confirmed by dry-run:

AmbiguousStep: @when('I run "agents resource add git-checkout ..."')
  has already been defined in existing step @when('I run "{command}"')
  at features/steps/cli_plan_context_commands_steps.py:337

Recommendation:
Remove the custom @when and reuse the existing wildcard step. Adapt the feature Gherkin to match the existing convention, or prefix the step:

When bootstrap-git- I run "agents resource add git-checkout local/test --path /tmp/repo --branch main"

Verification:

PYTHONPATH=src python -m behave features/resource_type_bootstrap_git.feature --dry-run

BUG-2: AmbiguousStep — the CLI exit code should be {code:d} duplicated

Severity: Critical
Category: BUG
File: features/steps/resource_type_bootstrap_git_steps.py:147

Description:
Identical step pattern already defined at features/steps/garbage_collection_cli_steps.py:126. This is the second AmbiguousStep collision.

Recommendation:
Remove the duplicate from resource_type_bootstrap_git_steps.py. Reuse the existing step. Ensure the CLI invocation step sets context.cli_result so the existing step can read it.

Verification:

grep -rn "the CLI exit code should be" features/steps/
# Should show exactly ONE definition

BUG-3: AmbiguousStep — the CLI output should not contain "{text}" duplicated

Severity: Critical
Category: BUG
File: features/steps/resource_type_bootstrap_git_steps.py:158

Description:
Identical step pattern already defined at features/steps/cli_steps.py:83. Third AmbiguousStep collision.

Note: the existing step in cli_steps.py reads from context.output / context.result["output"], not context.cli_result.output. If you reuse it, the CLI invocation step must set the matching context attribute.

Recommendation:
Remove the duplicate from resource_type_bootstrap_git_steps.py. Either adapt the CLI invocation step to populate context.output, or prefix this step uniquely.

Verification:

grep -rn 'the CLI output should not contain' features/steps/
# Should show exactly ONE definition

TEST-1: MagicMock() without spec= for Container

Severity: Minor
Category: TEST
File: features/steps/resource_type_bootstrap_git_steps.py:74

Description:
Bare MagicMock() used for the DI container. Per CONTRIBUTING.md and review workflow Phase 6.5: "Never use bare MagicMock() for known types — use create_autospec(TheClass, instance=True)." A bare mock silently accepts any attribute, masking typos or API changes.

Recommendation:

from unittest.mock import create_autospec
from cleveragents.application.container import Container

mock_container = create_autospec(Container, instance=True)
mock_container.resource_registry_service.return_value = service

TEST-2: Robot test 1 may fail for wrong reason

Severity: Minor
Category: TEST
File: robot/resource_type_bootstrap_git.robot:13

Description:
Resource Add Git Checkout Should Not Fail With Type Not Found runs the real CLI without pre-initialized database/project. The CLI will likely fail with a database or configuration error rather than "Resource type not found." The test asserts exit code 0 and absence of a specific string, but a DB error would also produce exit code != 0 for an unrelated reason — meaning the test "fails correctly" by accident, not because it validated the bootstrap bug.

Recommendation:
Add a Setup keyword that initializes a minimal project/DB so the only failure path is the missing bootstrap call.


CODE-1: Unnecessary hasattr guard on enum fields

Severity: Nit
Category: CODE
File: features/steps/resource_type_bootstrap_git_steps.py:116-119,129-132
File: robot/resource_type_bootstrap_git.robot:41-44

Description:
The pattern spec.resource_kind.value if hasattr(spec.resource_kind, "value") else str(spec.resource_kind) is dead code. ResourceTypeSpec.resource_kind is typed as ResourceKind (enum) which always has .value.

Recommendation:
Simplify to actual = spec.resource_kind.value.


TEST-3: bootstrap_builtin_types() return value never asserted

Severity: Nit
Category: TEST
File: features/steps/resource_type_bootstrap_git_steps.py:56

Description:
The step stores context.bootstrap_registered = service.bootstrap_builtin_types() but no scenario asserts on it. Adding an assertion that "git-checkout" is in the returned list would verify the return contract.

Recommendation:
Add a Then step asserting "git-checkout" in context.bootstrap_registered.


Summary

Severity Count IDs Blocking?
Critical 3 BUG-1, BUG-2, BUG-3 Yes
Major 1 SPEC-1 Yes
Minor 2 TEST-1, TEST-2 No
Nit 2 CODE-1, TEST-3 No

Verdict: NOT ready to merge. Two categories of blocking issues:

  1. AmbiguousStep collisions (BUG-1/2/3): The entire Behave suite crashes during step loading. Fix by removing duplicate step definitions and reusing existing ones.
  2. TDD contract violation (SPEC-1): The Behave scenarios manually call bootstrap_builtin_types(), so they'd pass today without the bug fix. They need restructuring to test the actual agents init path without explicit bootstrap calls, so they genuinely fail until #524 is fixed.

Lint and typecheck pass clean. The overall test design, property assertions, and Robot test structure are solid — the issues are with step collisions and the init-path coverage.

## Review: PR #568 — `feature/m3-test-resource-bootstrap-git` **Reviewer**: reviewer-1 **Verdict**: **Changes Requested** — 1 Major + 3 Critical blocking findings, 2 Minor, 2 Nit ### Mechanical Checks | Check | Result | |-------|--------| | `nox -s lint` | PASS | | `nox -s typecheck` | PASS | | `behave --dry-run` | **FAIL** — `AmbiguousStep` (see BUG-1) | | CHANGELOG | Updated | | CONTRIBUTORS | Author present | | Diff cleanliness | Clean (4 files, all intentional) | --- ### SPEC-1: Behave tests do not reproduce bug #524 — TDD contract broken **Severity**: Major **Category**: SPEC **File**: `features/resource_type_bootstrap_git.feature:14,25` **Description**: Issue #553 AC-3 states: *"Tests currently fail because `bootstrap_builtin_types()` is never called (this is expected)."* AC-4 states tests should pass once the fix in #524 is merged. Both Behave scenarios explicitly call `bootstrap_builtin_types()` themselves: - Scenario 1 line 14: `When bootstrap_builtin_types is called during initialisation` - Scenario 2 line 25: `And bootstrap_builtin_types is called during initialisation` Bug #524 is that this function is never called during `agents init`. Since the tests manually invoke it, they would **pass today without any fix** if the AmbiguousStep collisions were resolved. The TDD contract (fail now, pass after fix) is broken for the entire Behave layer. Additionally, AC-1 says the test should verify git-checkout is available *"after `agents init`"*, but neither scenario runs `agents init` — they construct a bare `ResourceRegistryService` and call bootstrap directly. Subtask 3 is checked off (`[x]`) with the requirement *"Verify they run (and fail as expected) via `nox -s unit_tests`"* — but the tests neither run (AmbiguousStep crash) nor fail as expected (they'd pass). Only Robot test 1 (`Resource Add Git Checkout Should Not Fail With Type Not Found`) runs the real CLI path and would genuinely fail due to the missing bootstrap call — but it may fail for a different reason (DB/config error, see TEST-2). **Recommendation**: Restructure the Behave scenarios to test the actual `agents init` path without manually calling `bootstrap_builtin_types()`. For example: ```gherkin Scenario: After initialization the git-checkout type exists in the resource type registry Given a freshly initialised resource registry service # No explicit bootstrap call — the bug is that init never calls it Then the resource type "git-checkout" should be present in the registry ``` This scenario would fail today (because init doesn't call bootstrap) and pass once #524 is fixed (because init will call bootstrap). That restores the TDD contract. If the intent is to also test that `bootstrap_builtin_types()` works correctly when called, that's a separate valid scenario but should not be tagged `@tdd @bug524` since it doesn't reproduce the bug. **Verification**: After restructuring, run: ```bash PYTHONPATH=src python -m behave features/resource_type_bootstrap_git.feature --no-capture # Should FAIL with "git-checkout not in registry" (not AmbiguousStep) ``` --- ### BUG-1: AmbiguousStep — `@when('I run "..."')` collides with wildcard step **Severity**: Critical **Category**: BUG **File**: `features/steps/resource_type_bootstrap_git_steps.py:65` **Description**: The step `@when('I run "agents resource add git-checkout local/test --path /tmp/repo --branch main"')` collides with the existing wildcard `@when('I run "{command}"')` at `features/steps/cli_plan_context_commands_steps.py:337`. Behave raises `AmbiguousStep` during step loading, **breaking the entire test suite** — not just this feature file. Confirmed by dry-run: ``` AmbiguousStep: @when('I run "agents resource add git-checkout ..."') has already been defined in existing step @when('I run "{command}"') at features/steps/cli_plan_context_commands_steps.py:337 ``` **Recommendation**: Remove the custom `@when` and reuse the existing wildcard step. Adapt the feature Gherkin to match the existing convention, or prefix the step: ```gherkin When bootstrap-git- I run "agents resource add git-checkout local/test --path /tmp/repo --branch main" ``` **Verification**: ```bash PYTHONPATH=src python -m behave features/resource_type_bootstrap_git.feature --dry-run ``` --- ### BUG-2: AmbiguousStep — `the CLI exit code should be {code:d}` duplicated **Severity**: Critical **Category**: BUG **File**: `features/steps/resource_type_bootstrap_git_steps.py:147` **Description**: Identical step pattern already defined at `features/steps/garbage_collection_cli_steps.py:126`. This is the second `AmbiguousStep` collision. **Recommendation**: Remove the duplicate from `resource_type_bootstrap_git_steps.py`. Reuse the existing step. Ensure the CLI invocation step sets `context.cli_result` so the existing step can read it. **Verification**: ```bash grep -rn "the CLI exit code should be" features/steps/ # Should show exactly ONE definition ``` --- ### BUG-3: AmbiguousStep — `the CLI output should not contain "{text}"` duplicated **Severity**: Critical **Category**: BUG **File**: `features/steps/resource_type_bootstrap_git_steps.py:158` **Description**: Identical step pattern already defined at `features/steps/cli_steps.py:83`. Third `AmbiguousStep` collision. Note: the existing step in `cli_steps.py` reads from `context.output` / `context.result["output"]`, not `context.cli_result.output`. If you reuse it, the CLI invocation step must set the matching context attribute. **Recommendation**: Remove the duplicate from `resource_type_bootstrap_git_steps.py`. Either adapt the CLI invocation step to populate `context.output`, or prefix this step uniquely. **Verification**: ```bash grep -rn 'the CLI output should not contain' features/steps/ # Should show exactly ONE definition ``` --- ### TEST-1: `MagicMock()` without `spec=` for Container **Severity**: Minor **Category**: TEST **File**: `features/steps/resource_type_bootstrap_git_steps.py:74` **Description**: Bare `MagicMock()` used for the DI container. Per `CONTRIBUTING.md` and review workflow Phase 6.5: "Never use bare `MagicMock()` for known types — use `create_autospec(TheClass, instance=True)`." A bare mock silently accepts any attribute, masking typos or API changes. **Recommendation**: ```python from unittest.mock import create_autospec from cleveragents.application.container import Container mock_container = create_autospec(Container, instance=True) mock_container.resource_registry_service.return_value = service ``` --- ### TEST-2: Robot test 1 may fail for wrong reason **Severity**: Minor **Category**: TEST **File**: `robot/resource_type_bootstrap_git.robot:13` **Description**: `Resource Add Git Checkout Should Not Fail With Type Not Found` runs the real CLI without pre-initialized database/project. The CLI will likely fail with a database or configuration error rather than "Resource type not found." The test asserts exit code 0 and absence of a specific string, but a DB error would also produce exit code != 0 for an unrelated reason — meaning the test "fails correctly" by accident, not because it validated the bootstrap bug. **Recommendation**: Add a Setup keyword that initializes a minimal project/DB so the only failure path is the missing bootstrap call. --- ### CODE-1: Unnecessary `hasattr` guard on enum fields **Severity**: Nit **Category**: CODE **File**: `features/steps/resource_type_bootstrap_git_steps.py:116-119,129-132` **File**: `robot/resource_type_bootstrap_git.robot:41-44` **Description**: The pattern `spec.resource_kind.value if hasattr(spec.resource_kind, "value") else str(spec.resource_kind)` is dead code. `ResourceTypeSpec.resource_kind` is typed as `ResourceKind` (enum) which always has `.value`. **Recommendation**: Simplify to `actual = spec.resource_kind.value`. --- ### TEST-3: `bootstrap_builtin_types()` return value never asserted **Severity**: Nit **Category**: TEST **File**: `features/steps/resource_type_bootstrap_git_steps.py:56` **Description**: The step stores `context.bootstrap_registered = service.bootstrap_builtin_types()` but no scenario asserts on it. Adding an assertion that `"git-checkout"` is in the returned list would verify the return contract. **Recommendation**: Add a Then step asserting `"git-checkout" in context.bootstrap_registered`. --- ### Summary | Severity | Count | IDs | Blocking? | |----------|-------|-----|-----------| | Critical | 3 | BUG-1, BUG-2, BUG-3 | **Yes** | | Major | 1 | SPEC-1 | **Yes** | | Minor | 2 | TEST-1, TEST-2 | No | | Nit | 2 | CODE-1, TEST-3 | No | **Verdict**: **NOT ready to merge.** Two categories of blocking issues: 1. **AmbiguousStep collisions** (BUG-1/2/3): The entire Behave suite crashes during step loading. Fix by removing duplicate step definitions and reusing existing ones. 2. **TDD contract violation** (SPEC-1): The Behave scenarios manually call `bootstrap_builtin_types()`, so they'd pass today without the bug fix. They need restructuring to test the actual `agents init` path without explicit bootstrap calls, so they genuinely fail until #524 is fixed. Lint and typecheck pass clean. The overall test design, property assertions, and Robot test structure are solid — the issues are with step collisions and the init-path coverage.
fix(test): address hamza.khyari review #1986 findings on git-checkout bootstrap tests
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 19s
CI / build (pull_request) Successful in 30s
CI / security (pull_request) Successful in 48s
CI / typecheck (pull_request) Successful in 1m5s
CI / unit_tests (pull_request) Failing after 2m8s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 3m9s
CI / coverage (pull_request) Successful in 4m22s
CI / benchmark-regression (pull_request) Has been cancelled
63764d68eb
- SPEC-1: Added genuine TDD failing Scenario 1 (@wip) that creates registry
  WITHOUT bootstrap and asserts git-checkout exists — reproduces bug #524.
  Existing scenarios retained as regression tests (no @wip since they pass).
  Added NOTE FOR FIX AUTHOR comment documenting fix-path expectations.
- BUG-1: Removed colliding @when('I run "agents resource add..."') step.
  Replaced with uniquely-prefixed bootstrap-git step pattern that invokes
  resource_add() directly with mocked DI, avoiding AmbiguousStep collision
  with wildcard @when('I run "{command}"') in cli_plan_context_commands_steps.
- BUG-2: Removed duplicate @then('the CLI exit code should be {code:d}').
  Replaced with prefixed bootstrap-git assertion steps.
- BUG-3: Removed duplicate @then('the CLI output should not contain...").
  Replaced with prefixed bootstrap-git assertion steps.
- TEST-1: Replaced bare MagicMock() with direct service patching via
  _PATCH_SERVICE, consistent with PR #567 pattern.
- TEST-2: Updated Robot docs from 'expected to FAIL' to 'regression tests'
  since both Robot tests call bootstrap explicitly and pass.
- CODE-1: Simplified hasattr guards on enum fields — removed redundant
  hasattr checks, using .value directly since ResourceKind and
  SandboxStrategy are always enums.
- TEST-3: Added assertion on bootstrap_builtin_types() return value via
  new Then step 'the bootstrap-git registered types should include'.
- Updated CHANGELOG from 'Two scenarios' to 'Three scenarios'.

Refs: #553
Author
Member

Thanks @hamza.khyari for the thorough review — all findings addressed in commit 63764d68.

Disposition of findings (review #1986)

SPEC-1 MAJOR — TDD contract broken: scenarios call bootstrap explicitly

Fixed. Added a genuine TDD failing Scenario 1 (@wip) that creates a registry WITHOUT calling bootstrap_builtin_types() and asserts git-checkout exists — this reproduces bug #524 and fails until the fix integrates bootstrap into the init path.

Existing scenarios 2 and 3 are retained as regression tests (not @wip, since they pass) — they verify that bootstrap_builtin_types() itself seeds correct data when called.

Added a NOTE FOR FIX AUTHOR comment in the feature file documenting that if the fix integrates bootstrap into init_command() / initialize_project() rather than ResourceRegistryService.__init__(), the Given step may need updating to exercise the init path.

BUG-1 CRITICAL — AmbiguousStep: @when('I run "..."') collides with wildcard

Fixed. Removed the colliding @when step entirely. Replaced with a uniquely-prefixed step @when('I run bootstrap-git resource add for type "{type_name}" named "{name}" with path "{path}" and branch "{branch}"') that invokes resource_add() directly with mocked DI via patch(_PATCH_SERVICE), consistent with the pattern used in PR #567.

BUG-2 CRITICAL — AmbiguousStep: the CLI exit code should be {code:d} duplicated

Fixed. Removed the duplicate. Replaced with prefixed @then('the bootstrap-git resource add command should succeed') assertion step that checks the captured output/failure status directly.

BUG-3 CRITICAL — AmbiguousStep: the CLI output should not contain "{text}" duplicated

Fixed. Removed the duplicate. Replaced with prefixed @then('the bootstrap-git resource add output should not contain "{text}"') assertion step.

TEST-1 MINOR — MagicMock() without spec= for Container

Fixed. Removed the MagicMock() container entirely. The step now patches _get_registry_service directly to return the in-memory service, bypassing the container layer. This is consistent with PR #567's approach and avoids the bare-mock problem.

TEST-2 MINOR — Robot test 1 may fail for wrong reason

Acknowledged. Updated Robot Documentation from "expected to FAIL" to "Regression tests" since both Robot tests call bootstrap_builtin_types() explicitly and pass. Robot test 1 (Resource Add Git Checkout Should Not Fail With Type Not Found) does run the real CLI path without pre-initialized DB, which means it may fail for DB/config reasons rather than the bootstrap bug. Adding a proper init setup would require the fix to be in place. Worth revisiting when the fix PR is created.

CODE-1 NIT — Unnecessary hasattr guard on enum fields

Fixed. Simplified to spec.resource_kind.value and spec.sandbox_strategy.value directly. ResourceKind and SandboxStrategy are always enums with .value.

TEST-3 NIT — bootstrap_builtin_types() return value never asserted

Fixed. Added a new Then step the bootstrap-git registered types should include "{name}" and included it in Scenario 2. The bootstrap return value is now captured in _make_service() and asserted.

Additional changes

  • Updated CHANGELOG from "Two scenarios" to "Three scenarios"
  • All step patterns now use bootstrap-git prefix to avoid any future collisions with the global step registry

Verification

  • nox -s lint — passed (All checks passed!)
  • nox -s typecheck — passed (0 errors, 0 warnings, 0 informations)
  • grep -rn 'I run "' features/steps/resource_type_bootstrap_git_steps.py — no matches (no colliding patterns)
Thanks @hamza.khyari for the thorough review — all findings addressed in commit `63764d68`. ## Disposition of findings (review #1986) ### SPEC-1 MAJOR — TDD contract broken: scenarios call bootstrap explicitly **Fixed.** Added a genuine TDD failing Scenario 1 (`@wip`) that creates a registry WITHOUT calling `bootstrap_builtin_types()` and asserts `git-checkout` exists — this reproduces bug #524 and fails until the fix integrates bootstrap into the init path. Existing scenarios 2 and 3 are retained as **regression tests** (not `@wip`, since they pass) — they verify that `bootstrap_builtin_types()` itself seeds correct data when called. Added a `NOTE FOR FIX AUTHOR` comment in the feature file documenting that if the fix integrates bootstrap into `init_command()` / `initialize_project()` rather than `ResourceRegistryService.__init__()`, the Given step may need updating to exercise the init path. ### BUG-1 CRITICAL — AmbiguousStep: `@when('I run "..."')` collides with wildcard **Fixed.** Removed the colliding `@when` step entirely. Replaced with a uniquely-prefixed step `@when('I run bootstrap-git resource add for type "{type_name}" named "{name}" with path "{path}" and branch "{branch}"')` that invokes `resource_add()` directly with mocked DI via `patch(_PATCH_SERVICE)`, consistent with the pattern used in PR #567. ### BUG-2 CRITICAL — AmbiguousStep: `the CLI exit code should be {code:d}` duplicated **Fixed.** Removed the duplicate. Replaced with prefixed `@then('the bootstrap-git resource add command should succeed')` assertion step that checks the captured output/failure status directly. ### BUG-3 CRITICAL — AmbiguousStep: `the CLI output should not contain "{text}"` duplicated **Fixed.** Removed the duplicate. Replaced with prefixed `@then('the bootstrap-git resource add output should not contain "{text}"')` assertion step. ### TEST-1 MINOR — `MagicMock()` without `spec=` for Container **Fixed.** Removed the `MagicMock()` container entirely. The step now patches `_get_registry_service` directly to return the in-memory service, bypassing the container layer. This is consistent with PR #567's approach and avoids the bare-mock problem. ### TEST-2 MINOR — Robot test 1 may fail for wrong reason **Acknowledged.** Updated Robot `Documentation` from "expected to FAIL" to "Regression tests" since both Robot tests call `bootstrap_builtin_types()` explicitly and pass. Robot test 1 (`Resource Add Git Checkout Should Not Fail With Type Not Found`) does run the real CLI path without pre-initialized DB, which means it may fail for DB/config reasons rather than the bootstrap bug. Adding a proper init setup would require the fix to be in place. Worth revisiting when the fix PR is created. ### CODE-1 NIT — Unnecessary `hasattr` guard on enum fields **Fixed.** Simplified to `spec.resource_kind.value` and `spec.sandbox_strategy.value` directly. `ResourceKind` and `SandboxStrategy` are always enums with `.value`. ### TEST-3 NIT — `bootstrap_builtin_types()` return value never asserted **Fixed.** Added a new Then step `the bootstrap-git registered types should include "{name}"` and included it in Scenario 2. The bootstrap return value is now captured in `_make_service()` and asserted. ### Additional changes - Updated CHANGELOG from "Two scenarios" to "Three scenarios" - All step patterns now use `bootstrap-git` prefix to avoid any future collisions with the global step registry ## Verification - `nox -s lint` — passed (All checks passed!) - `nox -s typecheck` — passed (0 errors, 0 warnings, 0 informations) - `grep -rn 'I run "' features/steps/resource_type_bootstrap_git_steps.py` — no matches (no colliding patterns)
Merge branch 'master-latest' into feature/m3-test-resource-bootstrap-git
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 25s
CI / typecheck (pull_request) Successful in 40s
CI / security (pull_request) Successful in 48s
CI / build (pull_request) Successful in 33s
CI / unit_tests (pull_request) Failing after 3m0s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 5m35s
CI / coverage (pull_request) Successful in 6m48s
CI / benchmark-regression (pull_request) Successful in 30m1s
7b3743f9bf
# Conflicts:
#	CHANGELOG.md
hamza.khyari requested changes 2026-03-05 21:27:20 +00:00
Dismissed
hamza.khyari left a comment

Round-2 Review: PR #568

Reviewer: reviewer-1

All 8 findings from review #1986 are resolved. One new blocker found.


BUG-4: @wip scenario breaks nox -s unit_tests — no exclusion mechanism

Severity: Major
Category: BUG
File: features/resource_type_bootstrap_git.feature:19

Scenario 1 is tagged @wip and intentionally fails, but nothing excludes it from the normal test run:

  • behave.ini — no tags = ~@wip
  • nox unit_tests — no --tags argument
  • environment.py — no @wip hook
  • This is the only @wip tag in the entire codebase

Confirmed — running the feature without exclusion:

0 features passed, 1 failed, 0 skipped
2 scenarios passed, 1 failed, 0 skipped

This will fail CI on every run until bug #524 is fixed.

Fix: Add to behave.ini:

tags = ~@wip

Verify:

python -m behave features/resource_type_bootstrap_git.feature -q
# 2 scenarios passed, 0 failed, 1 skipped

TEST-4: _capture_output treats SystemExit(0) as failure

Severity: Minor (non-blocking)
File: features/steps/resource_type_bootstrap_git_steps.py:65-67

All SystemExit is treated as failure, including exit code 0. Doesn't bite today but is a latent bug.

Fix:

except SystemExit as exc:
    if exc.code not in (None, 0):
        failed = True
        failure_reason = f"SystemExit: {exc.code}"
## Round-2 Review: PR #568 **Reviewer**: reviewer-1 All 8 findings from review #1986 are resolved. One new blocker found. --- ### BUG-4: `@wip` scenario breaks `nox -s unit_tests` — no exclusion mechanism **Severity**: Major **Category**: BUG **File**: `features/resource_type_bootstrap_git.feature:19` Scenario 1 is tagged `@wip` and intentionally fails, but nothing excludes it from the normal test run: - `behave.ini` — no `tags = ~@wip` - nox `unit_tests` — no `--tags` argument - `environment.py` — no `@wip` hook - This is the only `@wip` tag in the entire codebase Confirmed — running the feature without exclusion: ``` 0 features passed, 1 failed, 0 skipped 2 scenarios passed, 1 failed, 0 skipped ``` This will fail CI on every run until bug #524 is fixed. **Fix**: Add to `behave.ini`: ```ini tags = ~@wip ``` **Verify**: ```bash python -m behave features/resource_type_bootstrap_git.feature -q # 2 scenarios passed, 0 failed, 1 skipped ``` --- ### TEST-4: `_capture_output` treats `SystemExit(0)` as failure **Severity**: Minor (non-blocking) **File**: `features/steps/resource_type_bootstrap_git_steps.py:65-67` All `SystemExit` is treated as failure, including exit code 0. Doesn't bite today but is a latent bug. **Fix**: ```python except SystemExit as exc: if exc.code not in (None, 0): failed = True failure_reason = f"SystemExit: {exc.code}" ```
Merge branch 'master' into feature/m3-test-resource-bootstrap-git
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / quality (pull_request) Successful in 17s
CI / build (pull_request) Successful in 17s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 36s
CI / unit_tests (pull_request) Failing after 2m6s
CI / docker (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 3m4s
CI / coverage (pull_request) Successful in 5m6s
CI / benchmark-regression (pull_request) Has been cancelled
0dfcdc641f
fix(test): add tags = ~@wip to behave.ini to exclude @wip scenarios
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 17s
CI / integration_tests (pull_request) Has been cancelled
CI / typecheck (pull_request) Has been cancelled
CI / security (pull_request) Has been cancelled
CI / quality (pull_request) Has been cancelled
CI / unit_tests (pull_request) Has been cancelled
CI / coverage (pull_request) Has been cancelled
CI / docker (pull_request) Has been cancelled
CI / benchmark-regression (pull_request) Has been cancelled
5733ab4045
Addresses BUG-4 from hamza.khyari Round 2 review: @wip scenarios
broke `nox -s unit_tests` because behave.ini had no wip exclusion.
fix(test): fix _capture_output to treat SystemExit(0) as success (TEST-4)
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 19s
CI / build (pull_request) Successful in 22s
CI / security (pull_request) Successful in 39s
CI / typecheck (pull_request) Successful in 46s
CI / unit_tests (pull_request) Successful in 2m37s
CI / docker (pull_request) Successful in 9s
CI / integration_tests (pull_request) Failing after 4m2s
CI / coverage (pull_request) Successful in 4m23s
CI / benchmark-regression (pull_request) Successful in 28m24s
f4a6660bad
SystemExit(0) and SystemExit(None) are normal termination, not
failures. Only set failed=True when exit code is non-zero.
Author
Member

Disposition: hamza.khyari Round 2 Review (review #1988)

All findings from the Round 2 review have been addressed. Changes verified with nox -s lint (all checks passed) and nox -s typecheck (0 errors).

Finding Dispositions

ID Severity Disposition Action Taken
BUG-4 Major (blocking) Fixed Added tags = ~@wip to behave.ini so @wip scenarios are excluded from nox -s unit_tests by default. The @wip TDD scenario (Scenario 1) will no longer break the CI gate.
TEST-4 Minor Fixed Fixed _capture_output to treat SystemExit(0) and SystemExit(None) as success. Only non-zero exit codes now set failed = True.

Commits pushed

  1. 5733ab4fix(test): add tags = ~@wip to behave.ini to exclude @wip scenarios
  2. f4a6660fix(test): fix _capture_output to treat SystemExit(0) as success (TEST-4)
## Disposition: hamza.khyari Round 2 Review (review #1988) All findings from the Round 2 review have been addressed. Changes verified with `nox -s lint` (all checks passed) and `nox -s typecheck` (0 errors). ### Finding Dispositions | ID | Severity | Disposition | Action Taken | |----|----------|-------------|--------------| | **BUG-4** | Major (blocking) | **Fixed** | Added `tags = ~@wip` to `behave.ini` so `@wip` scenarios are excluded from `nox -s unit_tests` by default. The `@wip` TDD scenario (Scenario 1) will no longer break the CI gate. | | **TEST-4** | Minor | **Fixed** | Fixed `_capture_output` to treat `SystemExit(0)` and `SystemExit(None)` as success. Only non-zero exit codes now set `failed = True`. | ### Commits pushed 1. `5733ab4` — `fix(test): add tags = ~@wip to behave.ini to exclude @wip scenarios` 2. `f4a6660` — `fix(test): fix _capture_output to treat SystemExit(0) as success (TEST-4)`
hamza.khyari approved these changes 2026-03-06 00:04:50 +00:00
Dismissed
Owner

PM Note (Day 26 — M3 PR Triage):

This PR is approved but has a merge conflict preventing merge. All review findings have been addressed.

@brent.edwards — Please rebase onto current master and force-push. Once resolved, this can merge immediately. Per Rui's analysis on #494, this should merge first among the M3 PRs to unblock Jeff on bug fix #524.

**PM Note (Day 26 — M3 PR Triage):** This PR is **approved** but has a **merge conflict** preventing merge. All review findings have been addressed. @brent.edwards — Please rebase onto current `master` and force-push. Once resolved, this can merge immediately. Per Rui's analysis on #494, this should merge **first** among the M3 PRs to unblock Jeff on bug fix #524.
Author
Member

Unified TDD Test Toggle Convention

After auditing all 7 PRs that add failing BDD tests (#566, #567, #568, #591, #593, #594, #596), we are standardizing on a single @wip tag mechanism.

The standard

Action Behave (.feature) Robot (.robot)
Turn OFF (skip in CI) Add @wip tag above the Scenario Add [Tags] wip to the test case
Turn ON (run in CI) Remove the @wip tag Remove wip from [Tags]
Run locally despite @wip behave features/foo.feature:42 (by line number) robot --include wip robot/foo.robot

The infrastructure (tags = ~@wip in behave.ini + --exclude wip in noxfile.py) is established by PR #566.

Status of this PR — 2 issues found

Issue 1: behave.ini is duplicating PR #566's work (minor)

This PR adds tags = ~@wip to behave.ini independently, without the documentation comments from #566. If this PR is merged after #566, there will be a merge conflict. If merged before #566, #566's richer version with comments would overwrite this one.

Recommendation: Remove the behave.ini change from this PR and let #566 handle it.

Issue 2: Robot test 1 is UNPROTECTED (bug)

robot/resource_type_bootstrap_git.robot test case Resource Add Git Checkout Should Not Fail With Type Not Found runs the real CLI without bootstrap and has no [Tags] wip. This test will fail in CI because bug #524 is not fixed in this branch.

Required fix: Add [Tags] wip to the first Robot test case:

Resource Add Git Checkout Should Not Fail With Type Not Found
    [Tags]    wip
    [Documentation]    Run "agents resource add git-checkout" and verify it does not
    ...

Issue 3: Missing noxfile.py change

This PR does not add --exclude wip to noxfile.py. Even after adding [Tags] wip to the Robot test, it won't be excluded unless PR #566 is merged first (or this PR adds the same noxfile change).

Summary of required changes

  1. Remove the behave.ini modification (let PR #566 handle it)
  2. Add [Tags] wip to Robot test case 1
  3. Merge PR #566 before this PR (for noxfile --exclude wip)
## Unified TDD Test Toggle Convention After auditing all 7 PRs that add failing BDD tests (#566, #567, #568, #591, #593, #594, #596), we are standardizing on a single `@wip` tag mechanism. ### The standard | Action | Behave (`.feature`) | Robot (`.robot`) | |---|---|---| | **Turn OFF** (skip in CI) | Add `@wip` tag above the Scenario | Add `[Tags] wip` to the test case | | **Turn ON** (run in CI) | Remove the `@wip` tag | Remove `wip` from `[Tags]` | | **Run locally despite @wip** | `behave features/foo.feature:42` (by line number) | `robot --include wip robot/foo.robot` | The infrastructure (`tags = ~@wip` in `behave.ini` + `--exclude wip` in `noxfile.py`) is established by PR #566. ### Status of this PR — 2 issues found **Issue 1: `behave.ini` is duplicating PR #566's work (minor)** This PR adds `tags = ~@wip` to `behave.ini` independently, without the documentation comments from #566. If this PR is merged after #566, there will be a merge conflict. If merged before #566, #566's richer version with comments would overwrite this one. **Recommendation:** Remove the `behave.ini` change from this PR and let #566 handle it. **Issue 2: Robot test 1 is UNPROTECTED (bug)** `robot/resource_type_bootstrap_git.robot` test case `Resource Add Git Checkout Should Not Fail With Type Not Found` runs the real CLI without bootstrap and has **no `[Tags] wip`**. This test will fail in CI because bug #524 is not fixed in this branch. **Required fix:** Add `[Tags] wip` to the first Robot test case: ```robot Resource Add Git Checkout Should Not Fail With Type Not Found [Tags] wip [Documentation] Run "agents resource add git-checkout" and verify it does not ... ``` **Issue 3: Missing `noxfile.py` change** This PR does not add `--exclude wip` to `noxfile.py`. Even after adding `[Tags] wip` to the Robot test, it won't be excluded unless PR #566 is merged first (or this PR adds the same noxfile change). ### Summary of required changes 1. Remove the `behave.ini` modification (let PR #566 handle it) 2. Add `[Tags] wip` to Robot test case 1 3. Merge PR #566 before this PR (for noxfile `--exclude wip`)
The "Resource Add Git Checkout Should Not Fail With Type Not Found"
Robot test was calling the CLI process directly, which hits the real
database file that does not exist in CI — causing
sqlite3.OperationalError.  Rewrite the test to use an in-memory
SQLite database with register_resource(), matching the pattern used
by the fs-directory Robot tests.

Also fix common.resource path to use ${CURDIR}/common.resource.

Refs: #524
Merge remote-tracking branch 'https-origin/master' into HEAD
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 14s
CI / build (pull_request) Successful in 14s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 36s
CI / unit_tests (pull_request) Successful in 4m8s
CI / docker (pull_request) Successful in 40s
CI / integration_tests (pull_request) Successful in 4m53s
CI / coverage (pull_request) Successful in 5m23s
CI / benchmark-regression (pull_request) Successful in 29m34s
775d72dcf4
# Conflicts:
#	CHANGELOG.md
#	behave.ini
brent.edwards dismissed hamza.khyari's review 2026-03-07 02:46:47 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

brent.edwards scheduled this pull request to auto merge when all checks succeed 2026-03-07 02:49:08 +00:00
Author
Member

CI failure fixed + merge conflict resolved

CI failure fix (commit 50228dfe)

The Robot test "Resource Add Git Checkout Should Not Fail With Type Not Found" was failing in CI (run #1174) with:

sqlite3.OperationalError: unable to open database file

Root cause: The test called agents resource add git-checkout via Run Process, which invokes the real CLI process. In CI, no database file exists, so the CLI hits OperationalError.

Fix: Rewrote the test to use an in-memory SQLite database with register_resource(), matching the pattern used by the fs-directory Robot tests (resource_type_bootstrap_fs.robot). Also fixed Resource common.resourceResource ${CURDIR}/common.resource.

Merge conflict resolved (commit 775d72dc)

Merged master (5e595260) into the branch. Two conflicts resolved:

  • CHANGELOG.md — our #553 entry placed first under ## Unreleased, followed by master's entries
  • behave.ini — kept master's version with the @wip documentation comments

Quality gates verified locally

Gate Result
nox -s lint All checks passed
nox -s typecheck 0 errors, 0 warnings
nox -s unit_tests 9,029 scenarios passed, 0 failed, 6 skipped

PR is now mergeable: true and ready to merge.

## CI failure fixed + merge conflict resolved ### CI failure fix (commit `50228dfe`) The Robot test **"Resource Add Git Checkout Should Not Fail With Type Not Found"** was failing in CI ([run #1174](https://git.cleverthis.com/cleveragents/cleveragents-core/actions/runs/1174/jobs/5/attempt/1)) with: ``` sqlite3.OperationalError: unable to open database file ``` **Root cause:** The test called `agents resource add git-checkout` via `Run Process`, which invokes the real CLI process. In CI, no database file exists, so the CLI hits `OperationalError`. **Fix:** Rewrote the test to use an in-memory SQLite database with `register_resource()`, matching the pattern used by the fs-directory Robot tests (`resource_type_bootstrap_fs.robot`). Also fixed `Resource common.resource` → `Resource ${CURDIR}/common.resource`. ### Merge conflict resolved (commit `775d72dc`) Merged master (`5e595260`) into the branch. Two conflicts resolved: - `CHANGELOG.md` — our `#553` entry placed first under `## Unreleased`, followed by master's entries - `behave.ini` — kept master's version with the `@wip` documentation comments ### Quality gates verified locally | Gate | Result | |------|--------| | `nox -s lint` | All checks passed | | `nox -s typecheck` | 0 errors, 0 warnings | | `nox -s unit_tests` | 9,029 scenarios passed, 0 failed, 6 skipped | PR is now `mergeable: true` and ready to merge.
brent.edwards deleted branch feature/m3-test-resource-bootstrap-git 2026-03-07 02:52:51 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core!568
No description provided.