test(cli): add regression-guard tests for Container.resolve() crash #1053
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Blocks
#648 TDD: Write failing test for #647 — Container.resolve() crash in plan tree/explain/correct
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core!1053
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "tdd/m3-container-resolve-crash"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This PR supersedes #670, which was raised from the older branch
tdd/container-resolve-crash. It incorporates the review fixes from #670for the #648 scope while using the corrected branch naming flow.
Description
Adds and hardens regression coverage for bug #647 across both Behave and Robot
frameworks using real DI/container wiring. Includes review-driven improvements
to assertion strength, cleanup robustness, and wording consistency for post-fix
regression-guard semantics.
Type of Change
Quality Checklist
Anyunless justified)nox -s typecheckpasses with no errorsnox -s lintpasses with no errorsfeatures/)robot/) if applicablenox -s coverage_report)nox -s security_scan)nox -s dead_code)Testing
Test Commands Run
Related Issues
Closes #648
Implementation Notes
tdd/m3-container-resolve-crash.#648-scope changes and excludes collateral stabilization files from the old PR #670 path.plan correcttest invocation retains explicit--planwith inline comment because current implementation requires active-plan context in helper subprocess flows.Reviewer Clarifications (Hamza)
@tdd_expected_fail). The commit text came from pre-supersede transition context; this PR description reflects the final post-fix semantics.#648."not a devcontainer-instance") is a test-alignment stabilization to canonical CLI output after rebase; it does not change product behavior.Code Review — PR #1053
Scope: 10 files, +711/−2 (single commit). Lint PASS, Typecheck PASS.
Verified Correct
container.decision_service())@tdd_expected_failcorrectly omitted — tests run as regression guardsMEMORY_ENGINES.pop(),reset_container(), env var removal,Settings.reset()cr647-step prefix prevents BehaveAmbiguousStepcollisionsSettings.reset()addition is consistent withreset_container()pattern and replaces brittle direct_instance = NonewritesFindings
Branch: tdd/container-resolve-crash. Actual branch istdd/m3-container-resolve-crash. PR body explains the supersede but Metadata is stale.settings_configuration.feature:172Givenused afterWhenin the action phase (Given the environment variable "CLEVERAGENTS_ENV" is set to "staging"). Semantically should beAnd.GiventoAndfor correct Gherkin structure.container_resolve_crash_steps.py:247-262Thensteps delegate to a shared function that branches oncontext.cr647_command. Functional but couples step identity toWhen-set state.devcontainer_cleanup.feature:69"rebuild requires devcontainer"→"not a devcontainer-instance". Unrelated to #648.settings_steps.py:6-9sys.path.insert(0, str(_SRC))— unnecessary since Behave configuressrc/on the path viaenvironment.py. Inconsistent with other step files.sys.pathinsertion.Round 2 Follow-Up — 2 new Minor findings
Deep second pass verified:
ContextSnapshotpassed as instance (not dict).ResourceRef.resource_idvalidates non-empty only (no ULID pattern), sostr(ULID())is fine.confidence_score0.95/0.90 within documented 0.0–1.0 range.MEMORY_ENGINEScaching hazards. Cleanup properly deletes DB + WAL/SHM via glob.run_cli()explicitly passesenv_extra, so subprocess env isolation is correct.rc == 2as a distinct failure mode.MEMORY_ENGINES(step cleanup +after_scenario) is safe — SQLAlchemydispose()is idempotent.New Findings
container_resolve_crash_steps.py:54before_scenario's per-scenario temp DB URL withsqlite:///:memory:. The temp file is never created;after_scenariotries to unlink it butsuppress(OSError)silently handles the miss. Functional but messy.@mock_onlyto skip per-scenario DB setup, or set the in-memory URL viacontextinstead ofos.environ.container_resolve_crash_steps.py:56-57reset_container()discards the mock AI provider override set bybefore_scenario. Real providers become active. Harmless here becausetree/explainonly read DB data, andcorrectuses--dry-run. But removing--dry-runwould hit real LLM endpoints.--dry-runis required to prevent LLM access since mock AI is bypassed by the real-container setup.Updated totals (Round 1 + Round 2):
@tdd_expected_failbut code omits it — message is staleGivenused afterWhenin settings scenarioThensteps delegate to shared function, coupling toWhen-set statebefore_scenarioDB URL, orphaning temp filereset_container()discards mock AI; test implicitly depends on--dry-runsys.pathinsertion in settings_steps.pycheck comments
Hi Hamza, thanks for the detailed review. Please find responses for the three process/scope items below.
SPEC-1 — Commit message mentions
@tdd_expected_failWhat we did:
@tdd_expected_failremovedSPEC-2 — Branch name divergence with issue metadata
Status: Resolved in process flow.
Justification / action taken:
tdd/container-resolve-crashtotdd/m3-container-resolve-crashand raised the replacement PR to align with naming standards.CODE-2 —
features/devcontainer_cleanup.featureout-of-scope assertion text changeJustification:
This was a test-stabilization compatibility fix, not a behavior change. After rebase, the CLI error output had standardized wording (
not a devcontainer-instance), while the old assertion expected legacy text (rebuild requires devcontainer).Keeping the old assertion caused a false negative unrelated to #648 logic, so the assertion string was aligned to canonical CLI output to keep test signal accurate.
TEST-1, TEST-2, ENV-1, ENV-2 have been fixed and code pushed.
Round 3 — Scope Audit + Bug Hunt
BUG-1 (Major): Explain assertion checks the wrong decision's content
Behave:
container_resolve_crash_steps.py:238-241Robot:
helper_container_resolve_crash.py:170-172plan explainis invoked with the child decision (chosen_option = "FastAPI"), but the assertion checks for the root decision's content:The child's
chosen_optionis"FastAPI", not"A REST API". The assertion passes by accident because the fallback'"chosen"'matches the JSON key"chosen_option"in the output. If the JSON format changes to not include the substring"chosen", both assertions silently break.Fix: Replace
"A REST API"with"FastAPI"in both the Behave step and Robot helper.SCOPE-1 (Major): 3 of 10 files are out of scope for issue #648
Per CONTRIBUTING.md Rule 1 (Atomicity) and Rule 3 (Single Responsibility):
features/container_resolve_crash.featurefeatures/steps/container_resolve_crash_steps.pyrobot/container_resolve_crash.robotrobot/helper_container_resolve_crash.pyCHANGELOG.mdCONTRIBUTORS.mdsrc/cleveragents/config/settings.pySettings.reset()— new production public API used by test cleanupfeatures/settings_configuration.featureSettings.reset()— not in #648 ACfeatures/steps/settings_steps.pysys.pathinsertion for abovefeatures/devcontainer_cleanup.featurePer CONTRIBUTING.md, this commit does "Add TDD tests for #647 and add
Settings.reset()method and add settings reset Behave test and fix devcontainer assertion text" — four concerns that should be three separate issues.Recommendation: Remove the three out-of-scope files from this commit. If
Settings.reset()and its test are needed, file a separate issue. The devcontainer assertion fix needs its own issue.Verified Correct (no bugs):
get_container()→providers.Factory(DecisionService)→providers.Factory(UnitOfWork)→providers.Callable(get_database_url)→os.environ→MEMORY_ENGINEScache hitenv_extracontext.add_cleanup()runs beforeafter_scenario; double-dispose ofMEMORY_ENGINESis idempotentcatch_exceptions=Truecorrectly captures exceptions for assertion in Then stepsUpdated totals (Round 1 + Round 2 + Round 3):
"A REST API"instead of child's"FastAPI"— passes by accident@tdd_expected_failbut code omits it — message is staleGivenused afterWhenin settings scenarioThensteps delegate to shared function, coupling toWhen-set statebefore_scenarioDB URL, orphaning temp filereset_container()discards mock AI; test implicitly depends on--dry-runsys.pathinsertion in settings_steps.pyCode Review — PR #1053
test(cli): TDD failing tests for Container.resolve() crashGood work using a real DI container instead of
MagicMock— this is the key insight, sinceMagicMockauto-creates attributes and would have hidden theresolve()crash. The three scenarios coveringplan tree,plan explain, andplan correctprovide thorough regression coverage.Approved with notes:
Production code in a test PR —
Settings.reset()classmethod added tosrc/cleveragents/config/settings.pyis production code. While small and well-scoped, a test-only PR ideally shouldn't modify production source. This is documented as "test-alignment stabilization" which is reasonable, but worth noting for future reference.Scope creep — The devcontainer assertion text change (
"rebuild requires devcontainer"→"not a devcontainer-instance") and the CONTRIBUTORS.md update are collateral changes. The PR body explains these as rebase-related stabilizations, which is acceptable.Issue #648 acceptance criteria — The issue AC specifies
@tdd_expected_failtags, but the PR correctly omits them since bug #647 appears to be already fixed. Consider updating the issue to document this decision so it's not flagged by future reviewers.Existing comments addressed — Good to see the PR went through prior review (5 comments). The supersession of PR #670 is clearly documented.
Planning Agent — PR #1053 Review Assessment
@hamza.khyari — Excellent review work across 3 rounds. Your thoroughness here is exactly what this project needs. Let me address the findings:
BUG-1 (Major): Agreed — the explain assertion checking
"A REST API"instead of"FastAPI"is a real bug in the test. It passes by accident via the'"chosen"'fallback, which is fragile. @aditya must fix this before merge. The assertion should check for"FastAPI"to match the child decision'schosen_option.SCOPE-1 (Major): I agree with the finding that 3 files are out of scope. Per CONTRIBUTING.md Section "Atomic Commits" — the
Settings.reset()addition, its test, and the devcontainer assertion fix are separate concerns. However, given that this PR has been in review for an extended period and theSettings.reset()is a small utility method used by test cleanup, I'm ruling as follows:features/devcontainer_cleanup.feature) must be removed from this PR — it is unrelated to #648.Settings.reset()method and its Behave test can remain in this PR since the test cleanup in the core deliverable depends on it. File a follow-up issue to documentSettings.reset()as a public API addition.sys.pathinsertion (CODE-3) must be removed — it is unnecessary per environment.py configuration.SPEC-1: @aditya has addressed this in the PR description. The commit message body should be amended to accurately reflect regression-guard semantics.
SPEC-2: Addressed — branch name updated.
TEST-1, ENV-1, ENV-2: @aditya reports these are fixed.
Action required for merge:
features/devcontainer_cleanup.featurechangesys.pathinsertion from settings_steps.pyOnce these are done, this PR should be mergeable. I've assigned @hamza.khyari and @brent.edwards as reviewers. Two approvals required per CONTRIBUTING.md.
33a4daf50303353f0e37New commits pushed, approval review dismissed automatically according to repository settings
PR #1053 — Review Response (Hamza Round 3 + Freemo Assessment)
Freemo Action Items — All Resolved
1. BUG-1 (Major) — Explain assertion checks wrong decision content
Finding:
plan explainis invoked with the child decision (chosen_option = "FastAPI"), but assertions checked for the root decision content ("A REST API"). The test passed accidentally via the"chosen"JSON key fallback.Resolution: Replaced
"A REST API"with"FastAPI"in both assertion sites:features/steps/container_resolve_crash_steps.py— Behave Then step now asserts"FastAPI" in result.outputrobot/helper_container_resolve_crash.py—_assert_explain_outputnow checks"FastAPI" not in outputThe fragile
'"chosen"'fallback has been removed entirely.2. SCOPE-1 / CODE-2 — Remove
features/devcontainer_cleanup.featureFinding: The devcontainer assertion text change is unrelated to #648 and violates CONTRIBUTING.md atomicity rules.
Resolution: Reverted
features/devcontainer_cleanup.featureto itsorigin/masterstate. The file no longer appears in the PR diff (9 files, down from 10).3. CODE-3 — Remove
sys.pathinsertion fromsettings_steps.pyFinding:
sys.path.insert(0, str(_SRC))insettings_steps.pyis unnecessary sincefeatures/environment.pyalready configuressrc/on the path.Resolution: The
sys.pathinsertion block (includingimport sys,_SRCvariable, and# noqa: E402) has been fully removed. The net PR diff forsettings_steps.pycontains nosys.pathreferences — only the@whendecorator addition andSettings.reset()step definitions.4. SPEC-1 — Commit message body reflects stale
@tdd_expected_failsemanticsFinding: Commit body said "Tests tagged
@tdd_expected_fail" but the code correctly omits the tag since bug #647 is fixed.Resolution: All three commits have been squashed into a single commit with an accurate message:
No mention of
@tdd_expected_fail. Message accurately describes regression-guard semantics.Previously Addressed Items (Rounds 1–3)
tdd/m3-container-resolve-crash; tracked in issue #648GivenafterWhenchanged toAndinsettings_configuration.featurecontext.cr647_commandcoupling; each Then step now owns its assertionsenvironment.pyinstead of overwriting--dry-rundependency to prevent live LLM accessFreemo Rulings Acknowledged
@tdd_expected_failomission decision.Final PR State
tdd/m3-container-resolve-crashPlanning review (Day 42):
This is a Priority/Critical, Must-have PR for v3.2.0 and has @freemo approval. Resolving the outstanding review would clear it for merge consideration.
Day 43 Planning Review — Status Assessment and Action Items
Summary
This TDD PR for bug #647 has gone through 3 thorough review rounds by @hamza.khyari, with all findings addressed by @aditya in a single squashed commit. The PR is mergeable (no conflicts) and represents critical-path work for v3.2.0.
Comment Response
@hamza.khyari (Rounds 1-3)
Your review work on this PR is exemplary — the BUG-1 finding (explain assertion checking wrong decision content, passing by accident) is exactly the kind of subtle correctness issue that prevents test suite degradation. All findings were valid and well-documented.
@aditya (Response)
Your responses are thorough and address all actionable items. The commit message amendment correctly reflects regression-guard semantics. The BUG-1 fix (replacing
"A REST API"with"FastAPI") and the removal of the fragile'"chosen"'fallback are correct.@freemo (Day 42)
The outstanding REQUEST_CHANGES from @hamza.khyari appears to have been addressed by @aditya's Round 3 response. The review needs to be formally resolved.
Current Blocking Items
Action Required
Closes #648is in the PR description (not just the commit footer).This is the #1 blocking TDD PR for v3.2.0. Bug #647 (Container.resolve crash) cannot be fully resolved until this TDD PR is merged. The stale REQUEST_CHANGES review has been blocking this PR for 5 days — this must be resolved today.
03353f0e37dca032cb6eCode Review — PR #1053
test(cli): add TDD failing tests for Container.resolve() crashFindings
1. Bug (Medium): Robot helper seeds decisions without creating a plan record
robot/helper_container_resolve_crash.py:78—_setup_decisions()createsplan_id = str(ULID())and seeds decisions viaDecisionService.record_decision(), but never creates an actual plan throughPlanLifecycleService. The Behave test (features/steps/container_resolve_crash_steps.py:69-77) correctly creates a real plan viacreate_action()+use_action().The Robot test relies on the CLI subprocess querying decisions by
plan_idwhere decision rows exist but no plan record does. If any CLI command adds a plan-existence check before listing decisions, the Robot tests will break for a different reason than the bug under test.Fix: Mirror the Behave pattern — create an action + plan via
PlanLifecycleServicebefore seeding decisions.2. Observation (Low):
Settings.reset()not used inenvironment.pyThe new
Settings.reset()classmethod (settings.py:633-640) doescls._instance = None, which is exactly whatfeatures/environment.py:631does directly. Consider updatingenvironment.py:631to callSettings.reset()for consistency — otherwise there are two paths for the same operation.3. Observation (Low):
get_container()call is a no-opfeatures/steps/container_resolve_crash_steps.py:117callsget_container()and the comment says "Get the real container so command paths use DI wiring" — but the returned container is never used. It's theCliRunner.invokecall that creates the container used by the CLI. The comment is misleading.4. Observation (Low):
catch_exceptions=Truemasks setup failuresAll three
cli_runner.invokecalls usecatch_exceptions=True. This is intentional for verifying the bug'sAttributeError, but also means any non-AttributeErrorsetup failure (import error, missing table) would be silently captured and fail theisinstanceassertion for the wrong reason.5. Observation (Low):
globimport vspathlibconventionrobot/helper_container_resolve_crash.pyimportsglob.globfor WAL file cleanup. The codebase convention (e.g.,features/environment.py:692-694) uses explicit suffix iteration orpathlib.Path.glob().Summary
Finding #1 should be fixed. The others are minor consistency notes.
dca032cb6ef517dd5a94New commits pushed, approval review dismissed automatically according to repository settings
Thanks for the detailed review and re-check.
Response to findings
1) Bug (Medium): Robot helper seeds decisions without creating a plan record
Agreed. This was a valid realism gap in the Robot helper setup.
Fixed by aligning the helper with the same lifecycle pattern used in Behave:
PlanLifecycleService.create_action(...)PlanLifecycleService.use_action(...)plan_idThis keeps the test scoped to bug #647 while avoiding fragility from synthetic plan IDs.
2) Observation (Low):
Settings.reset()not used inenvironment.pyAcknowledged. This is a valid consistency note.
For this PR, I kept scope strictly to the targeted review bug fix above and did not include collateral cleanup changes.
3) Observation (Low):
get_container()call is a no-opAcknowledged. The note is correct; this is a clarity/comment cleanup item rather than a functional issue.
Not changed in this PR to keep scope tight to the targeted fix.
4) Observation (Low):
catch_exceptions=Truemay mask setup failuresAcknowledged. Current assertions still fail non-
AttributeErrorexceptions, but this is a fair test-diagnostics caution.No behavior change made here in order to avoid expanding scope beyond the requested fix.
5) Observation (Low):
globimport vspathlibconventionAcknowledged as style/consistency feedback.
Left unchanged in this PR because it is non-blocking and outside the immediate bug-fix scope.
Summary
PR Review: !1053 (Ticket #648)
Verdict: ✅ Approve
All 6 acceptance criteria from ticket #648 are satisfied. All prior review findings from 4 rounds (Hamza, Freemo) have been verified as fixed in the current code. The tests correctly use real DI containers (not MagicMock) to catch the exact class of bug that existing M3 tests missed. The remaining findings are minor quality suggestions and nits — none affect correctness, completeness, or spec compliance.
Critical Issues
None.
Major Issues
None.
Minor Issues
1. Behave step leaks SQLAlchemy engine — not disposed in cleanup
features/steps/container_resolve_crash_steps.py:62-63, 129-143uow = UnitOfWork(database_url)and seeds decisions. Afterinit_database()disposes the initial engine, subsequentrecord_decision()calls recreate an engine via theengineproperty. Sinceenvironment.py'sbefore_scenariosets a file-based SQLite URL, this engine is not stored inMEMORY_ENGINESand the cleanup closure never disposes it — it relies on garbage collection. The Robot helper correctly capturesctx.uowand callsctx.uow.engine.dispose()in its cleanup.uowin the cleanup closure and dispose its engine explicitly, matching the Robot helper's pattern.2. Dead step registration —
@then("cr647- the command should execute successfully")never matchedfeatures/steps/container_resolve_crash_steps.py:218step_cr647_command_should_succeedis decorated with@then(...)registering it as a Behave step, but no scenario uses this step text. It's only called programmatically by the three specific Then steps. The decorator adds a dead entry to Behave's step registry.@thendecorator and rename to_assert_command_succeeded(context)to signal it's a helper, not a step.3.
plan correctassertions weaker thanplan tree/plan explainfeatures/steps/container_resolve_crash_steps.py:276-284,robot/helper_container_resolve_crash.py:179-185plan correctassertion only validatesdecision_idin the output. By contrast,plan explainvalidates bothdecision_idAND"FastAPI". There's no command-specific content validation forplan correct(e.g., dry-run output, revert mode, guidance text).plan correct-specific assertion (e.g., check for"revert","dry-run", or"Django"guidance text) to strengthen the regression guard.4.
get_container()call is a no-op with misleading commentfeatures/steps/container_resolve_crash_steps.py:125-126_ = get_container()discards the return value. The comment says "Get the real container so command paths use DI wiring," butCliRunner.invokecreates its own container. The side effect (warming the singleton cache) may be meaningful, but the comment misrepresents the purpose.# Pre-initialize the container singleton so CLI commands get DI wiring from the seeded database.5.
Settings.reset()has no test-only documentation guardsrc/cleveragents/config/settings.py:633-640Settings.reset()is public with no indication it's test-only. If production code accidentally calls it, it would wipe the singleton mid-flight. The risk is low since the codebase already has 50+ instances ofSettings._instance = Nonein tests, and this encapsulates the pattern. But a documentation guard would be beneficial... warning:: **Test use only.** Do not call in production code paths.to the docstring.6. Missing type annotations on 4 new settings step functions
features/steps/settings_steps.py:613, 623, 629, 637contextandexpectedparameters. Per CONTRIBUTING.md, new code should have explicit types. Thecontainer_resolve_crash_steps.pyfile in this same PR correctly usescontext: Contextand-> None.context: Context,expected: str, and-> Nonetype hints.7. CONTRIBUTORS.md new entry not in alphabetical order
CONTRIBUTORS.md:9Aditya Chhabrais appended at the end. The existing list follows alphabetical order (Brent, Hamza, Jeffrey, Luis, Rui). "Aditya" should appear first.8. PR title does not match commit message
test(cli): add TDD failing tests for Container.resolve() crashtest(cli): add regression-guard tests for Container.resolve() crashtest(cli): add regression-guard tests for Container.resolve() crash.9.
plan correcttest uses undocumented--planflag (spec drift)container_resolve_crash_steps.py:202,helper_container_resolve_crash.py:270--plantoplan correct, but the spec doesn't define this flag. The code correctly includes an inline comment acknowledging this. The test is correct for the current implementation but documents a spec-vs-implementation drift.--planto spec, or make the CLI infer the plan from decision ID).Nits
10.
glob.globinstead ofpathlibconventionrobot/helper_container_resolve_crash.py:15, 147for suffix in ("", "-journal", "-wal", "-shm"): Path(ctx.db_path + suffix).unlink(missing_ok=True).11. Robot test cases repeat identical boilerplate
robot/container_resolve_crash.robot:24-55Run And Verify Helperkeyword to deduplicate the 3 test cases. Acceptable as-is for 3 tests.12. Feature description awkward line break
features/container_resolve_crash.feature:7-8mock get_container()on the same line.13. Redundant
TYPE_CHECKINGimportfeatures/steps/container_resolve_crash_steps.py:23-24Decisionis imported underTYPE_CHECKINGat module level but is also imported inside the function at line 47.TYPE_CHECKINGguard since the in-function import provides the type within the relevant scope.Summary
This is a well-crafted TDD regression test PR. The core deliverables are solid: three Behave BDD scenarios and three Robot Framework integration tests that exercise
plan tree,plan explain, andplan correctthrough a real DI container with seeded decisions — correctly catching the class of bug that existing M3 tests missed with theirMagicMockapproach. All 6 acceptance criteria are met, all 4 rounds of prior review feedback have been addressed, and theSettings.reset()addition is a clean encapsulation of an existing test pattern.The 9 minor findings are quality improvements (resource cleanup, type annotations, naming, ordering) — not correctness issues. The 4 nits are stylistic. The PR is ready to merge after addressing the minor items (or acknowledging them as follow-up work).
test(cli): add TDD failing tests for Container.resolve() crashto test(cli): add regression-guard tests for Container.resolve() crashf517dd5a9436c36bc0eeNew commits pushed, approval review dismissed automatically according to repository settings
Response to @hurui200320 Review (Round 5)
Thanks for the thorough review and the approval. All 9 minor findings and 4 nits have been addressed in the amended commit. Here's the summary:
Fixed (Minor Issues)
#1 — Behave step leaks SQLAlchemy engine
Fixed. The cleanup closure now explicitly captures
uowand callsuow.engine.dispose()before theMEMORY_ENGINESpop, matching the Robot helper's pattern. (container_resolve_crash_steps.py:131)#2 — Dead
@thendecorator on shared assertion helperFixed. Removed the
@then("cr647- the command should execute successfully")decorator and renamed the function to_assert_command_succeeded()to signal it's a private helper, not a step. Updated all three callers.#3 —
plan correctassertions weaker than tree/explainFixed. Both Behave and Robot assertions now validate command-specific content:
"revert"or"dry"must appear in the output (case-insensitive), strengthening the regression guard beyond just decision-id presence. (container_resolve_crash_steps.py:286-289,helper_container_resolve_crash.py:187-192)#4 —
get_container()call is a no-op with misleading commentFixed. Updated the comment to:
# Pre-initialize the container singleton so CLI commands get DI wiring from the seeded database.This accurately describes the singleton-warming side effect.#5 —
Settings.reset()has no test-only documentation guardFixed. Added
.. warning:: **Test use only.** Do not call in production code paths — resetting the singleton mid-flight can cause inconsistent configuration state.to the docstring. (settings.py:640-642)#6 — Missing type annotations on 4 new settings step functions
Fixed. Added
context: Context,expected: str, and-> Nonetype hints to all four new step functions insettings_steps.py. Also addedfrom __future__ import annotationsandfrom behave.runner import Contextimports.#7 — CONTRIBUTORS.md not in alphabetical order
Fixed. Moved
Aditya Chhabrato its alphabetically correct position (first) and removed a duplicateRui Huentry.#9 —
plan correctuses undocumented--planflagAcknowledged. Both the Behave step and Robot helper already had inline comments documenting this spec drift. Updated the Robot helper comment to be more explicit about the spec-vs-implementation gap. Will track separately.
Fixed (Nits)
#10 —
glob.globinstead ofpathlibconventionFixed. Replaced
from glob import glob+glob(f"{ctx.db_path}*")with explicit suffix iteration:for suffix in ("", "-journal", "-wal", "-shm"): Path(ctx.db_path + suffix).unlink(missing_ok=True). (helper_container_resolve_crash.py:147-149)#11 — Robot test cases repeat boilerplate
Acknowledged as acceptable for 3 test cases per your own recommendation. No change.
#12 — Feature description awkward line break
Fixed. Reflowed
mock get_container()to keep it on the same line. (container_resolve_crash.feature:7-8)#13 — Redundant
TYPE_CHECKINGimport forDecisionFixed. Removed the module-level
TYPE_CHECKINGguard andDecisionimport. AddedDecisionto the in-function import fromcleveragents.domain.models.core.decisionwhere it's actually used.Quality Gates
nox -e lint: PASSnox -e typecheck: PASS (0 errors)nox -e unit_tests: All PR-related features pass (3 pre-existing failures in unrelated checkpoint/git_tools features)nox -e integration_tests: Container Resolve Crash robot test passes (17 pre-existing failures in unrelated features)nox -e coverage_report: 98.22% (above 97% threshold)PR title also updated to match commit message.
Closing the pr, all the issues resolved successfully, the pr has been approved by both @hamza.khyari and @hurui200320 .
hurui200320 referenced this pull request2026-03-24 12:16:49 +00:00