test(e2e): TDD behavioral test proving ACMS indexing pipeline is not wired into CLI (bug #1028) #1124

2026-03-23T05:06:26Z

hurui200320 commented

2026-03-23 05:06:26 +00:00

Summary

This PR adds a Robot Framework E2E test suite (robot/e2e/tdd_acms_behavioral_validation.robot) that proves bug #1028 exists — the ACMS indexing pipeline is not wired into the CLI, so ContextTierService starts empty on every invocation.

Changes

New file: robot/e2e/tdd_acms_behavioral_validation.robot — 4 E2E test cases tagged tdd_expected_fail, tdd_bug, tdd_bug_1028, E2E
Modified: robot/e2e/common_e2e.resource — Extracted shared keywords (Run CLI, Extract JSON From Stdout, Link Resource To Project, Create Synthetic Codebase) from both m5_acceptance.robot and tdd_acms_behavioral_validation.robot to eliminate ~97 lines of duplication. Create Synthetic Codebase is parameterized with project_label.
Modified: robot/e2e/m5_acceptance.robot — Removed duplicated keywords now provided by common_e2e.resource.
CHANGELOG.md: Added entry under ## Unreleased documenting the TDD tests for #1029.

Test Cases

Test 1: Context Simulate Returns Non-Empty Tier Data — asserts fragment_count > 0 (fails, proving bug)
Test 2: Context Inspect Shows Indexed Resources — asserts tier metrics total > 0 (fails, proving bug)
Test 3: Budget Enforcement Excludes Oversized Files — asserts fragment_count > 0 with max_file_size policy (fails, proving bug). Includes TODO comment for post-fix exclusion assertion.
Test 4: Large Project Indexes Without Timeout — generates 10K+ files, asserts fragment_count > 0 (fails, proving bug). Includes explicit developer responsibility note about git-tracking of generated files.

All tests pass CI through result inversion by the tdd_expected_fail_listener.py — failing assertions (bug confirmed) are inverted to PASS.

Documentation & Robustness Improvements

Suite-level documentation expanded with known limitation section explaining tdd_expected_fail result inversion scope.
Suite setup error messages include (rc=${var.rc}). Check DEBUG logs above. for debugging consistency with m5_acceptance.robot.
Redundant exit code assertions after Run CLI calls removed (Run CLI already validates rc internally).
Run CLI keyword documentation includes API key security notes.
Budget enforcement test (Test 3) includes TODO(bugfix/...) comment for the bug-fix developer.
Large project test (Test 4) includes explicit NOTE assigning responsibility to the bug-fix developer to evaluate filesystem vs. git-tracked content indexing.

Review Fix Round

Addressed all findings from Luis's review (review #2691):

C1 (CRITICAL): Restored the #845 CorrectionService changelog entry (50 lines) accidentally deleted during merge conflict resolution.
L1 (LOW): Extracted ~97 lines of duplicated keywords into common_e2e.resource (parameterized Create Synthetic Codebase, shared Run CLI, Extract JSON From Stdout, Link Resource To Project).
M1 (MEDIUM): Strengthened NOTE comment about 10K files not being git-committed — explicit developer MUST responsibility.
L3 (LOW): Made suite setup error messages verbose with (rc=...). Check DEBUG logs above.
L2 (LOW): Removed redundant exit code assertions after Run CLI calls.
M2 (MEDIUM): Acknowledged — partial assertion is acceptable for TDD capture phase (TODO documents the gap).
I1 (INFO): Acknowledged — tdd_expected_fail masking is documented and mitigated.

Motivation

Per the Bug Fix Workflow in CONTRIBUTING.md, this TDD issue (#1029) is the prerequisite for bug fix #1028. The tests capture the buggy behavior so that when the fix is implemented, removing the tdd_expected_fail tag will cause the tests to pass normally.

Quality Gates

Gate	Result
lint	PASS
typecheck	PASS
unit_tests	PASS
integration_tests	PASS
e2e_tests	PASS (41 tests, 4 TDD)
coverage_report	PASS (>=97%)

Closes #1029

## Summary This PR adds a Robot Framework E2E test suite (`robot/e2e/tdd_acms_behavioral_validation.robot`) that proves bug #1028 exists — the ACMS indexing pipeline is not wired into the CLI, so `ContextTierService` starts empty on every invocation. ### Changes - **New file**: `robot/e2e/tdd_acms_behavioral_validation.robot` — 4 E2E test cases tagged `tdd_expected_fail`, `tdd_bug`, `tdd_bug_1028`, `E2E` - **Modified**: `robot/e2e/common_e2e.resource` — Extracted shared keywords (`Run CLI`, `Extract JSON From Stdout`, `Link Resource To Project`, `Create Synthetic Codebase`) from both `m5_acceptance.robot` and `tdd_acms_behavioral_validation.robot` to eliminate ~97 lines of duplication. `Create Synthetic Codebase` is parameterized with `project_label`. - **Modified**: `robot/e2e/m5_acceptance.robot` — Removed duplicated keywords now provided by `common_e2e.resource`. - **CHANGELOG.md**: Added entry under `## Unreleased` documenting the TDD tests for #1029. ### Test Cases - **Test 1**: Context Simulate Returns Non-Empty Tier Data — asserts `fragment_count > 0` (fails, proving bug) - **Test 2**: Context Inspect Shows Indexed Resources — asserts tier metrics total > 0 (fails, proving bug) - **Test 3**: Budget Enforcement Excludes Oversized Files — asserts `fragment_count > 0` with `max_file_size` policy (fails, proving bug). Includes TODO comment for post-fix exclusion assertion. - **Test 4**: Large Project Indexes Without Timeout — generates 10K+ files, asserts `fragment_count > 0` (fails, proving bug). Includes explicit developer responsibility note about git-tracking of generated files. All tests pass CI through result inversion by the `tdd_expected_fail_listener.py` — failing assertions (bug confirmed) are inverted to PASS. ### Documentation & Robustness Improvements - Suite-level documentation expanded with **known limitation** section explaining `tdd_expected_fail` result inversion scope. - Suite setup error messages include `(rc=${var.rc}). Check DEBUG logs above.` for debugging consistency with `m5_acceptance.robot`. - Redundant exit code assertions after `Run CLI` calls removed (Run CLI already validates rc internally). - `Run CLI` keyword documentation includes API key security notes. - Budget enforcement test (Test 3) includes `TODO(bugfix/...)` comment for the bug-fix developer. - Large project test (Test 4) includes explicit NOTE assigning responsibility to the bug-fix developer to evaluate filesystem vs. git-tracked content indexing. ### Review Fix Round Addressed all findings from Luis's review (review #2691): - **C1 (CRITICAL)**: Restored the #845 `CorrectionService` changelog entry (50 lines) accidentally deleted during merge conflict resolution. - **L1 (LOW)**: Extracted ~97 lines of duplicated keywords into `common_e2e.resource` (parameterized `Create Synthetic Codebase`, shared `Run CLI`, `Extract JSON From Stdout`, `Link Resource To Project`). - **M1 (MEDIUM)**: Strengthened NOTE comment about 10K files not being git-committed — explicit developer MUST responsibility. - **L3 (LOW)**: Made suite setup error messages verbose with `(rc=...). Check DEBUG logs above.` - **L2 (LOW)**: Removed redundant exit code assertions after `Run CLI` calls. - **M2 (MEDIUM)**: Acknowledged — partial assertion is acceptable for TDD capture phase (TODO documents the gap). - **I1 (INFO)**: Acknowledged — `tdd_expected_fail` masking is documented and mitigated. ### Motivation Per the Bug Fix Workflow in CONTRIBUTING.md, this TDD issue (#1029) is the prerequisite for bug fix #1028. The tests capture the buggy behavior so that when the fix is implemented, removing the `tdd_expected_fail` tag will cause the tests to pass normally. ### Quality Gates | Gate | Result | |------|--------| | lint | PASS | | typecheck | PASS | | unit_tests | PASS | | integration_tests | PASS | | e2e_tests | PASS (41 tests, 4 TDD) | | coverage_report | PASS (>=97%) | Closes #1029

hurui200320 added this to the v3.4.0 milestone 2026-03-23 05:06:32 +00:00

hurui200320 added the

Type

Testing

label 2026-03-23 05:06:34 +00:00

hurui200320 force-pushed tdd/m5-acms-cli-indexing-pipeline-wiring from 39bfe4c3a1 to a2f91cdb27

2026-03-23 06:18:39 +00:00

Compare

hurui200320 force-pushed tdd/m5-acms-cli-indexing-pipeline-wiring from a2f91cdb27 to 640e90a2a5

2026-03-24 05:41:37 +00:00

Compare

CoreRasurae requested review from CoreRasurae 2026-03-24 10:03:25 +00:00

CoreRasurae requested changes 2026-03-24 10:16:07 +00:00

CoreRasurae left a comment

Code Review Report — PR #1124 (TDD Bug #1028 / Issue #1029)

Branch: tdd/m5-acms-cli-indexing-pipeline-wiring
Commit: 640e90a by Rui Hu
Review scope: Branch diff vs origin/master (2 files changed: CHANGELOG.md, robot/e2e/tdd_acms_behavioral_validation.robot)
Methodology: 3 iterative full-spectrum review cycles covering bug detection, test flaws, test coverage, performance, security, spec compliance, and code quality. Findings stabilised after cycle 2.

Positive Observations

All 4 test cases are correctly aligned with the issue #1029 specification (test case names, tag conventions, and behavioral assertions match exactly).
Tag convention (tdd_expected_fail, tdd_bug, tdd_bug_1028, E2E) follows CONTRIBUTING.md > TDD Bug Test Tags rules and will pass the listener's _validate_tdd_tags() checks.
The tdd_expected_fail_listener.py IS registered in the e2e_tests nox session (line 723 of noxfile.py), so result inversion will work correctly.
CLI commands used (project context simulate, project context inspect, project context set with --include-path, --max-file-size, --max-total-size) all match the specification command synopsis (lines 237–271 of docs/specification.md).
The known limitation of tdd_expected_fail masking setup failures is explicitly documented in the suite header and mitigated by m5_acceptance.robot.
Security: NO_COLOR=1 is set, stdout/stderr are not embedded in assertion messages, and API key logging risk is mitigated by DEBUG-level logging.
Commit message matches the issue metadata exactly, branch name matches, and ISSUES CLOSED: #1029 is correct per the TDD workflow.

Findings by Severity

CRITICAL — Bug Detection

C1. CHANGELOG regression: #845 CorrectionService entry accidentally deleted

File: CHANGELOG.md, lines 72–121 of merge-base content
Description: The branch removes 50 lines from the ## Unreleased section — the entire #845 CorrectionService subtree isolation entry. This entry documents numerous fixes (BFS cycle detection, dry-run enforcement, terminal-state guards, mode validation, status restoration, etc.) from PR #845 merged on 2026-03-19.
Evidence: git diff origin/master...HEAD -- CHANGELOG.md shows the merge-base CHANGELOG has 946 lines; the branch HEAD has 900 lines (946 − 50 deleted + 4 added = 900).
Root cause: Likely a merge conflict resolution error when adding the 4-line #1029 entry at the top of the Unreleased section. The #845 entry, which was adjacent to the conflict region, was dropped during resolution.
Impact: If merged, the #845 changelog documentation will be permanently lost from release notes.
Action required: Restore the #845 entry from origin/master's CHANGELOG.md before merging.

MEDIUM — Test Flaws

M1. Large Project test: 10K generated files are not committed to the git repo

File: robot/e2e/tdd_acms_behavioral_validation.robot, lines 317–338
Description: The "Large Project Indexes Without Timeout" test generates 10,000 .py files on the filesystem but does not git add + git commit them. The workspace resource is registered as git-checkout type (line 83). If the bug #1028 fix routes indexing through the git sandbox (which only exposes git-tracked content), these files will not be visible and the test will fail for the wrong reason (files not tracked vs. pipeline not wired).
Mitigating factors:
- The current walk_and_index in repo_indexing_utils.py uses os.walk() (filesystem), not git ls-files, so this may not manifest.
- Lines 331–338 document this risk with commented-out git add/commit commands ready to uncomment.
- m5_acceptance.robot has the same pattern (neither suite commits generated files).
Recommendation: Consider uncommitting the git add/commit lines now (or adding a comment making it clear this is a deliberate risk acceptance for the TDD phase, and the bug-fix developer must evaluate it). Current phrasing ("If the ACMS indexing pipeline reads only git-tracked content...") is ambiguous about responsibility.

M2. Budget enforcement test is a partial assertion

File: tdd_acms_behavioral_validation.robot, lines 260–301
Description: The "Budget Enforcement Excludes Oversized Files" test only asserts fragment_count > 0. It does not verify that large_file.py (>1 KiB) is excluded from the fragment list while smaller files are included. The test title claims it tests exclusion, but the assertion only tests inclusion.
Mitigating factor: The TODO comment on lines 295–298 documents this gap and defers the second assertion to the bug-fix branch.
Recommendation: Acceptable for TDD capture phase. Ensure the bug-fix developer is aware this test needs a second assertion for the large_file.py exclusion check.

LOW — Code Quality

L1. ~97 lines of keyword/setup code duplicated from m5_acceptance.robot

File: tdd_acms_behavioral_validation.robot, keyword section (lines 46–167)
Description: Four keywords are character-for-character identical to m5_acceptance.robot: Run CLI (18 lines), Extract JSON From Stdout (15 lines), Link Resource To Project (6 lines), Suite Teardown (3 lines). Create Synthetic Codebase and Suite Setup differ only in trivial string literals (~55 lines). Total: ~97 duplicated lines out of ~105 keyword-definition lines.
Recommendation: Extract the common keywords into common_e2e.resource (which already exists and is imported by both files). The keywords can be parameterized with suite-specific names (resource name, workspace label) passed as arguments. This would reduce maintenance burden when CLI interfaces change.

L2. Redundant exit code assertions

File: tdd_acms_behavioral_validation.robot, lines 201, 238, 275, 341
Description: Should Be Equal As Integers ${result.rc} 0 is called immediately after Run CLI which already validates the exit code internally via Should Be Equal As Integers ${result.rc} ${expected_rc} (default 0). These assertions are always true if Run CLI didn't fail.
Impact: No functional issue; adds minor visual clutter.

L3. Suite setup error messages less verbose than m5_acceptance.robot

File: tdd_acms_behavioral_validation.robot, lines 66, 71, 73
Description: Error messages use msg=git init failed vs. m5_acceptance.robot's msg=git init failed (rc=${git_init.rc}). Check DEBUG logs above. The less-verbose messages omit the actual return code, making debugging harder if git commands fail in CI.
Recommendation: Add (rc=${variable.rc}) and "Check DEBUG logs" to match the m5_acceptance.robot pattern for consistency.

INFORMATIONAL — Known Limitations (documented, no action required)

I1. tdd_expected_fail listener masks unrelated failures

If suite setup fails or a CLI infrastructure error occurs, individual test failures are inverted to PASS by the listener, producing false-positive "bug confirmed" results.
Explicitly documented in suite header (lines 20–31) and test case comments (lines 170–182).
Mitigated by m5_acceptance.robot running the same CLI plumbing without tdd_expected_fail inversion.

Verdict

Request changes — the CHANGELOG regression (C1) must be fixed before merge. The #845 entry needs to be restored from origin/master. All other findings are medium/low severity and can be addressed at the reviewer's discretion.

Review performed using 3 iterative full-spectrum cycles (bug detection, test flaws, test coverage, performance, security, spec compliance, code quality). Findings stabilised after cycle 2 with no new issues found in cycle 3.

## Code Review Report — PR #1124 (TDD Bug #1028 / Issue #1029) **Branch:** `tdd/m5-acms-cli-indexing-pipeline-wiring` **Commit:** `640e90a` by Rui Hu **Review scope:** Branch diff vs `origin/master` (2 files changed: `CHANGELOG.md`, `robot/e2e/tdd_acms_behavioral_validation.robot`) **Methodology:** 3 iterative full-spectrum review cycles covering bug detection, test flaws, test coverage, performance, security, spec compliance, and code quality. Findings stabilised after cycle 2. --- ### Positive Observations - All 4 test cases are correctly aligned with the issue #1029 specification (test case names, tag conventions, and behavioral assertions match exactly). - Tag convention (`tdd_expected_fail`, `tdd_bug`, `tdd_bug_1028`, `E2E`) follows `CONTRIBUTING.md > TDD Bug Test Tags` rules and will pass the listener's `_validate_tdd_tags()` checks. - The `tdd_expected_fail_listener.py` IS registered in the `e2e_tests` nox session (line 723 of `noxfile.py`), so result inversion will work correctly. - CLI commands used (`project context simulate`, `project context inspect`, `project context set` with `--include-path`, `--max-file-size`, `--max-total-size`) all match the specification command synopsis (lines 237–271 of `docs/specification.md`). - The known limitation of `tdd_expected_fail` masking setup failures is explicitly documented in the suite header and mitigated by `m5_acceptance.robot`. - Security: `NO_COLOR=1` is set, stdout/stderr are not embedded in assertion messages, and API key logging risk is mitigated by DEBUG-level logging. - Commit message matches the issue metadata exactly, branch name matches, and `ISSUES CLOSED: #1029` is correct per the TDD workflow. --- ### Findings by Severity --- #### CRITICAL — Bug Detection **C1. CHANGELOG regression: #845 CorrectionService entry accidentally deleted** - **File:** `CHANGELOG.md`, lines 72–121 of merge-base content - **Description:** The branch removes **50 lines** from the `## Unreleased` section — the entire `#845` `CorrectionService` subtree isolation entry. This entry documents numerous fixes (BFS cycle detection, dry-run enforcement, terminal-state guards, mode validation, status restoration, etc.) from PR #845 merged on 2026-03-19. - **Evidence:** `git diff origin/master...HEAD -- CHANGELOG.md` shows the merge-base CHANGELOG has 946 lines; the branch HEAD has 900 lines (946 − 50 deleted + 4 added = 900). - **Root cause:** Likely a merge conflict resolution error when adding the 4-line `#1029` entry at the top of the Unreleased section. The `#845` entry, which was adjacent to the conflict region, was dropped during resolution. - **Impact:** If merged, the `#845` changelog documentation will be **permanently lost** from release notes. - **Action required:** Restore the `#845` entry from `origin/master`'s `CHANGELOG.md` before merging. --- #### MEDIUM — Test Flaws **M1. Large Project test: 10K generated files are not committed to the git repo** - **File:** `robot/e2e/tdd_acms_behavioral_validation.robot`, lines 317–338 - **Description:** The "Large Project Indexes Without Timeout" test generates 10,000 `.py` files on the filesystem but does **not** `git add` + `git commit` them. The workspace resource is registered as `git-checkout` type (line 83). If the bug #1028 fix routes indexing through the git sandbox (which only exposes git-tracked content), these files will not be visible and the test will fail for the **wrong reason** (files not tracked vs. pipeline not wired). - **Mitigating factors:** - The current `walk_and_index` in `repo_indexing_utils.py` uses `os.walk()` (filesystem), not `git ls-files`, so this may not manifest. - Lines 331–338 document this risk with commented-out `git add/commit` commands ready to uncomment. - `m5_acceptance.robot` has the same pattern (neither suite commits generated files). - **Recommendation:** Consider uncommitting the `git add/commit` lines now (or adding a comment making it clear this is a **deliberate** risk acceptance for the TDD phase, and the bug-fix developer **must** evaluate it). Current phrasing ("If the ACMS indexing pipeline reads only git-tracked content...") is ambiguous about responsibility. **M2. Budget enforcement test is a partial assertion** - **File:** `tdd_acms_behavioral_validation.robot`, lines 260–301 - **Description:** The "Budget Enforcement Excludes Oversized Files" test only asserts `fragment_count > 0`. It does **not** verify that `large_file.py` (>1 KiB) is excluded from the fragment list while smaller files are included. The test title claims it tests exclusion, but the assertion only tests inclusion. - **Mitigating factor:** The TODO comment on lines 295–298 documents this gap and defers the second assertion to the bug-fix branch. - **Recommendation:** Acceptable for TDD capture phase. Ensure the bug-fix developer is aware this test needs a second assertion for the `large_file.py` exclusion check. --- #### LOW — Code Quality **L1. ~97 lines of keyword/setup code duplicated from `m5_acceptance.robot`** - **File:** `tdd_acms_behavioral_validation.robot`, keyword section (lines 46–167) - **Description:** Four keywords are **character-for-character identical** to `m5_acceptance.robot`: `Run CLI` (18 lines), `Extract JSON From Stdout` (15 lines), `Link Resource To Project` (6 lines), `Suite Teardown` (3 lines). `Create Synthetic Codebase` and `Suite Setup` differ only in trivial string literals (~55 lines). Total: ~97 duplicated lines out of ~105 keyword-definition lines. - **Recommendation:** Extract the common keywords into `common_e2e.resource` (which already exists and is imported by both files). The keywords can be parameterized with suite-specific names (resource name, workspace label) passed as arguments. This would reduce maintenance burden when CLI interfaces change. **L2. Redundant exit code assertions** - **File:** `tdd_acms_behavioral_validation.robot`, lines 201, 238, 275, 341 - **Description:** `Should Be Equal As Integers ${result.rc} 0` is called immediately after `Run CLI` which already validates the exit code internally via `Should Be Equal As Integers ${result.rc} ${expected_rc}` (default 0). These assertions are always true if `Run CLI` didn't fail. - **Impact:** No functional issue; adds minor visual clutter. **L3. Suite setup error messages less verbose than `m5_acceptance.robot`** - **File:** `tdd_acms_behavioral_validation.robot`, lines 66, 71, 73 - **Description:** Error messages use `msg=git init failed` vs. `m5_acceptance.robot`'s `msg=git init failed (rc=${git_init.rc}). Check DEBUG logs above.` The less-verbose messages omit the actual return code, making debugging harder if git commands fail in CI. - **Recommendation:** Add `(rc=${variable.rc})` and "Check DEBUG logs" to match the `m5_acceptance.robot` pattern for consistency. --- #### INFORMATIONAL — Known Limitations (documented, no action required) **I1. `tdd_expected_fail` listener masks unrelated failures** - If suite setup fails or a CLI infrastructure error occurs, individual test failures are inverted to PASS by the listener, producing false-positive "bug confirmed" results. - Explicitly documented in suite header (lines 20–31) and test case comments (lines 170–182). - Mitigated by `m5_acceptance.robot` running the same CLI plumbing without `tdd_expected_fail` inversion. --- ### Verdict **Request changes** — the CHANGELOG regression (C1) **must** be fixed before merge. The #845 entry needs to be restored from `origin/master`. All other findings are medium/low severity and can be addressed at the reviewer's discretion. --- *Review performed using 3 iterative full-spectrum cycles (bug detection, test flaws, test coverage, performance, security, spec compliance, code quality). Findings stabilised after cycle 2 with no new issues found in cycle 3.*

CHANGELOG.md Outdated

						
				@@ -3,2 +3,4 @@

				## Unreleased

				- Added TDD bug-capture E2E tests for bug #1028 — ACMS indexing pipeline not

				  wired into CLI.  Four Robot Framework E2E tests prove ContextTierService starts

CoreRasurae commented

2026-03-24 10:16:07 +00:00

CRITICAL (C1): This diff removes the entire #845 CorrectionService changelog entry (50 lines). The merge-base (9e316b1) contains this entry at lines 72-121, but it is absent from the branch HEAD. This is likely a merge conflict resolution error — when adding the 4-line #1029 entry, the adjacent #845 block was accidentally dropped.

Action required: Restore the #845 entry from origin/master before merging. You can extract it with:

git show origin/master:CHANGELOG.md

and re-insert the block between the (#331) entry and the deferred physical resource types entry.

**CRITICAL (C1):** This diff removes the entire #845 CorrectionService changelog entry (50 lines). The merge-base (`9e316b1`) contains this entry at lines 72-121, but it is absent from the branch HEAD. This is likely a merge conflict resolution error — when adding the 4-line #1029 entry, the adjacent #845 block was accidentally dropped. **Action required:** Restore the #845 entry from `origin/master` before merging. You can extract it with: ``` git show origin/master:CHANGELOG.md ``` and re-insert the block between the `(#331)` entry and the deferred physical resource types entry.

robot/e2e/tdd_acms_behavioral_validation.robot Outdated

						
				@@ -0,0 +63,4 @@

				    Create Synthetic Codebase    ${ws}

				    # Initialize a git repository

				    ${git_init}=    Run Process    git    init    cwd=${ws}    timeout=60s    on_timeout=kill

				    Should Be Equal As Integers    ${git_init.rc}    0    msg=git init failed

CoreRasurae commented

2026-03-24 10:16:07 +00:00

LOW (L3): Error message is msg=git init failed — less verbose than m5_acceptance.robot's equivalent msg=git init failed (rc=${git_init.rc}). Check DEBUG logs above. Consider adding the return code and log hint for debugging consistency.

**LOW (L3):** Error message is `msg=git init failed` — less verbose than `m5_acceptance.robot`'s equivalent `msg=git init failed (rc=${git_init.rc}). Check DEBUG logs above.` Consider adding the return code and log hint for debugging consistency.

robot/e2e/tdd_acms_behavioral_validation.robot Outdated

						
				@@ -0,0 +116,4 @@

				    ${large_content}=    Evaluate    "# auto-generated large file\\n" + ("x = 1\\n" * 250)

				    Create File    ${base_dir}${/}large_file.py    ${large_content}

				Run CLI

CoreRasurae commented

2026-03-24 10:16:07 +00:00

LOW (L1): This Run CLI keyword is character-for-character identical to the one in m5_acceptance.robot (lines 109-126). Along with Extract JSON From Stdout, Link Resource To Project, Create Synthetic Codebase, and the Suite Setup/Teardown, approximately 97 lines of keyword code are duplicated between the two files. Consider extracting the common keywords into common_e2e.resource (already imported by both files) with parameterized suite-specific names.

**LOW (L1):** This `Run CLI` keyword is character-for-character identical to the one in `m5_acceptance.robot` (lines 109-126). Along with `Extract JSON From Stdout`, `Link Resource To Project`, `Create Synthetic Codebase`, and the Suite Setup/Teardown, approximately **97 lines** of keyword code are duplicated between the two files. Consider extracting the common keywords into `common_e2e.resource` (already imported by both files) with parameterized suite-specific names.

robot/e2e/tdd_acms_behavioral_validation.robot Outdated

						
				@@ -0,0 +257,4 @@

				    Should Be True    ${total_indexed} > 0

				    ...    msg=Bug #1028: total indexed fragments is ${total_indexed} (expected > 0). tier_metrics: hot=${hot_count}, warm=${warm_count}, cold=${cold_count}. ACMS indexing pipeline is not wired into CLI.

				Budget Enforcement Excludes Oversized Files

CoreRasurae commented

2026-03-24 10:16:07 +00:00

MEDIUM (M2): This test title says "Excludes Oversized Files" but the assertion only checks fragment_count > 0 (that some files are indexed). It does not verify that large_file.py is actually excluded. The TODO on lines 295-298 documents this gap — ensure the bug-fix developer adds the exclusion assertion when wiring the pipeline.

**MEDIUM (M2):** This test title says "Excludes Oversized Files" but the assertion only checks `fragment_count > 0` (that *some* files are indexed). It does not verify that `large_file.py` is actually *excluded*. The TODO on lines 295-298 documents this gap — ensure the bug-fix developer adds the exclusion assertion when wiring the pipeline.

robot/e2e/tdd_acms_behavioral_validation.robot Outdated

						
				@@ -0,0 +328,4 @@

				    ${gen}=    Run Process    ${PYTHON}    ${gen_script}    ${scale_dir}

				    ...    timeout=120s    on_timeout=kill

				    Should Be Equal As Integers    ${gen.rc}    0    msg=10K file generation failed

				    # NOTE: The 10K generated files are written to the filesystem but NOT

CoreRasurae commented

2026-03-24 10:16:07 +00:00

MEDIUM (M1): The 10K generated files are written to the filesystem but NOT committed to the workspace git repo. The resource is registered as git-checkout type (line 83). If the bug-fix routes indexing through the git sandbox, these untracked files won't be visible and the test will fail for the wrong reason.

The NOTE comment here is good and documents the risk, but consider making the responsibility assignment more explicit — e.g.: "The bug-fix developer MUST evaluate whether the fix uses filesystem or git-tracked content and uncomment these lines if needed."

**MEDIUM (M1):** The 10K generated files are written to the filesystem but NOT committed to the workspace git repo. The resource is registered as `git-checkout` type (line 83). If the bug-fix routes indexing through the git sandbox, these untracked files won't be visible and the test will fail for the wrong reason. The NOTE comment here is good and documents the risk, but consider making the responsibility assignment more explicit — e.g.: "The bug-fix developer MUST evaluate whether the fix uses filesystem or git-tracked content and uncomment these lines if needed."

hurui200320 force-pushed tdd/m5-acms-cli-indexing-pipeline-wiring from 640e90a2a5 to 5602c9a87c

2026-03-24 11:12:30 +00:00

Compare

hurui200320 referenced this pull request

2026-03-24 11:12:59 +00:00

TDD: ACMS indexing pipeline not wired into CLI — ContextTierService starts empty (bug #1028) #1029

hurui200320 commented

2026-03-24 11:13:40 +00:00

Response to Review #2691 (Luis)

Thanks for the thorough review! All findings addressed in commit 5602c9a8. Summary:

C1 (CRITICAL) — CHANGELOG #845 entry deleted ✅ Fixed

Reset CHANGELOG.md to origin/master and re-added the #1029 entry. The #845 CorrectionService entry (50 lines) is now fully restored.

M1 (MEDIUM) — 10K files not committed to git ✅ Fixed

Replaced the ambiguous NOTE with an explicit responsibility assignment: "The bug-fix developer MUST evaluate whether the fix routes indexing through the git sandbox (which only exposes git-tracked content) or the filesystem."

M2 (MEDIUM) — Budget enforcement partial assertion ℹ️ Acknowledged

Agreed — the TODO comment already documents the gap, and the assertion is appropriate for the TDD capture phase. The bug-fix developer will add the large_file.py exclusion assertion when wiring the pipeline.

L1 (LOW) — ~97 lines duplicated from m5_acceptance.robot ✅ Fixed

Extracted Run CLI, Extract JSON From Stdout, Link Resource To Project, and Create Synthetic Codebase into common_e2e.resource. Create Synthetic Codebase is parameterized with ${project_label} for suite-specific literals. Both m5_acceptance.robot and tdd_acms_behavioral_validation.robot now use the shared keywords.

L2 (LOW) — Redundant exit code assertions ✅ Fixed

Removed all 12 instances of redundant Should Be Equal As Integers ${result.rc} 0 that followed Run CLI calls.

L3 (LOW) — Suite setup error messages less verbose ✅ Fixed

All git command error messages now include (rc=${var.rc}). Check DEBUG logs above. matching the m5_acceptance.robot pattern.

I1 (INFO) — `tdd_expected_fail` listener masks ℹ️ Acknowledged

Already documented. No action needed.

## Response to Review #2691 (Luis) Thanks for the thorough review! All findings addressed in commit `5602c9a8`. Summary: ### C1 (CRITICAL) — CHANGELOG #845 entry deleted ✅ Fixed Reset `CHANGELOG.md` to `origin/master` and re-added the `#1029` entry. The #845 `CorrectionService` entry (50 lines) is now fully restored. ### M1 (MEDIUM) — 10K files not committed to git ✅ Fixed Replaced the ambiguous NOTE with an explicit responsibility assignment: _"The bug-fix developer MUST evaluate whether the fix routes indexing through the git sandbox (which only exposes git-tracked content) or the filesystem."_ ### M2 (MEDIUM) — Budget enforcement partial assertion ℹ️ Acknowledged Agreed — the TODO comment already documents the gap, and the assertion is appropriate for the TDD capture phase. The bug-fix developer will add the `large_file.py` exclusion assertion when wiring the pipeline. ### L1 (LOW) — ~97 lines duplicated from m5_acceptance.robot ✅ Fixed Extracted `Run CLI`, `Extract JSON From Stdout`, `Link Resource To Project`, and `Create Synthetic Codebase` into `common_e2e.resource`. `Create Synthetic Codebase` is parameterized with `${project_label}` for suite-specific literals. Both `m5_acceptance.robot` and `tdd_acms_behavioral_validation.robot` now use the shared keywords. ### L2 (LOW) — Redundant exit code assertions ✅ Fixed Removed all 12 instances of redundant `Should Be Equal As Integers ${result.rc} 0` that followed `Run CLI` calls. ### L3 (LOW) — Suite setup error messages less verbose ✅ Fixed All git command error messages now include `(rc=${var.rc}). Check DEBUG logs above.` matching the `m5_acceptance.robot` pattern. ### I1 (INFO) — `tdd_expected_fail` listener masks ℹ️ Acknowledged Already documented. No action needed.

hurui200320 added a new dependency 2026-03-24 11:25:54 +00:00

#1029 TDD: ACMS indexing pipeline not wired into CLI — ContextTierService starts empty (bug #1028)

hurui200320 referenced this pull request

2026-03-24 11:46:01 +00:00

TDD: ACMS indexing pipeline not wired into CLI — ContextTierService starts empty (bug #1028) #1029

freemo approved these changes 2026-03-24 15:29:11 +00:00

Dismissed

freemo left a comment

Review: APPROVED

TDD E2E behavioral test proving ACMS indexing pipeline is not wired into CLI (bug #1028). Clean test structure with proper tags and clear documentation of what the test proves.

## Review: APPROVED TDD E2E behavioral test proving ACMS indexing pipeline is not wired into CLI (bug #1028). Clean test structure with proper tags and clear documentation of what the test proves.

hurui200320 force-pushed tdd/m5-acms-cli-indexing-pipeline-wiring from 5602c9a87c to 90d18f06d1

2026-03-25 05:39:39 +00:00

Compare

hurui200320 dismissed freemo's review 2026-03-25 05:39:39 +00:00

Reason:

New commits pushed, approval review dismissed automatically according to repository settings

hurui200320 added 1 commit 2026-03-25 05:49:52 +00:00

fix: nox -s build

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 20s

Details

CI / security (pull_request) Successful in 56s

Details

CI / lint (pull_request) Successful in 3m20s

Details

CI / quality (pull_request) Successful in 3m55s

Details

CI / typecheck (pull_request) Successful in 3m56s

Details

CI / e2e_tests (pull_request) Failing after 5m6s

Details

CI / integration_tests (pull_request) Successful in 5m40s

Details

CI / coverage (pull_request) Successful in 11m20s

Details

CI / unit_tests (pull_request) Failing after 18m44s

Details

CI / benchmark-regression (pull_request) Has been cancelled

Details

CI / docker (pull_request) Has been cancelled

Details

CI / status-check (pull_request) Has been cancelled

Details

f56f2052a6

hurui200320 scheduled this pull request to auto merge when all checks succeed 2026-03-25 05:50:53 +00:00

hurui200320 added 1 commit 2026-03-25 06:10:30 +00:00

fix: switch to openai llm for e2e test since currently anthrpoic api has no balance

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 25s

Details

CI / build (pull_request) Successful in 24s

Details

CI / quality (pull_request) Successful in 42s

Details

CI / typecheck (pull_request) Successful in 54s

Details

CI / integration_tests (pull_request) Successful in 6m42s

Details

CI / security (pull_request) Successful in 4m14s

Details

CI / unit_tests (pull_request) Successful in 7m30s

Details

CI / e2e_tests (pull_request) Successful in 7m45s

Details

CI / docker (pull_request) Successful in 2m4s

Details

CI / coverage (pull_request) Successful in 11m19s

Details

CI / status-check (pull_request) Successful in 2s

Details

CI / benchmark-regression (pull_request) Successful in 54m1s

Details

dc4c6e6280

hurui200320 merged commit 83c22b8391 into master

2026-03-25 11:13:37 +00:00

hurui200320 deleted branch tdd/m5-acms-cli-indexing-pipeline-wiring

2026-03-25 11:13:38 +00:00

hurui200320 referenced this issue from a commit

2026-03-25 11:13:39 +00:00

test(e2e): TDD behavioral test proving ACMS indexing pipeline is not wired into CLI (bug #1028) (#1124)

Sign in to join this conversation.

3 Participants

Notifications

Due Date

No due date set.

Blocks

#1029 TDD: ACMS indexing pipeline not wired into CLI — ContextTierService starts empty (bug #1028)

cleveragents/cleveragents-core

Reference: cleveragents/cleveragents-core#1124

test(e2e): TDD behavioral test proving ACMS indexing pipeline is not wired into CLI (bug #1028) #1124

Summary

Changes

Test Cases

Documentation & Robustness Improvements

Review Fix Round

Motivation

Quality Gates

Code Review Report — PR #1124 (TDD Bug #1028 / Issue #1029)

Positive Observations

Findings by Severity

CRITICAL — Bug Detection

MEDIUM — Test Flaws

LOW — Code Quality

INFORMATIONAL — Known Limitations (documented, no action required)

Verdict

Response to Review #2691 (Luis)

C1 (CRITICAL) — CHANGELOG #845 entry deleted ✅ Fixed

M1 (MEDIUM) — 10K files not committed to git ✅ Fixed

M2 (MEDIUM) — Budget enforcement partial assertion ℹ️ Acknowledged

L1 (LOW) — ~97 lines duplicated from m5_acceptance.robot ✅ Fixed

L2 (LOW) — Redundant exit code assertions ✅ Fixed

L3 (LOW) — Suite setup error messages less verbose ✅ Fixed

I1 (INFO) — tdd_expected_fail listener masks ℹ️ Acknowledged

Review: APPROVED

I1 (INFO) — `tdd_expected_fail` listener masks ℹ️ Acknowledged