test(acms): TDD failing tests for context tier runtime logic (bug #821) #1058

Merged
brent.edwards merged 2 commits from tdd/m5-context-tier-runtime into master 2026-03-19 22:24:42 +00:00

Summary

TDD expected-fail tests proving bug #821 exists: ContextTierService has data models for hot/warm/cold tiers but no runtime logic for automatic promotion, demotion, or eviction.

  • Promotion on access: Repeatedly accessing a cold-tier fragment via get() does NOT auto-promote it. get() updates access_count/last_accessed but never calls promote().
  • Demotion on staleness: No staleness enforcement method exists. The tests probe for enforce_staleness(), apply_tier_policy(), tick(), etc., and none are implemented.
  • Eviction on budget overflow: store() does NOT enforce TierBudget.max_tokens_hot, so the hot tier grows without bound.
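
The promotion and eviction gaps can be illustrated with a toy stand-in. This is a sketch under stated assumptions, not the real ContextTierService: StandInTierService, Tier, Fragment, hot_tokens(), and observe_gaps() are hypothetical names chosen for illustration, and only get(), store(), access_count, and max_tokens_hot echo names from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass
class Fragment:
    # Hypothetical fragment shape; the real model may differ.
    key: str
    tokens: int
    tier: Tier
    access_count: int = 0


@dataclass
class StandInTierService:
    """Hypothetical stand-in that mimics the reported gaps."""

    max_tokens_hot: int
    fragments: dict[str, Fragment] = field(default_factory=dict)

    def store(self, fragment: Fragment) -> None:
        # Reported gap: nothing checks the hot-tier budget here.
        self.fragments[fragment.key] = fragment

    def get(self, key: str) -> Fragment:
        # Reported gap: bookkeeping is updated, but the tier never changes.
        fragment = self.fragments[key]
        fragment.access_count += 1
        return fragment

    def hot_tokens(self) -> int:
        return sum(f.tokens for f in self.fragments.values() if f.tier is Tier.HOT)


def observe_gaps() -> dict[str, bool]:
    """Return whether the desired behaviours were observed on the stand-in."""
    service = StandInTierService(max_tokens_hot=100)

    # Desired: repeated access promotes a cold fragment.
    service.store(Fragment("cold-1", tokens=10, tier=Tier.COLD))
    for _ in range(5):
        service.get("cold-1")
    promoted = service.fragments["cold-1"].tier is not Tier.COLD

    # Desired: storing past the budget evicts or demotes hot fragments.
    service.store(Fragment("hot-1", tokens=80, tier=Tier.HOT))
    service.store(Fragment("hot-2", tokens=80, tier=Tier.HOT))
    within_budget = service.hot_tokens() <= service.max_tokens_hot

    return {"promoted": promoted, "within_budget": within_budget}
```

On this stand-in both flags come back False, which is the shape of failure the expected-fail scenarios are meant to pin down.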

Files Added

| File | Purpose |
|------|---------|
| features/tdd_context_tier_runtime.feature | 3 Behave scenarios tagged @tdd_expected_fail @tdd_bug @tdd_bug_821 @mock_only |
| features/steps/tdd_context_tier_runtime_steps.py | Type-annotated step definitions exercising real ContextTierService |
| robot/tdd_context_tier_runtime.robot | 3 Robot Framework integration tests tagged tdd_expected_fail |
| robot/helper_tdd_context_tier_runtime.py | Helper script for Robot tests with 3 subcommands |

Verification

  • nox -s lint: passed
  • nox -s typecheck: passed (0 errors)
  • nox -s unit_tests -- features/tdd_context_tier_runtime.feature: 3 scenarios passed (all assertions fail as expected; @tdd_expected_fail inverts the result to a CI pass)
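
The inversion mentioned in the last item can be sketched as a small pure function. This is only an illustration of the idea, not the project's actual hook: the function name ci_outcome and the tag-set representation are assumptions.

```python
def ci_outcome(tags: set[str], assertions_passed: bool) -> bool:
    """Return True when CI should report the scenario as passing.

    Hypothetical model: a scenario tagged tdd_expected_fail is reported
    as passing only while its assertions fail.
    """
    if "tdd_expected_fail" in tags:
        return not assertions_passed
    return assertions_passed
```

Under this model, a tagged scenario whose assertions start passing (for example, once the bug is fixed) would flip to a CI failure until the tag is removed.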

ISSUES CLOSED: #840

freemo approved these changes 2026-03-19 04:56:00 +00:00
freemo left a comment

Code Review — PR #1058 test(acms): TDD failing tests for context tier runtime logic (bug #821)

Clean TDD test PR following the project's bug fix workflow correctly. Three scenarios (promotion on access, demotion on staleness, eviction on budget overflow) effectively prove bug #821 exists. TDD tags @tdd_expected_fail @tdd_bug @tdd_bug_821 are correctly applied.

The step definitions are thorough — the staleness scenario probes for multiple method names (enforce_staleness(), apply_tier_policy(), tick(), etc.) which clearly demonstrates the absence of any runtime enforcement method. Robot integration tests mirror the Behave scenarios for full coverage.

Approved. No issues found.
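
The method-name probing described in the review can be sketched roughly as follows. The helper find_enforcement_method is hypothetical, and the candidate list contains only the three names quoted above; the actual step definitions probe further names that are not listed here.

```python
from typing import Callable, Optional, Sequence

# Only the names quoted in this PR; the real list is longer ("etc.").
CANDIDATE_NAMES: Sequence[str] = ("enforce_staleness", "apply_tier_policy", "tick")


def find_enforcement_method(
    service: object, names: Sequence[str] = CANDIDATE_NAMES
) -> Optional[Callable[..., object]]:
    """Return the first callable attribute found under any candidate name."""
    for name in names:
        candidate = getattr(service, name, None)
        if callable(candidate):
            return candidate
    return None
```

A test built on such a helper would assert that the result is not None, so it fails for as long as the service exposes none of the candidate methods.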

freemo added this to the v3.4.0 milestone 2026-03-19 05:28:34 +00:00
brent.edwards force-pushed tdd/m5-context-tier-runtime from 30a8828770
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 18s
CI / build (pull_request) Successful in 18s
CI / quality (pull_request) Successful in 29s
CI / typecheck (pull_request) Successful in 46s
CI / security (pull_request) Successful in 52s
CI / unit_tests (pull_request) Successful in 3m4s
CI / integration_tests (pull_request) Successful in 3m39s
CI / e2e_tests (pull_request) Successful in 3m55s
CI / docker (pull_request) Successful in 56s
CI / coverage (pull_request) Successful in 6m46s
CI / benchmark-regression (pull_request) Successful in 37m36s
to 1125e49d4f
Some checks failed
CI / lint (pull_request) Successful in 15s
CI / typecheck (pull_request) Successful in 53s
CI / security (pull_request) Successful in 50s
CI / quality (pull_request) Successful in 36s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 27s
CI / unit_tests (pull_request) Successful in 4m8s
CI / e2e_tests (pull_request) Successful in 5m36s
CI / docker (pull_request) Successful in 1m46s
CI / coverage (pull_request) Failing after 13m37s
CI / integration_tests (pull_request) Failing after 17m51s
CI / benchmark-regression (pull_request) Successful in 40m58s
2026-03-19 20:58:19 +00:00
Merge branch 'master' into tdd/m5-context-tier-runtime
All checks were successful
CI / lint (pull_request) Successful in 26s
CI / security (pull_request) Successful in 37s
CI / quality (pull_request) Successful in 43s
CI / typecheck (pull_request) Successful in 1m15s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 22s
CI / unit_tests (pull_request) Successful in 2m17s
CI / integration_tests (pull_request) Successful in 2m48s
CI / docker (pull_request) Successful in 2m8s
CI / e2e_tests (pull_request) Successful in 6m34s
CI / coverage (pull_request) Successful in 4m47s
CI / benchmark-regression (pull_request) Successful in 38m11s
b5b45a6ff4
brent.edwards deleted branch tdd/m5-context-tier-runtime 2026-03-19 22:24:44 +00:00
Reference
cleveragents/cleveragents-core!1058