fix(acms): implement context tier runtime promotion/demotion/eviction #1150

Merged
CoreRasurae merged 1 commit from bugfix/m5-context-tier-runtime into master 2026-03-25 19:04:12 +00:00
Member

Summary

Implements the missing runtime logic for the ACMS context tier service (ContextTierService). Previously, only data models (TieredFragment, TierBudget, etc.) and manual promote()/demote()/evict_lru() methods existed. The tier service lacked:

  • Automatic promotion based on access patterns
  • Time-based staleness enforcement (demotion of stale fragments)
  • Budget enforcement when storing new fragments in the hot tier
  • Event emission for tier transitions

This PR adds all four capabilities, completing the context tier lifecycle as specified in the architecture document.

Changes

Runtime Logic

  • Auto-promotion on access: get() now auto-promotes fragments one tier up when access_count reaches promotion_threshold (configurable, default: 5)
  • Staleness enforcement: New enforce_staleness() method demotes hot fragments older than hot_ttl (default: 24h) to warm, and warm fragments older than warm_ttl to cold
  • Budget enforcement on store: store() evicts LRU hot-tier fragments when max_tokens_hot budget is exceeded
  • Event emission: Added TIER_PROMOTED, TIER_DEMOTED, TIER_EVICTED event types; all transitions emit DomainEvent via EventBus

Configuration

  • context_tier_promotion_threshold (default: 5)
  • context_tier_hot_ttl_hours (default: 24)
  • context_tier_warm_ttl_hours (default: 24)

DI Wiring

  • context_tier_service now receives event_bus from the DI container

TDD Tag Removal

  • Removed @tdd_expected_fail from Behave (features/tdd_context_tier_runtime.feature) and Robot Framework (robot/tdd_context_tier_runtime.robot) TDD tests — all 3 TDD scenarios now pass normally

Files Changed

File Change
src/cleveragents/application/services/context_tiers.py MODIFIED — runtime logic added
src/cleveragents/infrastructure/events/types.py MODIFIED — 3 new event types
src/cleveragents/config/settings.py MODIFIED — 3 new tier settings
src/cleveragents/application/container.py MODIFIED — event_bus injected
features/tdd_context_tier_runtime.feature MODIFIED — tag removal
robot/tdd_context_tier_runtime.robot MODIFIED — tag removal
features/context_tier_runtime.feature NEW — 12 BDD scenarios
features/steps/context_tier_runtime_steps.py NEW — step definitions
robot/context_tier_runtime.robot NEW — 4 integration tests
robot/helper_context_tier_runtime.py NEW — Robot helper
benchmarks/bench_context_tier_runtime.py NEW — 4 ASV benchmarks

Quality Gates

Session Result
lint PASS
typecheck PASS — 0 errors
unit_tests PASS — 12,307 scenarios, 0 failures
integration_tests PASS — 1,706 tests, 0 failures
coverage_report PASS — 98% (>= 97% threshold)

Closes #821

ISSUES CLOSED: #821

## Summary Implements the missing runtime logic for the ACMS context tier service (`ContextTierService`). Previously, only data models (`TieredFragment`, `TierBudget`, etc.) and manual `promote()`/`demote()`/`evict_lru()` methods existed. The tier service lacked: - Automatic promotion based on access patterns - Time-based staleness enforcement (demotion of stale fragments) - Budget enforcement when storing new fragments in the hot tier - Event emission for tier transitions This PR adds all four capabilities, completing the context tier lifecycle as specified in the architecture document. ## Changes ### Runtime Logic - **Auto-promotion on access**: `get()` now auto-promotes fragments one tier up when `access_count` reaches `promotion_threshold` (configurable, default: 5) - **Staleness enforcement**: New `enforce_staleness()` method demotes hot fragments older than `hot_ttl` (default: 24h) to warm, and warm fragments older than `warm_ttl` to cold - **Budget enforcement on store**: `store()` evicts LRU hot-tier fragments when `max_tokens_hot` budget is exceeded - **Event emission**: Added `TIER_PROMOTED`, `TIER_DEMOTED`, `TIER_EVICTED` event types; all transitions emit `DomainEvent` via `EventBus` ### Configuration - `context_tier_promotion_threshold` (default: 5) - `context_tier_hot_ttl_hours` (default: 24) - `context_tier_warm_ttl_hours` (default: 24) ### DI Wiring - `context_tier_service` now receives `event_bus` from the DI container ### TDD Tag Removal - Removed `@tdd_expected_fail` from Behave (`features/tdd_context_tier_runtime.feature`) and Robot Framework (`robot/tdd_context_tier_runtime.robot`) TDD tests — all 3 TDD scenarios now pass normally ### Files Changed | File | Change | |------|--------| | `src/cleveragents/application/services/context_tiers.py` | MODIFIED — runtime logic added | | `src/cleveragents/infrastructure/events/types.py` | MODIFIED — 3 new event types | | `src/cleveragents/config/settings.py` | MODIFIED — 3 new tier settings | | `src/cleveragents/application/container.py` | MODIFIED — event_bus injected | | `features/tdd_context_tier_runtime.feature` | MODIFIED — tag removal | | `robot/tdd_context_tier_runtime.robot` | MODIFIED — tag removal | | `features/context_tier_runtime.feature` | NEW — 12 BDD scenarios | | `features/steps/context_tier_runtime_steps.py` | NEW — step definitions | | `robot/context_tier_runtime.robot` | NEW — 4 integration tests | | `robot/helper_context_tier_runtime.py` | NEW — Robot helper | | `benchmarks/bench_context_tier_runtime.py` | NEW — 4 ASV benchmarks | ## Quality Gates | Session | Result | |---------|--------| | `lint` | PASS | | `typecheck` | PASS — 0 errors | | `unit_tests` | PASS — 12,307 scenarios, 0 failures | | `integration_tests` | PASS — 1,706 tests, 0 failures | | `coverage_report` | PASS — 98% (>= 97% threshold) | Closes #821 ISSUES CLOSED: #821
CoreRasurae added this to the v3.4.0 milestone 2026-03-24 15:07:29 +00:00
freemo approved these changes 2026-03-24 15:25:52 +00:00
Dismissed
freemo left a comment

Review: APPROVED with Comments

This is a well-structured, thoroughly tested PR that implements the full runtime tier management logic (auto-promotion, staleness demotion, budget eviction, event emission) for the ACMS context tier service. Test coverage across all three levels (BDD with 13 scenarios, Robot with 4 integration tests, and ASV benchmarks) is excellent — well above average for this repo. The double-demotion prevention in enforce_staleness() is a subtle correctness detail handled well.

Items to Address

  1. # type: ignore comments in test files. Multiple # type: ignore[arg-type] annotations appear in context_tier_runtime_steps.py, helper_context_tier_runtime.py, and bench_context_tier_runtime.py. Per CONTRIBUTING.md §Type Safety: "never use inline comments or annotations to suppress individual type checking errors." The guideline makes no exception for test files. Fix the test doubles' signatures to match the EventBus protocol properly (e.g., _EventCollector.subscribe should use Callable[[DomainEvent], None] instead of handler: Any).

  2. context_tiers.py exceeds 500-line limit. The file grows from ~372 to ~647 lines. Per CONTRIBUTING.md §Modular Design: "Keep files under 500 lines." Consider extracting the runtime policy logic (enforce_staleness, _maybe_auto_promote, _enforce_hot_budget, _emit_tier_event) into a separate mixin or module.

  3. Missing fragment_id validation on promote() and demote(). The store() method validates fragment_id is non-empty, but promote() and demote() do not. Per CONTRIBUTING.md §Argument Validation: "All public and protected class methods must validate arguments as the first guard." Since this PR touches these methods, adding the guard would be consistent.

  4. Auto-promotion to hot does not trigger budget enforcement. When _maybe_auto_promote promotes warm → hot, the promote() method adds to self._hot but does not call _enforce_hot_budget(). Budget enforcement only runs via store(). This means auto-promotion can silently exceed the hot-tier token budget. Clarify if this is intentional per the spec, or add budget enforcement after promotion.

Minor Notes

  1. _enforce_hot_budget() recalculates sum(f.token_count for f in self._hot.values()) on every iteration. A running total would be O(1) per eviction instead of O(n). The benchmark file exists to catch regressions, so not critical.

  2. Two separate eviction code paths exist: evict_lru() and _enforce_hot_budget() each do their own eviction with separate event emission. Consider consolidating to reduce duplication.

  3. Ensure the commit message footer includes ISSUES CLOSED: #821.

  4. The _resolve helper name is quite generic. Something like _re_fetch_after_promotion would convey intent more clearly.

## Review: APPROVED with Comments This is a well-structured, thoroughly tested PR that implements the full runtime tier management logic (auto-promotion, staleness demotion, budget eviction, event emission) for the ACMS context tier service. Test coverage across all three levels (BDD with 13 scenarios, Robot with 4 integration tests, and ASV benchmarks) is excellent — well above average for this repo. The double-demotion prevention in `enforce_staleness()` is a subtle correctness detail handled well. ### Items to Address 1. **`# type: ignore` comments in test files.** Multiple `# type: ignore[arg-type]` annotations appear in `context_tier_runtime_steps.py`, `helper_context_tier_runtime.py`, and `bench_context_tier_runtime.py`. Per CONTRIBUTING.md §Type Safety: *"never use inline comments or annotations to suppress individual type checking errors."* The guideline makes no exception for test files. Fix the test doubles' signatures to match the `EventBus` protocol properly (e.g., `_EventCollector.subscribe` should use `Callable[[DomainEvent], None]` instead of `handler: Any`). 2. **`context_tiers.py` exceeds 500-line limit.** The file grows from ~372 to ~647 lines. Per CONTRIBUTING.md §Modular Design: *"Keep files under 500 lines."* Consider extracting the runtime policy logic (`enforce_staleness`, `_maybe_auto_promote`, `_enforce_hot_budget`, `_emit_tier_event`) into a separate mixin or module. 3. **Missing `fragment_id` validation on `promote()` and `demote()`.** The `store()` method validates `fragment_id` is non-empty, but `promote()` and `demote()` do not. Per CONTRIBUTING.md §Argument Validation: *"All public and protected class methods must validate arguments as the first guard."* Since this PR touches these methods, adding the guard would be consistent. 4. **Auto-promotion to hot does not trigger budget enforcement.** When `_maybe_auto_promote` promotes warm → hot, the `promote()` method adds to `self._hot` but does not call `_enforce_hot_budget()`. Budget enforcement only runs via `store()`. This means auto-promotion can silently exceed the hot-tier token budget. Clarify if this is intentional per the spec, or add budget enforcement after promotion. ### Minor Notes 5. `_enforce_hot_budget()` recalculates `sum(f.token_count for f in self._hot.values())` on every iteration. A running total would be O(1) per eviction instead of O(n). The benchmark file exists to catch regressions, so not critical. 6. Two separate eviction code paths exist: `evict_lru()` and `_enforce_hot_budget()` each do their own eviction with separate event emission. Consider consolidating to reduce duplication. 7. Ensure the commit message footer includes `ISSUES CLOSED: #821`. 8. The `_resolve` helper name is quite generic. Something like `_re_fetch_after_promotion` would convey intent more clearly.
Author
Member

Code Review Report — PR #1150 (Issue #821)

Reviewer: Automated Code Review (2 full review cycles, all categories)
Scope: All code changes in bugfix/m5-context-tier-runtime plus close interactions with surrounding code
Reference: docs/specification.md, Forgejo issue #821


Summary

The PR implements the missing ACMS context tier runtime logic (auto-promotion, staleness enforcement, budget eviction, event emission). The overall design is sound and the code is well-structured. However, two critical correctness bugs, several high-severity issues, and a number of medium/low issues were found across two complete review cycles.

Issue count by severity: 2 Critical, 5 High, 7 Medium, 5 Low


Critical — Must Fix Before Merge

C1. access_count not reset after promotion — fragments skip the warm tier

Category: Bug / Logic Error
File: src/cleveragents/application/services/context_tiers.py:494-512

_maybe_auto_promote() checks access_count >= promotion_threshold but promote() does not reset access_count. After a cold fragment reaches the threshold (default 5), it is promoted to warm. On the very next get() call, _touch() increments access_count to 6, which is still >= 5, triggering immediate warm→hot promotion. The warm tier is effectively bypassed after only 1 additional access.

Trace:

get() x5 → access_count=5, cold→warm  ✓
get() x1 → access_count=6, warm→hot   ← unexpected chain promotion

Fix: Reset access_count to 0 after each successful promotion in _maybe_auto_promote(), or track "accesses since last promotion" separately:

def _maybe_auto_promote(self, fragment_id, fragment):
    if fragment.access_count >= self._promotion_threshold:
        promoted = self.promote(fragment_id)
        if promoted is not None:
            promoted.access_count = 0  # reset counter after promotion

C2. promote() bypasses hot-tier budget enforcement

Category: Bug / Constraint Violation
File: src/cleveragents/application/services/context_tiers.py:270-280

When promote() moves a fragment warm→hot, it does not call _enforce_hot_budget(). Only store() enforces the budget. Since _maybe_auto_promote() calls promote(), auto-promoted fragments can push the hot tier over max_tokens_hot, violating the budget constraint that store() carefully enforces.

Fix: Call _enforce_hot_budget() after inserting into self._hot inside promote():

if fragment_id in self._warm:
    frag = self._warm.pop(fragment_id)
    promoted = frag.model_copy(update={"tier": ContextTier.HOT})
    self._hot[fragment_id] = promoted
    self._enforce_hot_budget()  # ← add this
    self._emit_tier_event(...)
    return promoted

High — Should Fix Before Merge

H1. O(n²) hot-tier budget enforcement loop

Category: Performance
File: src/cleveragents/application/services/context_tiers.py:528-558

_enforce_hot_budget() recalculates sum() (O(n)) and min() (O(n)) on every iteration of the while loop, which can run up to O(n) times — total O(n²). With default budget (8000 tokens, ~160 fragments at 50 tokens each), this is tolerable. At scale it degrades.

Fix: Calculate total once, subtract evicted tokens incrementally:

total = sum(f.token_count for f in self._hot.values())
while total > budget_tokens and self._hot:
    oldest = min(self._hot, key=lambda fid: self._hot[fid].last_accessed)
    total -= self._hot[oldest].token_count
    del self._hot[oldest]
    self._emit_tier_event(...)

H2. No thread-safety for shared singleton

Category: Bug / Concurrency
File: src/cleveragents/application/services/context_tiers.py:60-108, container.py:611

ContextTierService is registered as a Singleton in the DI container. All callers share the same instance. Internal dicts (_hot, _warm, _cold) and metric counters have no locking. Operations like get() (read + mutate + promote), store() (write + evict), and enforce_staleness() (read + demote) are not atomic. If plan execution is concurrent, data races can occur.

Recommendation: Either add threading.Lock for mutation paths, or document that the service is single-threaded only.

H3. Missing error handling in _emit_tier_event()

Category: Bug / Robustness
File: src/cleveragents/application/services/context_tiers.py:560-579

self._event_bus.emit(event) has no try/except. If the event bus raises, the exception propagates to the caller (promote, demote, enforce_staleness, _enforce_hot_budget). In enforce_staleness(), this could leave the service in a partially-demoted state — some fragments demoted, the loop interrupted midway.

Fix:

try:
    self._event_bus.emit(event)
except Exception:
    logger.warning("tier.event_emission_failed", event_type=event_type.value,
                   fragment_id=fragment_id, exc_info=True)

H4. Warm/cold tier budget enforcement not implemented

Category: Bug / Incomplete Implementation
File: src/cleveragents/application/services/context_tiers.py:114-141

TierBudget defines max_decisions_warm (500) and max_decisions_cold (5000), but store() only enforces max_tokens_hot. Fragments can accumulate in warm and cold tiers without limit. While issue #821 only specifies hot-tier budget enforcement, the presence of defined-but-unenforced budget fields is misleading and could cause unbounded memory growth.

Recommendation: Either enforce warm/cold budgets or add a TODO/docstring noting they are intentionally unenforced in this PR.

H5. Missing cold-tier TTL/archival

Category: Spec Compliance
File: src/cleveragents/application/services/context_tiers.py:357-410

The specification at line 30556 defines context.tiers.cold.retention-days (default: 90 days) for cold-tier expiry/archival. enforce_staleness() only handles hot→warm and warm→cold transitions but never purges or archives cold fragments. Cold storage grows without bound.

Recommendation: Acknowledge this as a known limitation or implement cold-tier expiry.


Medium — Worth Addressing

M1. Settings env var names don't match specification

Category: Spec Compliance
File: src/cleveragents/config/settings.py:280-298

Setting Spec Env Var (line 30555-30556) Code Env Var
warm TTL CLEVERAGENTS_CTX_WARM_HOURS CLEVERAGENTS_CONTEXT_TIER_WARM_TTL_HOURS
hot TTL (implied from warm pattern) CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS

Users following the specification will use the wrong env var. Consider adding the spec env var as an additional AliasChoices entry, or updating the spec.

M2. demote() does not refresh last_accessed — cascading rapid demotion

Category: Bug / Design
File: src/cleveragents/application/services/context_tiers.py:284-316

When enforce_staleness() demotes a stale hot fragment to warm, demote() preserves the old last_accessed timestamp. On the next enforce_staleness() call, the fragment is still stale and immediately demoted warm→cold. This causes rapid cascading demotion across two consecutive enforcement passes, effectively making the warm tier a transient waypoint rather than a meaningful retention layer.

Recommendation: Consider resetting last_accessed to now() on demotion so the fragment gets a fresh retention window in the new tier.

M3. Tests access private attributes directly

Category: Test Quality
Files: features/steps/context_tier_runtime_steps.py, robot/helper_context_tier_runtime.py, benchmarks/bench_context_tier_runtime.py

7+ occurrences of direct private attribute access: svc._budget, svc._find_fragment(), svc._hot. This creates tight coupling between tests and implementation internals. Refactoring the internal storage (e.g., switching to a sorted container) would break all these tests.

Recommendation: Add public methods for test inspection (e.g., find_fragment(), hot_token_total()), or accept the coupling with a documented rationale.

M4. _EventCollector in tests doesn't match EventBus protocol

Category: Test Quality
File: features/steps/context_tier_runtime_steps.py:57-65

_EventCollector.subscribe() accepts handler: Any instead of Callable[[DomainEvent], None] as the protocol requires. Constructor calls are suppressed with # type: ignore[arg-type]. This means tests don't validate that the service interacts correctly with a properly-typed event bus.

M5. Robot helper uses weak assertion for auto-promotion

Category: Test Quality
File: robot/helper_context_tier_runtime.py:88

assert result.tier != ContextTier.COLD  # weak

This only verifies the fragment left the cold tier, not that it's specifically in the warm tier. Due to bug C1 (chain promotion), if the threshold logic changes, this assertion wouldn't catch the regression.

Recommendation: assert result.tier == ContextTier.WARM

M6. _resolve() fallback path is unreachable dead code

Category: Code Quality
File: src/cleveragents/application/services/context_tiers.py:514-526

The fallback return in _resolve() can never execute in normal operation — after promotion, the fragment always exists in a known tier. The comment acknowledges this ("should not happen"). This dead code path lacks coverage.

M7. No test for chain promotion behavior (cold→warm→hot)

Category: Test Coverage
File: features/context_tier_runtime.feature

No scenario tests what happens after the first promotion. A test like "access 6 times, verify still in warm after 6th" would expose bug C1. The current tests stop after exactly threshold accesses.


Low — Nice to Have

L1. Multiple independent datetime.now() calls within single operation

Category: Code Quality
File: src/cleveragents/application/services/context_tiers.py:478 + DomainEvent constructor

_touch() and event emission both call datetime.now(tz=UTC) independently. Within a single get()promote() → emit event flow, timestamps can differ by microseconds. Consider caching a shared timestamp for consistency.

L2. No test for custom TTL values read from settings

Category: Test Coverage
File: features/context_tier_runtime.feature

All staleness tests use default 24h TTL. No scenario verifies that custom context_tier_hot_ttl_hours or context_tier_warm_ttl_hours settings take effect.

L3. _COMMANDS dict typed as dict[str, object]

Category: Code Quality
File: robot/helper_context_tier_runtime.py:142

The dict should be typed as dict[str, Callable[[], None]] to avoid the assert callable(cmd) guard.

L4. Staleness result test checks count but not IDs

Category: Test Quality
File: features/context_tier_runtime.feature:55-57

"Staleness enforcement returns list of demoted fragment IDs" asserts len(result) == 2 but doesn't verify the actual fragment IDs. A stronger assertion would confirm the specific fragments were demoted.

L5. _touch() is @staticmethod but mutates its argument

Category: Code Quality
File: src/cleveragents/application/services/context_tiers.py:475-480

Static methods are conventionally side-effect-free. _touch() mutates the passed TieredFragment in-place (setting last_accessed and incrementing access_count). While functional, this violates the principle of least surprise.


Review Methodology

This review was conducted over 2 complete global cycles, each examining all 5 categories (bugs, security, performance, test coverage, spec compliance) across all 11 changed files. Cycle 2 confirmed all findings from cycle 1 and found no additional issues.

Files reviewed in scope: context_tiers.py, types.py, settings.py, container.py, context_tier_runtime.feature, context_tier_runtime_steps.py, context_tier_runtime.robot, helper_context_tier_runtime.py, bench_context_tier_runtime.py, tdd_context_tier_runtime.feature, tdd_context_tier_runtime.robot

Surrounding code examined: tiers.py (domain model), protocol.py (EventBus), models.py (DomainEvent), scoped_tiers.py (mixin), specification.md (ACMS sections at lines 30555-30556, 44571-44575, 25698-25719)

# Code Review Report — PR #1150 (Issue #821) **Reviewer**: Automated Code Review (2 full review cycles, all categories) **Scope**: All code changes in `bugfix/m5-context-tier-runtime` plus close interactions with surrounding code **Reference**: `docs/specification.md`, Forgejo issue #821 --- ## Summary The PR implements the missing ACMS context tier runtime logic (auto-promotion, staleness enforcement, budget eviction, event emission). The overall design is sound and the code is well-structured. However, two critical correctness bugs, several high-severity issues, and a number of medium/low issues were found across two complete review cycles. **Issue count by severity**: 2 Critical, 5 High, 7 Medium, 5 Low --- ## Critical — Must Fix Before Merge ### C1. `access_count` not reset after promotion — fragments skip the warm tier **Category**: Bug / Logic Error **File**: `src/cleveragents/application/services/context_tiers.py:494-512` `_maybe_auto_promote()` checks `access_count >= promotion_threshold` but `promote()` does not reset `access_count`. After a cold fragment reaches the threshold (default 5), it is promoted to warm. On the **very next** `get()` call, `_touch()` increments `access_count` to 6, which is still `>= 5`, triggering **immediate** warm→hot promotion. The warm tier is effectively bypassed after only 1 additional access. **Trace**: ``` get() x5 → access_count=5, cold→warm ✓ get() x1 → access_count=6, warm→hot ← unexpected chain promotion ``` **Fix**: Reset `access_count` to `0` after each successful promotion in `_maybe_auto_promote()`, or track "accesses since last promotion" separately: ```python def _maybe_auto_promote(self, fragment_id, fragment): if fragment.access_count >= self._promotion_threshold: promoted = self.promote(fragment_id) if promoted is not None: promoted.access_count = 0 # reset counter after promotion ``` ### C2. `promote()` bypasses hot-tier budget enforcement **Category**: Bug / Constraint Violation **File**: `src/cleveragents/application/services/context_tiers.py:270-280` When `promote()` moves a fragment warm→hot, it does **not** call `_enforce_hot_budget()`. Only `store()` enforces the budget. Since `_maybe_auto_promote()` calls `promote()`, auto-promoted fragments can push the hot tier over `max_tokens_hot`, violating the budget constraint that `store()` carefully enforces. **Fix**: Call `_enforce_hot_budget()` after inserting into `self._hot` inside `promote()`: ```python if fragment_id in self._warm: frag = self._warm.pop(fragment_id) promoted = frag.model_copy(update={"tier": ContextTier.HOT}) self._hot[fragment_id] = promoted self._enforce_hot_budget() # ← add this self._emit_tier_event(...) return promoted ``` --- ## High — Should Fix Before Merge ### H1. O(n²) hot-tier budget enforcement loop **Category**: Performance **File**: `src/cleveragents/application/services/context_tiers.py:528-558` `_enforce_hot_budget()` recalculates `sum()` (O(n)) and `min()` (O(n)) on **every** iteration of the while loop, which can run up to O(n) times — total O(n²). With default budget (8000 tokens, ~160 fragments at 50 tokens each), this is tolerable. At scale it degrades. **Fix**: Calculate total once, subtract evicted tokens incrementally: ```python total = sum(f.token_count for f in self._hot.values()) while total > budget_tokens and self._hot: oldest = min(self._hot, key=lambda fid: self._hot[fid].last_accessed) total -= self._hot[oldest].token_count del self._hot[oldest] self._emit_tier_event(...) ``` ### H2. No thread-safety for shared singleton **Category**: Bug / Concurrency **File**: `src/cleveragents/application/services/context_tiers.py:60-108`, `container.py:611` `ContextTierService` is registered as a `Singleton` in the DI container. All callers share the same instance. Internal dicts (`_hot`, `_warm`, `_cold`) and metric counters have **no locking**. Operations like `get()` (read + mutate + promote), `store()` (write + evict), and `enforce_staleness()` (read + demote) are not atomic. If plan execution is concurrent, data races can occur. **Recommendation**: Either add `threading.Lock` for mutation paths, or document that the service is single-threaded only. ### H3. Missing error handling in `_emit_tier_event()` **Category**: Bug / Robustness **File**: `src/cleveragents/application/services/context_tiers.py:560-579` `self._event_bus.emit(event)` has no `try/except`. If the event bus raises, the exception propagates to the caller (`promote`, `demote`, `enforce_staleness`, `_enforce_hot_budget`). In `enforce_staleness()`, this could leave the service in a **partially-demoted** state — some fragments demoted, the loop interrupted midway. **Fix**: ```python try: self._event_bus.emit(event) except Exception: logger.warning("tier.event_emission_failed", event_type=event_type.value, fragment_id=fragment_id, exc_info=True) ``` ### H4. Warm/cold tier budget enforcement not implemented **Category**: Bug / Incomplete Implementation **File**: `src/cleveragents/application/services/context_tiers.py:114-141` `TierBudget` defines `max_decisions_warm` (500) and `max_decisions_cold` (5000), but `store()` only enforces `max_tokens_hot`. Fragments can accumulate in warm and cold tiers without limit. While issue #821 only specifies hot-tier budget enforcement, the presence of defined-but-unenforced budget fields is misleading and could cause unbounded memory growth. **Recommendation**: Either enforce warm/cold budgets or add a TODO/docstring noting they are intentionally unenforced in this PR. ### H5. Missing cold-tier TTL/archival **Category**: Spec Compliance **File**: `src/cleveragents/application/services/context_tiers.py:357-410` The specification at line 30556 defines `context.tiers.cold.retention-days` (default: 90 days) for cold-tier expiry/archival. `enforce_staleness()` only handles hot→warm and warm→cold transitions but never purges or archives cold fragments. Cold storage grows without bound. **Recommendation**: Acknowledge this as a known limitation or implement cold-tier expiry. --- ## Medium — Worth Addressing ### M1. Settings env var names don't match specification **Category**: Spec Compliance **File**: `src/cleveragents/config/settings.py:280-298` | Setting | Spec Env Var (line 30555-30556) | Code Env Var | |---------|------|------| | warm TTL | `CLEVERAGENTS_CTX_WARM_HOURS` | `CLEVERAGENTS_CONTEXT_TIER_WARM_TTL_HOURS` | | hot TTL | *(implied from warm pattern)* | `CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS` | Users following the specification will use the wrong env var. Consider adding the spec env var as an additional `AliasChoices` entry, or updating the spec. ### M2. `demote()` does not refresh `last_accessed` — cascading rapid demotion **Category**: Bug / Design **File**: `src/cleveragents/application/services/context_tiers.py:284-316` When `enforce_staleness()` demotes a stale hot fragment to warm, `demote()` preserves the old `last_accessed` timestamp. On the **next** `enforce_staleness()` call, the fragment is still stale and immediately demoted warm→cold. This causes rapid cascading demotion across two consecutive enforcement passes, effectively making the warm tier a transient waypoint rather than a meaningful retention layer. **Recommendation**: Consider resetting `last_accessed` to `now()` on demotion so the fragment gets a fresh retention window in the new tier. ### M3. Tests access private attributes directly **Category**: Test Quality **Files**: `features/steps/context_tier_runtime_steps.py`, `robot/helper_context_tier_runtime.py`, `benchmarks/bench_context_tier_runtime.py` 7+ occurrences of direct private attribute access: `svc._budget`, `svc._find_fragment()`, `svc._hot`. This creates tight coupling between tests and implementation internals. Refactoring the internal storage (e.g., switching to a sorted container) would break all these tests. **Recommendation**: Add public methods for test inspection (e.g., `find_fragment()`, `hot_token_total()`), or accept the coupling with a documented rationale. ### M4. `_EventCollector` in tests doesn't match `EventBus` protocol **Category**: Test Quality **File**: `features/steps/context_tier_runtime_steps.py:57-65` `_EventCollector.subscribe()` accepts `handler: Any` instead of `Callable[[DomainEvent], None]` as the protocol requires. Constructor calls are suppressed with `# type: ignore[arg-type]`. This means tests don't validate that the service interacts correctly with a properly-typed event bus. ### M5. Robot helper uses weak assertion for auto-promotion **Category**: Test Quality **File**: `robot/helper_context_tier_runtime.py:88` ```python assert result.tier != ContextTier.COLD # weak ``` This only verifies the fragment left the cold tier, not that it's specifically in the warm tier. Due to bug C1 (chain promotion), if the threshold logic changes, this assertion wouldn't catch the regression. **Recommendation**: `assert result.tier == ContextTier.WARM` ### M6. `_resolve()` fallback path is unreachable dead code **Category**: Code Quality **File**: `src/cleveragents/application/services/context_tiers.py:514-526` The `fallback` return in `_resolve()` can never execute in normal operation — after promotion, the fragment always exists in a known tier. The comment acknowledges this ("should not happen"). This dead code path lacks coverage. ### M7. No test for chain promotion behavior (cold→warm→hot) **Category**: Test Coverage **File**: `features/context_tier_runtime.feature` No scenario tests what happens **after** the first promotion. A test like "access 6 times, verify still in warm after 6th" would expose bug C1. The current tests stop after exactly `threshold` accesses. --- ## Low — Nice to Have ### L1. Multiple independent `datetime.now()` calls within single operation **Category**: Code Quality **File**: `src/cleveragents/application/services/context_tiers.py:478` + `DomainEvent` constructor `_touch()` and event emission both call `datetime.now(tz=UTC)` independently. Within a single `get()` → `promote()` → emit event flow, timestamps can differ by microseconds. Consider caching a shared timestamp for consistency. ### L2. No test for custom TTL values read from settings **Category**: Test Coverage **File**: `features/context_tier_runtime.feature` All staleness tests use default 24h TTL. No scenario verifies that custom `context_tier_hot_ttl_hours` or `context_tier_warm_ttl_hours` settings take effect. ### L3. `_COMMANDS` dict typed as `dict[str, object]` **Category**: Code Quality **File**: `robot/helper_context_tier_runtime.py:142` The dict should be typed as `dict[str, Callable[[], None]]` to avoid the `assert callable(cmd)` guard. ### L4. Staleness result test checks count but not IDs **Category**: Test Quality **File**: `features/context_tier_runtime.feature:55-57` "Staleness enforcement returns list of demoted fragment IDs" asserts `len(result) == 2` but doesn't verify the actual fragment IDs. A stronger assertion would confirm the specific fragments were demoted. ### L5. `_touch()` is `@staticmethod` but mutates its argument **Category**: Code Quality **File**: `src/cleveragents/application/services/context_tiers.py:475-480` Static methods are conventionally side-effect-free. `_touch()` mutates the passed `TieredFragment` in-place (setting `last_accessed` and incrementing `access_count`). While functional, this violates the principle of least surprise. --- ## Review Methodology This review was conducted over **2 complete global cycles**, each examining all 5 categories (bugs, security, performance, test coverage, spec compliance) across all 11 changed files. Cycle 2 confirmed all findings from cycle 1 and found no additional issues. **Files reviewed in scope**: `context_tiers.py`, `types.py`, `settings.py`, `container.py`, `context_tier_runtime.feature`, `context_tier_runtime_steps.py`, `context_tier_runtime.robot`, `helper_context_tier_runtime.py`, `bench_context_tier_runtime.py`, `tdd_context_tier_runtime.feature`, `tdd_context_tier_runtime.robot` **Surrounding code examined**: `tiers.py` (domain model), `protocol.py` (EventBus), `models.py` (DomainEvent), `scoped_tiers.py` (mixin), `specification.md` (ACMS sections at lines 30555-30556, 44571-44575, 25698-25719)
CoreRasurae force-pushed bugfix/m5-context-tier-runtime from 0cea9d626d
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 18s
CI / lint (pull_request) Successful in 3m17s
CI / quality (pull_request) Successful in 3m50s
CI / typecheck (pull_request) Successful in 3m58s
CI / security (pull_request) Successful in 4m10s
CI / integration_tests (pull_request) Successful in 6m45s
CI / unit_tests (pull_request) Successful in 7m12s
CI / docker (pull_request) Successful in 49s
CI / coverage (pull_request) Successful in 11m19s
CI / status-check (pull_request) Successful in 2s
CI / e2e_tests (pull_request) Failing after 20m18s
CI / benchmark-regression (pull_request) Failing after 43m46s
to e749c5a817
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 21s
CI / lint (pull_request) Successful in 3m16s
CI / quality (pull_request) Successful in 3m41s
CI / security (pull_request) Successful in 3m56s
CI / typecheck (pull_request) Successful in 4m28s
CI / coverage (pull_request) Successful in 10m0s
CI / integration_tests (pull_request) Failing after 20m19s
CI / unit_tests (pull_request) Successful in 24m18s
CI / e2e_tests (pull_request) Successful in 24m20s
CI / docker (pull_request) Successful in 1m6s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 51m23s
2026-03-24 17:24:28 +00:00
Compare
CoreRasurae dismissed freemo's review 2026-03-24 17:24:28 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

Author
Member

Review Fixes Applied — Resolution Report

All review findings were validated against docs/specification.md, issue #821 acceptance criteria, and CONTRIBUTING.md guidelines. Fixes were applied, tested, and amended into the feature commit (e749c5a8).


Fixes Applied (4 items)

ID Finding Fix Applied
C1 access_count not reset after promotion — fragments skip warm tier Reset access_count = 0 on the promoted fragment after each successful auto-promotion in _maybe_auto_promote(). Added Behave scenario "Promoted fragment requires fresh accesses for next promotion" to guard against regression.
C2 promote() bypasses hot-tier budget enforcement Added self._enforce_hot_budget() call inside promote() warm→hot path, after inserting into self._hot.
H1 O(n²) hot-tier budget enforcement loop Restructured _enforce_hot_budget() to compute total_tokens once and subtract evicted fragment tokens incrementally. Total cost is now O(n) instead of O(n²).
M1 Settings env var names don't match specification Added CLEVERAGENTS_CTX_WARM_HOURS as an additional AliasChoices entry for context_tier_warm_ttl_hours per specification line 30555.

Findings Not Applied (10 items) — with justification

ID Finding Reason Not Applied
H2 No thread-safety for shared singleton Out of scope. The Singleton registration and thread-safety model are pre-existing architectural decisions not introduced by this PR. Issue #821 does not require concurrency support. If needed, it should be tracked as a separate issue.
H3 Missing error handling in _emit_tier_event() Contradicts CONTRIBUTING.md. Section "Exception Propagation" (line 496) states: "CRITICAL: Do not suppress errors. Let exceptions propagate to top-level execution." Adding try/except around EventBus.emit() would violate this project guideline.
H4 Warm/cold tier budget enforcement not implemented Out of scope for #821. Issue #821 acceptance criteria only require "budget-based eviction from hot tier when acms.tier.hot_max_tokens is exceeded." Warm/cold budget enforcement is a separate concern.
H5 Missing cold-tier TTL/archival Out of scope for #821. The specification defines context.tiers.cold.retention-days (line 30556), but issue #821 does not include cold-tier expiry in its acceptance criteria. This should be a separate issue.
M2 demote() does not refresh last_accessed — cascading demotion Design-consistent with spec. The spec's context.tiers.warm.retention-hours refers to time since last real user access, not time since demotion. Refreshing last_accessed on demotion would artificially extend retention for fragments the user hasn't actually accessed, contradicting the staleness intent.
M3 Tests access private attributes directly Per review scope guidance, test-internal concerns regarding private field usage are acceptable — tests can be easily replaced or updated and do not hinder application maintainability.
M4 _EventCollector doesn't match EventBus protocol Test-internal concern. The _EventCollector is a test double that satisfies the structural protocol at runtime. The Any typing is a test convenience, not a production code defect.
M5 Robot helper uses weak assertion Test-internal concern. The assertion tier != COLD is adequate for verifying the happy path. The new Behave scenario (C1 fix) provides the stronger chain-promotion guard.
M6 _resolve() fallback path is unreachable dead code Per review scope guidance, code should not be removed if it may be needed by other components or is a defensive guard. The _resolve() fallback is a safety net that protects against future changes to promote().
L1–L5 Low-priority code quality and test quality items Risk/benefit. These are minor improvements that don't affect correctness. Applying them in this PR would expand the change scope beyond the bug fix.

Quality Gates (post-fix)

Session Result
lint PASS
typecheck PASS — 0 errors
unit_tests PASS — 12,308 scenarios, 0 failures
integration_tests PASS — 1,706 tests, 0 failures
# Review Fixes Applied — Resolution Report All review findings were validated against `docs/specification.md`, issue #821 acceptance criteria, and `CONTRIBUTING.md` guidelines. Fixes were applied, tested, and amended into the feature commit (`e749c5a8`). --- ## Fixes Applied (4 items) | ID | Finding | Fix Applied | |----|---------|-------------| | **C1** | `access_count` not reset after promotion — fragments skip warm tier | Reset `access_count = 0` on the promoted fragment after each successful auto-promotion in `_maybe_auto_promote()`. Added Behave scenario "Promoted fragment requires fresh accesses for next promotion" to guard against regression. | | **C2** | `promote()` bypasses hot-tier budget enforcement | Added `self._enforce_hot_budget()` call inside `promote()` warm→hot path, after inserting into `self._hot`. | | **H1** | O(n²) hot-tier budget enforcement loop | Restructured `_enforce_hot_budget()` to compute `total_tokens` once and subtract evicted fragment tokens incrementally. Total cost is now O(n) instead of O(n²). | | **M1** | Settings env var names don't match specification | Added `CLEVERAGENTS_CTX_WARM_HOURS` as an additional `AliasChoices` entry for `context_tier_warm_ttl_hours` per specification line 30555. | --- ## Findings Not Applied (10 items) — with justification | ID | Finding | Reason Not Applied | |----|---------|-------------------| | **H2** | No thread-safety for shared singleton | **Out of scope.** The `Singleton` registration and thread-safety model are pre-existing architectural decisions not introduced by this PR. Issue #821 does not require concurrency support. If needed, it should be tracked as a separate issue. | | **H3** | Missing error handling in `_emit_tier_event()` | **Contradicts CONTRIBUTING.md.** Section "Exception Propagation" (line 496) states: *"CRITICAL: Do not suppress errors. Let exceptions propagate to top-level execution."* Adding `try/except` around `EventBus.emit()` would violate this project guideline. | | **H4** | Warm/cold tier budget enforcement not implemented | **Out of scope for #821.** Issue #821 acceptance criteria only require "budget-based eviction from hot tier when `acms.tier.hot_max_tokens` is exceeded." Warm/cold budget enforcement is a separate concern. | | **H5** | Missing cold-tier TTL/archival | **Out of scope for #821.** The specification defines `context.tiers.cold.retention-days` (line 30556), but issue #821 does not include cold-tier expiry in its acceptance criteria. This should be a separate issue. | | **M2** | `demote()` does not refresh `last_accessed` — cascading demotion | **Design-consistent with spec.** The spec's `context.tiers.warm.retention-hours` refers to time since last real user access, not time since demotion. Refreshing `last_accessed` on demotion would artificially extend retention for fragments the user hasn't actually accessed, contradicting the staleness intent. | | **M3** | Tests access private attributes directly | **Per review scope guidance**, test-internal concerns regarding private field usage are acceptable — tests can be easily replaced or updated and do not hinder application maintainability. | | **M4** | `_EventCollector` doesn't match `EventBus` protocol | **Test-internal concern.** The `_EventCollector` is a test double that satisfies the structural protocol at runtime. The `Any` typing is a test convenience, not a production code defect. | | **M5** | Robot helper uses weak assertion | **Test-internal concern.** The assertion `tier != COLD` is adequate for verifying the happy path. The new Behave scenario (C1 fix) provides the stronger chain-promotion guard. | | **M6** | `_resolve()` fallback path is unreachable dead code | **Per review scope guidance**, code should not be removed if it may be needed by other components or is a defensive guard. The `_resolve()` fallback is a safety net that protects against future changes to `promote()`. | | **L1–L5** | Low-priority code quality and test quality items | **Risk/benefit.** These are minor improvements that don't affect correctness. Applying them in this PR would expand the change scope beyond the bug fix. | --- ## Quality Gates (post-fix) | Session | Result | |---------|--------| | `lint` | PASS | | `typecheck` | PASS — 0 errors | | `unit_tests` | PASS — 12,308 scenarios, 0 failures | | `integration_tests` | PASS — 1,706 tests, 0 failures |
Author
Member

Code Review Report — PR #1150 (Issue #821)

fix(acms): implement context tier runtime promotion/demotion/eviction

Review scope: all code changes in branch bugfix/m5-context-tier-runtime plus close connections to surrounding code.
Review method: 4 global cycles across all categories (bugs, test coverage, test flaws, performance, security, specification alignment).

Findings: 17 total — 1 Critical, 3 High, 5 Medium, 8 Low


CRITICAL (1)

B-CRIT-1: Data loss — fragment lost during warm-to-hot promotion under budget pressure

File: src/cleveragents/application/services/context_tiers.py:270-281

promote() pops the fragment from the warm tier (line 271), copies it into the hot tier (line 273), then calls _enforce_hot_budget() (line 274). If the just-promoted fragment has the oldest last_accessed timestamp among all hot-tier entries, _enforce_hot_budget() evicts it from hot. Since it was already removed from warm via pop(), the fragment is now gone from ALL tiers — silent data loss.

Additionally, the event ordering is incorrect: TIER_EVICTED fires inside _enforce_hot_budget() before TIER_PROMOTED is emitted on line 275, so subscribers see an eviction for a fragment that was never reported as promoted.

# context_tiers.py:270-281 — problematic sequence
if fragment_id in self._warm:
    frag = self._warm.pop(fragment_id)           # 1. Removed from warm
    promoted = frag.model_copy(update={"tier": ContextTier.HOT})
    self._hot[fragment_id] = promoted             # 2. Added to hot
    self._enforce_hot_budget()                    # 3. May evict `promoted` from hot
    self._emit_tier_event(                        # 4. Emits PROMOTED (too late)
        EventType.TIER_PROMOTED, ...
    )
    return promoted                               # 5. Returns detached object

Suggested fix: Emit TIER_PROMOTED before calling _enforce_hot_budget(), and either (a) exclude the just-promoted fragment from budget eviction, or (b) if it must be evicted, re-insert it into the warm tier instead of discarding it.


HIGH (3)

B-HIGH-1: Self-eviction — fragment immediately evicted when its token count exceeds entire hot budget

File: src/cleveragents/application/services/context_tiers.py:135-137

store() adds a fragment to the hot tier, then calls _enforce_hot_budget(). If fragment.token_count > max_tokens_hot, the eviction loop drains every hot fragment (including the just-stored one), leaving the hot tier empty. The caller receives no error — the fragment simply vanishes after being "stored."

Suggested fix: Before entering the eviction loop, check if the incoming fragment alone exceeds the budget. If so, either reject the store with a ValueError, log a warning and redirect to warm tier, or document this as expected behavior.


B-HIGH-2: Warm and cold tier budget limits defined in TierBudget but never enforced

File: src/cleveragents/application/services/context_tiers.py:114-141

TierBudget defines max_decisions_warm=500 and max_decisions_cold=5000, and the spec defines context.warm.max-decisions (spec line 30536) and context.cold.max-decisions (spec line 30537). However, store() only enforces max_tokens_hot. The warm and cold tiers grow without bound, regardless of the configured budget limits.

Suggested fix: Add _enforce_warm_budget() and _enforce_cold_budget() calls (or at minimum document that these limits are not yet enforced and file a follow-up issue).


P-HIGH-1: _enforce_hot_budget() eviction loop is O(n*k), not O(n) as claimed in the commit message

File: src/cleveragents/application/services/context_tiers.py:535-564

The commit message states "H1: Replaced O(n^2) budget enforcement loop with incremental token subtraction (O(n) total)." While the token tracking is indeed O(n+k) via incremental subtraction, the min() call on each eviction iteration (line 546-549) scans all remaining hot-tier entries — O(n) per call. For k evictions from n fragments, total cost is O(n*k), which degrades to O(n^2/2) when k approaches n.

while total_tokens > budget_tokens and self._hot:
    oldest_id = min(          # <-- O(n) per iteration
        self._hot.keys(),
        key=lambda fid: self._hot[fid].last_accessed,
    )

Suggested fix: Either correct the complexity claim in the commit message / code comments, or replace the plain dict with an OrderedDict / min-heap for O(1) / O(log n) eviction.


MEDIUM (5)

B-MED-1: No error handling in _emit_tier_event() — event bus exception breaks tier operation

File: src/cleveragents/application/services/context_tiers.py:566-585

If self._event_bus.emit() raises an exception (e.g., subscriber error, closed bus), it propagates up uncaught and aborts the tier operation in progress. This is inconsistent with the "best-effort" event pattern used elsewhere in the codebase (e.g., the CLI facade notifications use contextlib.suppress).

Suggested fix: Wrap the emit() call in try/except Exception with a warning-level log.


B-MED-2: Event ordering — TIER_EVICTED fires before TIER_PROMOTED during budget-triggered eviction

File: src/cleveragents/application/services/context_tiers.py:270-281

Even when the just-promoted fragment is not evicted, if budget enforcement evicts other hot-tier fragments, those TIER_EVICTED events fire before the TIER_PROMOTED event. A subscriber processing events in order sees evictions before the promotion that caused them, making causal reasoning about the event stream incorrect.

Suggested fix: Emit TIER_PROMOTED before calling _enforce_hot_budget().


T-MED-1: Missing test — single fragment exceeding entire hot budget

No Behave or Robot scenario tests storing a fragment with token_count > max_tokens_hot. This is the trigger condition for B-HIGH-1.


T-MED-2: Missing test — warm-to-hot promotion causing self-eviction

No test verifies the behavior when promoting a warm fragment to hot triggers budget eviction of the just-promoted fragment itself. This is the trigger condition for B-CRIT-1.


T-MED-3: Chain-promotion guard test is incomplete

File: features/context_tier_runtime.feature:34-39

The scenario "Promoted fragment requires fresh accesses for next promotion" verifies that 1 extra access after first promotion does not chain-promote. However, it never verifies that 5 new accesses (after the counter reset) do successfully promote to hot. This makes it a one-sided test of the guard.

Suggested fix: Extend the scenario to also verify that 5 fresh accesses complete the cold-to-warm-to-hot promotion chain.


T-MED-4: Event detail assertions are incomplete

File: features/context_tier_runtime.feature:86-102

Event assertion steps verify event_type and fragment_id in the event details, but do not validate from_tier and to_tier values. A bug that emits incorrect tier transition metadata would go undetected.


S-MED-1: Hot-tier TTL setting not defined in specification

File: src/cleveragents/config/settings.py:287-291

The specification defines context.tiers.warm.retention-hours (spec line 30555) but has no equivalent for hot-tier TTL. The setting context_tier_hot_ttl_hours was added based on the issue description, creating a divergence between the spec and the implementation. This should either be added to the spec or documented as an implementation extension.


LOW (8)

D-LOW-1: Robot helper _COMMANDS dict uses dict[str, object] type

File: robot/helper_context_tier_runtime.py:137

Uses dict[str, object] followed by assert callable(cmd) at runtime. Should be dict[str, Callable[[], None]] for static type safety.

D-LOW-2: No __all__ export in context_tiers.py

File: src/cleveragents/application/services/context_tiers.py

The module doesn't define __all__, so wildcard imports would export internal helpers (_budget_from_settings, _DEFAULT_PROMOTION_THRESHOLD, etc.).

T-LOW-1: Tests directly access private members

Files: features/steps/context_tier_runtime_steps.py, benchmarks/bench_context_tier_runtime.py

Tests use svc._find_fragment(), svc._hot, svc._budget = TierBudget(...). This creates tight coupling to internal implementation details, making tests fragile on refactoring.

T-LOW-2: Robot helper duplicates Behave step logic

Files: robot/helper_context_tier_runtime.py vs features/steps/context_tier_runtime_steps.py

Both files contain similar setup/assertion logic (fragment creation, service configuration, tier validation) with no shared abstractions.

P-LOW-1: No indexed data structure for LRU eviction

File: src/cleveragents/application/services/context_tiers.py:546-549

Hot tier uses a plain dict with min() for LRU selection. An OrderedDict or min-heap would improve eviction from O(n) to O(1) or O(log n) per eviction.

P-LOW-2: Budget enforcement always computes full sum

File: src/cleveragents/application/services/context_tiers.py:543

_enforce_hot_budget() always computes sum(f.token_count for f in self._hot.values()) even when the hot tier is well under budget. A maintained running total would make this O(1).

P-LOW-3: _enforce_hot_budget() called on every hot-tier store, even when budget is not threatened

File: src/cleveragents/application/services/context_tiers.py:136-137

Could short-circuit with a running total check before entering the eviction path.

S-LOW-1: Thread safety not documented for singleton service

File: src/cleveragents/application/container.py:611

ContextTierService is a DI Singleton with plain dict stores. Multi-step operations (pop-from-one-dict, add-to-another) are not atomic. While this is an existing design limitation (not introduced by this PR), the new runtime logic (auto-promotion, staleness enforcement) increases the surface area for potential race conditions. A docstring note about single-threaded usage expectations would be valuable.


Summary Table

Severity Bug Performance Test Coverage Test Flaw Spec Alignment Design Total
Critical 1 1
High 2 1 3
Medium 2 2 2 1 7
Low 3 2 1 2 8
Total 5 4 2 4 2 2 17

Recommendation: The Critical and High items (B-CRIT-1, B-HIGH-1, B-HIGH-2, P-HIGH-1) should be addressed before merge. The Medium items are recommended. The Low items are suggestions for incremental improvement.


Review performed on commit e749c5a — 4 global review cycles across bugs, test coverage, test flaws, performance, security, and specification alignment.

# Code Review Report — PR #1150 (Issue #821) **fix(acms): implement context tier runtime promotion/demotion/eviction** Review scope: all code changes in branch `bugfix/m5-context-tier-runtime` plus close connections to surrounding code. Review method: 4 global cycles across all categories (bugs, test coverage, test flaws, performance, security, specification alignment). **Findings: 17 total** — 1 Critical, 3 High, 5 Medium, 8 Low --- ## CRITICAL (1) ### B-CRIT-1: Data loss — fragment lost during warm-to-hot promotion under budget pressure **File:** `src/cleveragents/application/services/context_tiers.py:270-281` `promote()` pops the fragment from the warm tier (line 271), copies it into the hot tier (line 273), then calls `_enforce_hot_budget()` (line 274). If the just-promoted fragment has the **oldest** `last_accessed` timestamp among all hot-tier entries, `_enforce_hot_budget()` evicts it from hot. Since it was already removed from warm via `pop()`, the fragment is now **gone from ALL tiers** — silent data loss. Additionally, the event ordering is incorrect: `TIER_EVICTED` fires inside `_enforce_hot_budget()` **before** `TIER_PROMOTED` is emitted on line 275, so subscribers see an eviction for a fragment that was never reported as promoted. ```python # context_tiers.py:270-281 — problematic sequence if fragment_id in self._warm: frag = self._warm.pop(fragment_id) # 1. Removed from warm promoted = frag.model_copy(update={"tier": ContextTier.HOT}) self._hot[fragment_id] = promoted # 2. Added to hot self._enforce_hot_budget() # 3. May evict `promoted` from hot self._emit_tier_event( # 4. Emits PROMOTED (too late) EventType.TIER_PROMOTED, ... ) return promoted # 5. Returns detached object ``` **Suggested fix:** Emit `TIER_PROMOTED` *before* calling `_enforce_hot_budget()`, and either (a) exclude the just-promoted fragment from budget eviction, or (b) if it must be evicted, re-insert it into the warm tier instead of discarding it. --- ## HIGH (3) ### B-HIGH-1: Self-eviction — fragment immediately evicted when its token count exceeds entire hot budget **File:** `src/cleveragents/application/services/context_tiers.py:135-137` `store()` adds a fragment to the hot tier, then calls `_enforce_hot_budget()`. If `fragment.token_count > max_tokens_hot`, the eviction loop drains every hot fragment (including the just-stored one), leaving the hot tier empty. The caller receives no error — the fragment simply vanishes after being "stored." **Suggested fix:** Before entering the eviction loop, check if the incoming fragment alone exceeds the budget. If so, either reject the store with a `ValueError`, log a warning and redirect to warm tier, or document this as expected behavior. --- ### B-HIGH-2: Warm and cold tier budget limits defined in `TierBudget` but never enforced **File:** `src/cleveragents/application/services/context_tiers.py:114-141` `TierBudget` defines `max_decisions_warm=500` and `max_decisions_cold=5000`, and the spec defines `context.warm.max-decisions` (spec line 30536) and `context.cold.max-decisions` (spec line 30537). However, `store()` only enforces `max_tokens_hot`. The warm and cold tiers grow without bound, regardless of the configured budget limits. **Suggested fix:** Add `_enforce_warm_budget()` and `_enforce_cold_budget()` calls (or at minimum document that these limits are not yet enforced and file a follow-up issue). --- ### P-HIGH-1: `_enforce_hot_budget()` eviction loop is O(n*k), not O(n) as claimed in the commit message **File:** `src/cleveragents/application/services/context_tiers.py:535-564` The commit message states *"H1: Replaced O(n^2) budget enforcement loop with incremental token subtraction (O(n) total)."* While the token tracking is indeed O(n+k) via incremental subtraction, the `min()` call on **each** eviction iteration (line 546-549) scans all remaining hot-tier entries — O(n) per call. For k evictions from n fragments, total cost is **O(n*k)**, which degrades to O(n^2/2) when k approaches n. ```python while total_tokens > budget_tokens and self._hot: oldest_id = min( # <-- O(n) per iteration self._hot.keys(), key=lambda fid: self._hot[fid].last_accessed, ) ``` **Suggested fix:** Either correct the complexity claim in the commit message / code comments, or replace the plain dict with an `OrderedDict` / min-heap for O(1) / O(log n) eviction. --- ## MEDIUM (5) ### B-MED-1: No error handling in `_emit_tier_event()` — event bus exception breaks tier operation **File:** `src/cleveragents/application/services/context_tiers.py:566-585` If `self._event_bus.emit()` raises an exception (e.g., subscriber error, closed bus), it propagates up uncaught and aborts the tier operation in progress. This is inconsistent with the "best-effort" event pattern used elsewhere in the codebase (e.g., the CLI facade notifications use `contextlib.suppress`). **Suggested fix:** Wrap the `emit()` call in `try/except Exception` with a warning-level log. --- ### B-MED-2: Event ordering — `TIER_EVICTED` fires before `TIER_PROMOTED` during budget-triggered eviction **File:** `src/cleveragents/application/services/context_tiers.py:270-281` Even when the just-promoted fragment is *not* evicted, if budget enforcement evicts *other* hot-tier fragments, those `TIER_EVICTED` events fire before the `TIER_PROMOTED` event. A subscriber processing events in order sees evictions before the promotion that caused them, making causal reasoning about the event stream incorrect. **Suggested fix:** Emit `TIER_PROMOTED` before calling `_enforce_hot_budget()`. --- ### T-MED-1: Missing test — single fragment exceeding entire hot budget No Behave or Robot scenario tests storing a fragment with `token_count > max_tokens_hot`. This is the trigger condition for **B-HIGH-1**. --- ### T-MED-2: Missing test — warm-to-hot promotion causing self-eviction No test verifies the behavior when promoting a warm fragment to hot triggers budget eviction of the just-promoted fragment itself. This is the trigger condition for **B-CRIT-1**. --- ### T-MED-3: Chain-promotion guard test is incomplete **File:** `features/context_tier_runtime.feature:34-39` The scenario *"Promoted fragment requires fresh accesses for next promotion"* verifies that 1 extra access after first promotion does **not** chain-promote. However, it never verifies that 5 *new* accesses (after the counter reset) **do** successfully promote to hot. This makes it a one-sided test of the guard. **Suggested fix:** Extend the scenario to also verify that 5 fresh accesses complete the cold-to-warm-to-hot promotion chain. --- ### T-MED-4: Event detail assertions are incomplete **File:** `features/context_tier_runtime.feature:86-102` Event assertion steps verify `event_type` and `fragment_id` in the event details, but do not validate `from_tier` and `to_tier` values. A bug that emits incorrect tier transition metadata would go undetected. --- ### S-MED-1: Hot-tier TTL setting not defined in specification **File:** `src/cleveragents/config/settings.py:287-291` The specification defines `context.tiers.warm.retention-hours` (spec line 30555) but has **no equivalent for hot-tier TTL**. The setting `context_tier_hot_ttl_hours` was added based on the issue description, creating a divergence between the spec and the implementation. This should either be added to the spec or documented as an implementation extension. --- ## LOW (8) ### D-LOW-1: Robot helper `_COMMANDS` dict uses `dict[str, object]` type **File:** `robot/helper_context_tier_runtime.py:137` Uses `dict[str, object]` followed by `assert callable(cmd)` at runtime. Should be `dict[str, Callable[[], None]]` for static type safety. ### D-LOW-2: No `__all__` export in `context_tiers.py` **File:** `src/cleveragents/application/services/context_tiers.py` The module doesn't define `__all__`, so wildcard imports would export internal helpers (`_budget_from_settings`, `_DEFAULT_PROMOTION_THRESHOLD`, etc.). ### T-LOW-1: Tests directly access private members **Files:** `features/steps/context_tier_runtime_steps.py`, `benchmarks/bench_context_tier_runtime.py` Tests use `svc._find_fragment()`, `svc._hot`, `svc._budget = TierBudget(...)`. This creates tight coupling to internal implementation details, making tests fragile on refactoring. ### T-LOW-2: Robot helper duplicates Behave step logic **Files:** `robot/helper_context_tier_runtime.py` vs `features/steps/context_tier_runtime_steps.py` Both files contain similar setup/assertion logic (fragment creation, service configuration, tier validation) with no shared abstractions. ### P-LOW-1: No indexed data structure for LRU eviction **File:** `src/cleveragents/application/services/context_tiers.py:546-549` Hot tier uses a plain `dict` with `min()` for LRU selection. An `OrderedDict` or min-heap would improve eviction from O(n) to O(1) or O(log n) per eviction. ### P-LOW-2: Budget enforcement always computes full sum **File:** `src/cleveragents/application/services/context_tiers.py:543` `_enforce_hot_budget()` always computes `sum(f.token_count for f in self._hot.values())` even when the hot tier is well under budget. A maintained running total would make this O(1). ### P-LOW-3: `_enforce_hot_budget()` called on every hot-tier store, even when budget is not threatened **File:** `src/cleveragents/application/services/context_tiers.py:136-137` Could short-circuit with a running total check before entering the eviction path. ### S-LOW-1: Thread safety not documented for singleton service **File:** `src/cleveragents/application/container.py:611` `ContextTierService` is a DI Singleton with plain `dict` stores. Multi-step operations (pop-from-one-dict, add-to-another) are not atomic. While this is an existing design limitation (not introduced by this PR), the new runtime logic (auto-promotion, staleness enforcement) increases the surface area for potential race conditions. A docstring note about single-threaded usage expectations would be valuable. --- ## Summary Table | Severity | Bug | Performance | Test Coverage | Test Flaw | Spec Alignment | Design | Total | |----------|-----|-------------|---------------|-----------|----------------|--------|-------| | Critical | 1 | — | — | — | — | — | **1** | | High | 2 | 1 | — | — | — | — | **3** | | Medium | 2 | — | 2 | 2 | 1 | — | **7** | | Low | — | 3 | — | 2 | 1 | 2 | **8** | | **Total** | **5** | **4** | **2** | **4** | **2** | **2** | **17** | **Recommendation:** The Critical and High items (B-CRIT-1, B-HIGH-1, B-HIGH-2, P-HIGH-1) should be addressed before merge. The Medium items are recommended. The Low items are suggestions for incremental improvement. --- *Review performed on commit `e749c5a` — 4 global review cycles across bugs, test coverage, test flaws, performance, security, and specification alignment.*
CoreRasurae force-pushed bugfix/m5-context-tier-runtime from e749c5a817
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 21s
CI / lint (pull_request) Successful in 3m16s
CI / quality (pull_request) Successful in 3m41s
CI / security (pull_request) Successful in 3m56s
CI / typecheck (pull_request) Successful in 4m28s
CI / coverage (pull_request) Successful in 10m0s
CI / integration_tests (pull_request) Failing after 20m19s
CI / unit_tests (pull_request) Successful in 24m18s
CI / e2e_tests (pull_request) Successful in 24m20s
CI / docker (pull_request) Successful in 1m6s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 51m23s
to ccbd8e0962
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 25s
CI / lint (pull_request) Successful in 3m20s
CI / quality (pull_request) Successful in 3m52s
CI / typecheck (pull_request) Successful in 3m56s
CI / security (pull_request) Successful in 4m3s
CI / e2e_tests (pull_request) Failing after 14m28s
CI / integration_tests (pull_request) Failing after 14m28s
CI / coverage (pull_request) Failing after 15m31s
CI / unit_tests (pull_request) Failing after 19m28s
CI / benchmark-regression (pull_request) Successful in 58m51s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 1s
2026-03-24 19:50:05 +00:00
Compare
CoreRasurae left a comment

Code Review Report — PR #1150 (bugfix/m5-context-tier-runtime)

Issue: #821 — context tier service has no runtime promotion/demotion/eviction logic
Branch: bugfix/m5-context-tier-runtime
Commit: ccbd8e09
Reviewer scope: All code changes in this branch plus direct surrounding code (domain models, EventBus protocol, ScopedTierMixin, Settings)

Review Methodology

Three full review cycles were performed across all categories (bugs, test coverage, test flaws, performance, security, specification compliance), with each cycle deepening the analysis. Static analysis (Pyright, Ruff) and specification cross-referencing were also conducted.

Static analysis results: Pyright — 0 errors. Ruff — 0 violations. No security findings.


Findings Summary

Severity Bug Test Spec Total
Medium 2 2 0 4
Low 0 2 2 4
Total 2 4 2 8

No critical or high severity issues found. No blocking issues.


MEDIUM Severity

B-MED-1: Missing event emission for oversized fragment redirect in store()

File: src/cleveragents/application/services/context_tiers.py:148-160

When store() redirects an oversized fragment from hot to warm (because token_count > max_tokens_hot), no tier transition event is emitted. Only a logger.warning("tier.oversized_fragment_redirected", ...) captures this.

if fragment.token_count > self._budget.max_tokens_hot:
    logger.warning("tier.oversized_fragment_redirected", ...)
    redirected = fragment.model_copy(update={"tier": ContextTier.WARM})
    self._warm[fragment.fragment_id] = redirected
    # No _emit_tier_event() call here

Impact: Observability gap. Operators and event-driven consumers (dashboards, alerting) monitoring TIER_* events will miss these silent tier redirections. Every other tier transition in the service emits an event.

Suggestion: Emit a TIER_DEMOTED event with an additional detail field (e.g., "reason": "oversized_redirect") to distinguish it from standard demotions, or consider a dedicated event type.


B-MED-2: Budget eviction in _enforce_hot_budget() permanently deletes fragments instead of demoting to warm

File: src/cleveragents/application/services/context_tiers.py:590-610

When hot-tier budget is exceeded, _enforce_hot_budget() permanently deletes evicted fragments via del self._hot[oldest_id]. The evicted fragments are not demoted to the warm tier — they are permanently lost.

The B-CRIT-1 fix correctly protects the just-promoted fragment (restoring it to warm if budget-evicted). However, other hot-tier residents evicted to make room are destroyed.

while total_tokens > budget_tokens and self._hot:
    oldest_id = min(self._hot.keys(), key=lambda fid: self._hot[fid].last_accessed)
    evicted_tokens = self._hot[oldest_id].token_count
    del self._hot[oldest_id]          # Permanent deletion — not demoted to warm
    total_tokens -= evicted_tokens
    self._emit_tier_event(EventType.TIER_EVICTED, oldest_id, ...)

The specification (line 44569, §Three Storage Tiers with Temporal Alignment, and lines 25704-25713, §Plan Lifecycle ACMS Actions) describes a downward lifecycle flow: "Hot context archived to warm. Warm context ages to cold based on retention policy." This suggests fragments should flow down the tiers rather than be destroyed.

Impact: Context data loss in busy hot tiers. Unlike the pre-existing evict_lru() (which was only called manually), this eviction is now automatic — triggered by every store() and promote() — increasing the risk surface.

Suggestion: Consider demoting evicted fragments to warm (via self.demote(oldest_id)) instead of deleting. This would preserve the spec's downward lifecycle and prevent silent context loss. If permanent deletion is intentional, document this design decision explicitly.


T-MED-1: evict_lru() event emission is untested

File: src/cleveragents/application/services/context_tiers.py:385-392 (new code)

The PR adds _emit_tier_event(EventType.TIER_EVICTED, ...) inside evict_lru(). No test exercises this code path.

The Behave scenario "Budget eviction emits TIER_EVICTED event" tests event emission via store()_enforce_hot_budget(), which has its own separate event emission code at line 599 — it never calls evict_lru(). These are two independent eviction paths with duplicated event logic.

# evict_lru() — line 385 (UNTESTED event emission)
for fid in to_evict:
    del store[fid]
    self._emit_tier_event(EventType.TIER_EVICTED, fid, from_tier=tier, to_tier=None)

# _enforce_hot_budget() — line 599 (tested via store scenario)
del self._hot[oldest_id]
self._emit_tier_event(EventType.TIER_EVICTED, oldest_id, from_tier=ContextTier.HOT, to_tier=None)

Impact: The event emission in evict_lru() could silently break without any test catching it. Since evict_lru() is a public method, this is a coverage gap for new code in the PR.

Suggestion: Add a scenario (or extend an existing one) that calls evict_lru() directly with an event bus and asserts TIER_EVICTED events are emitted.


T-MED-2: Robot helper uses weak assertion for auto-promotion result

File: robot/helper_context_tier_runtime.py:89-91

cmd_auto_promote() asserts result.tier != ContextTier.COLD instead of result.tier == ContextTier.WARM:

assert result.tier != ContextTier.COLD, (
    f"Fragment still in cold after 5 accesses: tier={result.tier}"
)

After 5 accesses from cold with default promotion_threshold=5, the fragment should be in WARM (not HOT — the access counter resets after each promotion, preventing chain promotion). A hypothetical bug that incorrectly chain-promoted cold→warm→hot in a single threshold cycle would pass this assertion.

Impact: The assertion wouldn't catch a regression in the counter-reset logic (the C1 review fix).

Suggestion: Change to assert result.tier == ContextTier.WARM.


LOW Severity

T-LOW-1: Tests rely extensively on private API

Files: features/steps/context_tier_runtime_steps.py, robot/helper_context_tier_runtime.py

Tests use svc._find_fragment(), svc._hot.values(), and svc._budget = TierBudget(...) throughout. This is pragmatic — there's no public API to inspect tier placement without triggering get() side effects — but it couples tests to implementation details.

Impact: Minor. Tests may break on internal refactoring that preserves public behavior.

Suggestion: Acknowledged as a known trade-off. If a public inspect(fragment_id) or tier_of(fragment_id) method is ever added, tests should migrate to it.


T-LOW-2: Missing edge case test coverage for new public methods

Several edge cases in the new code are untested:

Untested path Expected behavior
get("nonexistent") → returns None Simple path, low risk
promote() on hot-tier fragment → returns None Already at top tier
promote() on nonexistent fragment → returns None Fragment not found
demote() on cold-tier fragment → returns None Already at bottom tier
demote() on nonexistent fragment → returns None Fragment not found
get() on hot-tier fragment Basic touch-and-return (no auto-promote)
enforce_staleness() with mixed stale hot AND stale warm Double-demotion prevention in a single pass

Impact: Low — these are simple code paths unlikely to break. But adding them would strengthen the test suite and improve coverage confidence.


S-LOW-1: Missing CLEVERAGENTS_CTX_HOT_HOURS env var alias for hot TTL (consistency)

File: src/cleveragents/config/settings.py:287-290

The context_tier_warm_ttl_hours field correctly includes a CLEVERAGENTS_CTX_WARM_HOURS alias matching the spec (line 30555, M1 review fix). However, context_tier_hot_ttl_hours only has the auto-generated CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS:

context_tier_hot_ttl_hours: int = Field(
    default=24, ge=1,
    validation_alias=AliasChoices("CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS"),
    # No short alias like CLEVERAGENTS_CTX_HOT_HOURS
)

context_tier_warm_ttl_hours: int = Field(
    default=24, ge=1,
    validation_alias=AliasChoices(
        "CLEVERAGENTS_CONTEXT_TIER_WARM_TTL_HOURS",
        "CLEVERAGENTS_CTX_WARM_HOURS",     # Spec alias present
    ),
)

Impact: Minor inconsistency. Users who learn the CTX_WARM_HOURS short form may expect CTX_HOT_HOURS to also work.

Suggestion: Add "CLEVERAGENTS_CTX_HOT_HOURS" as a second alias for context_tier_hot_ttl_hours.


S-LOW-2: Spec ambiguity — hot tier TTL vs "Until resource removed"

Files: src/cleveragents/application/services/context_tiers.py:421-433, docs/specification.md line 44569

The specification's tier table states hot tier retention is "Until resource removed", implying hot fragments persist indefinitely. The implementation introduces context_tier_hot_ttl_hours (default 24h) with enforce_staleness() that time-demotes hot fragments, which could appear to conflict.

This is reconciled by interpreting:

  • Spec "retention": data persists in some tier (the fragment is demoted, not deleted)
  • Implementation "TTL": application-level staleness policy for tier placement

Issue #821 explicitly requires hot TTL (acms.tier.hot_ttl), so this is by design.

Impact: Documentation clarity only. No functional issue.

Suggestion: Add a brief docstring note in enforce_staleness() explaining this reconciliation: hot TTL governs tier placement (demotion), not data retention (deletion), consistent with the spec's "Until resource removed" language.


Positive Observations

  • B-CRIT-1 fix is well-designed: the promote-then-check-then-restore pattern in promote() correctly prevents silent data loss on warm→hot promotion with tight budgets.
  • Double-demotion prevention via warm_before snapshot in enforce_staleness() is correct and well-documented.
  • Best-effort event emission with try/except in _emit_tier_event() properly isolates event bus failures from core tier operations.
  • Counter reset after auto-promotion correctly prevents chain promotion (cold→warm→hot in a single threshold cycle).
  • Comprehensive BDD coverage: 16 feature scenarios covering all major paths (promotion, demotion, staleness, budget, events, failing bus).
  • Static analysis: Pyright 0 errors, Ruff 0 violations across all changed files.
## Code Review Report — PR #1150 (`bugfix/m5-context-tier-runtime`) **Issue**: #821 — context tier service has no runtime promotion/demotion/eviction logic **Branch**: `bugfix/m5-context-tier-runtime` **Commit**: `ccbd8e09` **Reviewer scope**: All code changes in this branch plus direct surrounding code (domain models, EventBus protocol, ScopedTierMixin, Settings) ### Review Methodology Three full review cycles were performed across all categories (bugs, test coverage, test flaws, performance, security, specification compliance), with each cycle deepening the analysis. Static analysis (Pyright, Ruff) and specification cross-referencing were also conducted. **Static analysis results**: Pyright — 0 errors. Ruff — 0 violations. No security findings. --- ## Findings Summary | Severity | Bug | Test | Spec | Total | |----------|-----|------|------|-------| | **Medium** | 2 | 2 | 0 | **4** | | **Low** | 0 | 2 | 2 | **4** | | **Total** | **2** | **4** | **2** | **8** | No critical or high severity issues found. No blocking issues. --- ## MEDIUM Severity ### B-MED-1: Missing event emission for oversized fragment redirect in `store()` **File**: `src/cleveragents/application/services/context_tiers.py:148-160` When `store()` redirects an oversized fragment from hot to warm (because `token_count > max_tokens_hot`), no tier transition event is emitted. Only a `logger.warning("tier.oversized_fragment_redirected", ...)` captures this. ```python if fragment.token_count > self._budget.max_tokens_hot: logger.warning("tier.oversized_fragment_redirected", ...) redirected = fragment.model_copy(update={"tier": ContextTier.WARM}) self._warm[fragment.fragment_id] = redirected # No _emit_tier_event() call here ``` **Impact**: Observability gap. Operators and event-driven consumers (dashboards, alerting) monitoring `TIER_*` events will miss these silent tier redirections. Every other tier transition in the service emits an event. **Suggestion**: Emit a `TIER_DEMOTED` event with an additional detail field (e.g., `"reason": "oversized_redirect"`) to distinguish it from standard demotions, or consider a dedicated event type. --- ### B-MED-2: Budget eviction in `_enforce_hot_budget()` permanently deletes fragments instead of demoting to warm **File**: `src/cleveragents/application/services/context_tiers.py:590-610` When hot-tier budget is exceeded, `_enforce_hot_budget()` permanently deletes evicted fragments via `del self._hot[oldest_id]`. The evicted fragments are not demoted to the warm tier — they are permanently lost. The B-CRIT-1 fix correctly protects the *just-promoted* fragment (restoring it to warm if budget-evicted). However, **other** hot-tier residents evicted to make room are destroyed. ```python while total_tokens > budget_tokens and self._hot: oldest_id = min(self._hot.keys(), key=lambda fid: self._hot[fid].last_accessed) evicted_tokens = self._hot[oldest_id].token_count del self._hot[oldest_id] # Permanent deletion — not demoted to warm total_tokens -= evicted_tokens self._emit_tier_event(EventType.TIER_EVICTED, oldest_id, ...) ``` The specification (line 44569, §Three Storage Tiers with Temporal Alignment, and lines 25704-25713, §Plan Lifecycle ACMS Actions) describes a downward lifecycle flow: *"Hot context archived to warm. Warm context ages to cold based on retention policy."* This suggests fragments should flow down the tiers rather than be destroyed. **Impact**: Context data loss in busy hot tiers. Unlike the pre-existing `evict_lru()` (which was only called manually), this eviction is now automatic — triggered by every `store()` and `promote()` — increasing the risk surface. **Suggestion**: Consider demoting evicted fragments to warm (via `self.demote(oldest_id)`) instead of deleting. This would preserve the spec's downward lifecycle and prevent silent context loss. If permanent deletion is intentional, document this design decision explicitly. --- ### T-MED-1: `evict_lru()` event emission is untested **File**: `src/cleveragents/application/services/context_tiers.py:385-392` (new code) The PR adds `_emit_tier_event(EventType.TIER_EVICTED, ...)` inside `evict_lru()`. No test exercises this code path. The Behave scenario *"Budget eviction emits TIER_EVICTED event"* tests event emission via `store()` → `_enforce_hot_budget()`, which has its **own separate** event emission code at line 599 — it never calls `evict_lru()`. These are two independent eviction paths with duplicated event logic. ```python # evict_lru() — line 385 (UNTESTED event emission) for fid in to_evict: del store[fid] self._emit_tier_event(EventType.TIER_EVICTED, fid, from_tier=tier, to_tier=None) # _enforce_hot_budget() — line 599 (tested via store scenario) del self._hot[oldest_id] self._emit_tier_event(EventType.TIER_EVICTED, oldest_id, from_tier=ContextTier.HOT, to_tier=None) ``` **Impact**: The event emission in `evict_lru()` could silently break without any test catching it. Since `evict_lru()` is a public method, this is a coverage gap for new code in the PR. **Suggestion**: Add a scenario (or extend an existing one) that calls `evict_lru()` directly with an event bus and asserts `TIER_EVICTED` events are emitted. --- ### T-MED-2: Robot helper uses weak assertion for auto-promotion result **File**: `robot/helper_context_tier_runtime.py:89-91` `cmd_auto_promote()` asserts `result.tier != ContextTier.COLD` instead of `result.tier == ContextTier.WARM`: ```python assert result.tier != ContextTier.COLD, ( f"Fragment still in cold after 5 accesses: tier={result.tier}" ) ``` After 5 accesses from cold with default `promotion_threshold=5`, the fragment should be in **WARM** (not HOT — the access counter resets after each promotion, preventing chain promotion). A hypothetical bug that incorrectly chain-promoted cold→warm→hot in a single threshold cycle would **pass** this assertion. **Impact**: The assertion wouldn't catch a regression in the counter-reset logic (the C1 review fix). **Suggestion**: Change to `assert result.tier == ContextTier.WARM`. --- ## LOW Severity ### T-LOW-1: Tests rely extensively on private API **Files**: `features/steps/context_tier_runtime_steps.py`, `robot/helper_context_tier_runtime.py` Tests use `svc._find_fragment()`, `svc._hot.values()`, and `svc._budget = TierBudget(...)` throughout. This is pragmatic — there's no public API to inspect tier placement without triggering `get()` side effects — but it couples tests to implementation details. **Impact**: Minor. Tests may break on internal refactoring that preserves public behavior. **Suggestion**: Acknowledged as a known trade-off. If a public `inspect(fragment_id)` or `tier_of(fragment_id)` method is ever added, tests should migrate to it. --- ### T-LOW-2: Missing edge case test coverage for new public methods Several edge cases in the new code are untested: | Untested path | Expected behavior | |---|---| | `get("nonexistent")` → returns `None` | Simple path, low risk | | `promote()` on hot-tier fragment → returns `None` | Already at top tier | | `promote()` on nonexistent fragment → returns `None` | Fragment not found | | `demote()` on cold-tier fragment → returns `None` | Already at bottom tier | | `demote()` on nonexistent fragment → returns `None` | Fragment not found | | `get()` on hot-tier fragment | Basic touch-and-return (no auto-promote) | | `enforce_staleness()` with mixed stale hot AND stale warm | Double-demotion prevention in a single pass | **Impact**: Low — these are simple code paths unlikely to break. But adding them would strengthen the test suite and improve coverage confidence. --- ### S-LOW-1: Missing `CLEVERAGENTS_CTX_HOT_HOURS` env var alias for hot TTL (consistency) **File**: `src/cleveragents/config/settings.py:287-290` The `context_tier_warm_ttl_hours` field correctly includes a `CLEVERAGENTS_CTX_WARM_HOURS` alias matching the spec (line 30555, M1 review fix). However, `context_tier_hot_ttl_hours` only has the auto-generated `CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS`: ```python context_tier_hot_ttl_hours: int = Field( default=24, ge=1, validation_alias=AliasChoices("CLEVERAGENTS_CONTEXT_TIER_HOT_TTL_HOURS"), # No short alias like CLEVERAGENTS_CTX_HOT_HOURS ) context_tier_warm_ttl_hours: int = Field( default=24, ge=1, validation_alias=AliasChoices( "CLEVERAGENTS_CONTEXT_TIER_WARM_TTL_HOURS", "CLEVERAGENTS_CTX_WARM_HOURS", # Spec alias present ), ) ``` **Impact**: Minor inconsistency. Users who learn the `CTX_WARM_HOURS` short form may expect `CTX_HOT_HOURS` to also work. **Suggestion**: Add `"CLEVERAGENTS_CTX_HOT_HOURS"` as a second alias for `context_tier_hot_ttl_hours`. --- ### S-LOW-2: Spec ambiguity — hot tier TTL vs "Until resource removed" **Files**: `src/cleveragents/application/services/context_tiers.py:421-433`, `docs/specification.md` line 44569 The specification's tier table states hot tier retention is *"Until resource removed"*, implying hot fragments persist indefinitely. The implementation introduces `context_tier_hot_ttl_hours` (default 24h) with `enforce_staleness()` that time-demotes hot fragments, which could appear to conflict. This is reconciled by interpreting: - **Spec "retention"**: data persists in *some* tier (the fragment is demoted, not deleted) - **Implementation "TTL"**: application-level staleness policy for tier *placement* Issue #821 explicitly requires hot TTL (`acms.tier.hot_ttl`), so this is by design. **Impact**: Documentation clarity only. No functional issue. **Suggestion**: Add a brief docstring note in `enforce_staleness()` explaining this reconciliation: hot TTL governs tier placement (demotion), not data retention (deletion), consistent with the spec's "Until resource removed" language. --- ## Positive Observations - **B-CRIT-1 fix** is well-designed: the promote-then-check-then-restore pattern in `promote()` correctly prevents silent data loss on warm→hot promotion with tight budgets. - **Double-demotion prevention** via `warm_before` snapshot in `enforce_staleness()` is correct and well-documented. - **Best-effort event emission** with try/except in `_emit_tier_event()` properly isolates event bus failures from core tier operations. - **Counter reset** after auto-promotion correctly prevents chain promotion (cold→warm→hot in a single threshold cycle). - **Comprehensive BDD coverage**: 16 feature scenarios covering all major paths (promotion, demotion, staleness, budget, events, failing bus). - **Static analysis**: Pyright 0 errors, Ruff 0 violations across all changed files.
CoreRasurae force-pushed bugfix/m5-context-tier-runtime from ccbd8e0962
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 25s
CI / lint (pull_request) Successful in 3m20s
CI / quality (pull_request) Successful in 3m52s
CI / typecheck (pull_request) Successful in 3m56s
CI / security (pull_request) Successful in 4m3s
CI / e2e_tests (pull_request) Failing after 14m28s
CI / integration_tests (pull_request) Failing after 14m28s
CI / coverage (pull_request) Failing after 15m31s
CI / unit_tests (pull_request) Failing after 19m28s
CI / benchmark-regression (pull_request) Successful in 58m51s
CI / docker (pull_request) Has been skipped
CI / status-check (pull_request) Failing after 1s
to 275641d642
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Failing after 12s
CI / lint (pull_request) Successful in 3m18s
CI / quality (pull_request) Successful in 3m40s
CI / typecheck (pull_request) Successful in 3m52s
CI / security (pull_request) Successful in 4m2s
CI / e2e_tests (pull_request) Failing after 4m41s
CI / integration_tests (pull_request) Successful in 6m2s
CI / unit_tests (pull_request) Successful in 7m10s
CI / docker (pull_request) Successful in 56s
CI / coverage (pull_request) Successful in 11m10s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 1h1m51s
2026-03-25 10:58:19 +00:00
Compare
CoreRasurae left a comment

Code Review Report — PR #1150 (Fourth Review Round)

Issue: #821 — context tier service has no runtime promotion/demotion/eviction logic
Branch: bugfix/m5-context-tier-runtime
Commit: 275641d (HEAD)
Review scope: All code changes in the branch plus close connections to surrounding code (domain models, EventBus protocol, ScopedTierMixin, Settings, specification)

Review Methodology

Four global review cycles were performed across all categories (bugs, test coverage, test flaws, performance, security, specification compliance) across all 12 changed files plus surrounding code. Previous review rounds (freemo's APPROVED review, two self-review reports, and the resolution report) were cross-referenced to avoid duplicating already-addressed or explicitly-deferred findings.

Note: Many issues from prior rounds were correctly fixed (C1 chain-promotion, C2 budget enforcement in promote, B-CRIT-1 data loss fallback, B-HIGH-1 oversized redirect, B-MED-1/2 event ordering and try/except, D-LOW-1/2 type hints and __all__, S-LOW-1/2 env var alias and docstring). Several others were explicitly deferred with valid justification (H2 thread-safety, H4 warm/cold budget, H5 cold-tier retention). This report contains only genuinely new findings not previously raised.


Findings Summary

Severity Bug Test Coverage Total
High 1 0 1
Medium 0 2 2
Low 1 0 1
Total 2 2 4

No critical findings. No blocking issues.


HIGH (1)

B-HIGH-1: access_count preserved through demotion enables immediate re-promotion after staleness enforcement

File: src/cleveragents/application/services/context_tiers.pydemote() (lines 339-364) and _maybe_auto_promote() (lines 550-574)

Note: This is distinct from the previously-raised M2 (which concerned last_accessed timestamp preservation). This finding concerns access_count preservation.

When enforce_staleness() demotes a fragment (e.g., hot → warm), demote() creates a model_copy() that preserves ALL fields except tier — including access_count. If the fragment had accumulated access_count >= promotion_threshold (default: 5) during its time in the hot tier, the very next get() call will:

  1. _touch() increments access_count (e.g., 100 → 101)
  2. _maybe_auto_promote() checks: 101 >= 5immediately re-promotes to hot
  3. access_count is reset to 0 (per the C1 fix)

Consequence: Staleness enforcement is effectively a no-op for any fragment that has been accessed more than promotion_threshold times in its lifetime. A single access after demotion undoes the staleness enforcement entirely.

Trace:

Fragment in HOT, access_count=100 (accumulated over time)
enforce_staleness() → demote() → WARM, access_count=100 (preserved)
get() → _touch() → access_count=101
_maybe_auto_promote(): 101 >= 5 → promote() → HOT, access_count=0

Suggested fix: Reset access_count to 0 in demote() (or in enforce_staleness() after calling demote()), so that demoted fragments must accumulate fresh accesses to earn re-promotion — mirroring the counter-reset logic already applied on promotion:

def demote(self, fragment_id: str) -> TieredFragment | None:
    if fragment_id in self._hot:
        frag = self._hot.pop(fragment_id)
        demoted = frag.model_copy(update={"tier": ContextTier.WARM, "access_count": 0})
        self._warm[fragment_id] = demoted
        ...

MEDIUM (2)

T-MED-1: No test for staleness demotion followed by immediate re-promotion (B-HIGH-1 scenario)

File: features/context_tier_runtime.feature

No Behave scenario exercises the following sequence:

  1. Fragment accumulates many accesses in a tier (access_count >> threshold)
  2. Fragment is demoted via enforce_staleness()
  3. Fragment is accessed via get() immediately after demotion
  4. Assert: fragment should NOT be immediately re-promoted (or assert current behavior if by design)

The existing scenario "Promoted fragment requires fresh accesses for next promotion" only tests the cold→warm→hot chain with counter reset on upward transitions. It does not test what happens to access_count across a downward transition (demotion).

Suggested fix: Add a scenario such as:

Scenario: Demoted fragment requires fresh accesses before re-promotion
  Given a context tier service with default settings
  And a fragment "frag-pop" stored in the hot tier with 50 tokens
  When I access "frag-pop" via get 10 times
  And I invoke enforce_staleness on the service with a very short hot TTL
  And I access "frag-pop" via get 1 times
  Then fragment "frag-pop" should be in the warm tier

T-MED-2: No test for minimum promotion threshold edge case (threshold=1)

File: src/cleveragents/config/settings.py:281-285, features/context_tier_runtime.feature

The context_tier_promotion_threshold setting has ge=1, meaning a threshold of 1 is valid. With threshold=1, every single get() call on a warm/cold fragment would trigger auto-promotion. No test verifies this edge case:

  • Does threshold=1 correctly promote on the very first access?
  • Does the counter reset still work correctly (preventing chain promotion in a single get)?

Suggested fix: Add a scenario with a custom threshold of 1 that verifies promotion on first access without chain-promoting through both tiers.


LOW (1)

B-LOW-1: Tier transition events lack project_name from the fragment

File: src/cleveragents/application/services/context_tiers.py:626-657

_emit_tier_event() creates a DomainEvent with only fragment_id, from_tier, and to_tier in the details dict. The DomainEvent model has a project_name field (line 62 of models.py), and TieredFragment carries project_name (used for scope enforcement). However, tier events don't populate project_name on the event itself.

For multi-project deployments, event consumers (dashboards, alerting, audit logs) cannot filter tier events by project without looking up the fragment separately.

Suggested fix: Pass project_name from the fragment to the DomainEvent constructor:

event = DomainEvent(
    event_type=event_type,
    project_name=fragment.project_name if fragment else None,
    details=details,
)

This would require _emit_tier_event() to accept the fragment (or project_name) as a parameter.


Positive Observations

  • The B-CRIT-1 fix (promote fallback to warm on budget eviction) is well-designed and correctly handles the edge case.
  • The double-demotion prevention via warm_before snapshot in enforce_staleness() is a subtle correctness detail handled well.
  • Best-effort event emission with try/except properly isolates event bus failures from core operations.
  • The _resolve() helper correctly re-fetches the fragment after possible auto-promotion to return the latest state.
  • Comprehensive test coverage across three frameworks (22 Behave scenarios, 4 Robot tests, 4 ASV benchmarks).
  • Clean static analysis: the code passes Pyright and Ruff with zero violations.

Review performed on commit 275641d — 4 global review cycles across bugs, test coverage, test flaws, performance, security, and specification alignment. Previous review findings cross-referenced to report only genuinely new issues.

## Code Review Report — PR #1150 (Fourth Review Round) **Issue**: #821 — context tier service has no runtime promotion/demotion/eviction logic **Branch**: `bugfix/m5-context-tier-runtime` **Commit**: `275641d` (HEAD) **Review scope**: All code changes in the branch plus close connections to surrounding code (domain models, EventBus protocol, ScopedTierMixin, Settings, specification) ### Review Methodology Four global review cycles were performed across all categories (bugs, test coverage, test flaws, performance, security, specification compliance) across all 12 changed files plus surrounding code. Previous review rounds (freemo's APPROVED review, two self-review reports, and the resolution report) were cross-referenced to avoid duplicating already-addressed or explicitly-deferred findings. **Note**: Many issues from prior rounds were correctly fixed (C1 chain-promotion, C2 budget enforcement in promote, B-CRIT-1 data loss fallback, B-HIGH-1 oversized redirect, B-MED-1/2 event ordering and try/except, D-LOW-1/2 type hints and `__all__`, S-LOW-1/2 env var alias and docstring). Several others were explicitly deferred with valid justification (H2 thread-safety, H4 warm/cold budget, H5 cold-tier retention). This report contains only **genuinely new** findings not previously raised. --- ## Findings Summary | Severity | Bug | Test Coverage | Total | |----------|-----|---------------|-------| | **High** | 1 | 0 | **1** | | **Medium** | 0 | 2 | **2** | | **Low** | 1 | 0 | **1** | | **Total** | **2** | **2** | **4** | No critical findings. No blocking issues. --- ## HIGH (1) ### B-HIGH-1: `access_count` preserved through demotion enables immediate re-promotion after staleness enforcement **File**: `src/cleveragents/application/services/context_tiers.py` — `demote()` (lines 339-364) and `_maybe_auto_promote()` (lines 550-574) **Note**: This is distinct from the previously-raised M2 (which concerned `last_accessed` timestamp preservation). This finding concerns `access_count` preservation. When `enforce_staleness()` demotes a fragment (e.g., hot → warm), `demote()` creates a `model_copy()` that preserves ALL fields except `tier` — including `access_count`. If the fragment had accumulated `access_count >= promotion_threshold` (default: 5) during its time in the hot tier, the very next `get()` call will: 1. `_touch()` increments `access_count` (e.g., 100 → 101) 2. `_maybe_auto_promote()` checks: `101 >= 5` → **immediately re-promotes** to hot 3. `access_count` is reset to 0 (per the C1 fix) **Consequence**: Staleness enforcement is effectively a no-op for any fragment that has been accessed more than `promotion_threshold` times in its lifetime. A single access after demotion undoes the staleness enforcement entirely. **Trace**: ``` Fragment in HOT, access_count=100 (accumulated over time) enforce_staleness() → demote() → WARM, access_count=100 (preserved) get() → _touch() → access_count=101 _maybe_auto_promote(): 101 >= 5 → promote() → HOT, access_count=0 ``` **Suggested fix**: Reset `access_count` to `0` in `demote()` (or in `enforce_staleness()` after calling `demote()`), so that demoted fragments must accumulate fresh accesses to earn re-promotion — mirroring the counter-reset logic already applied on promotion: ```python def demote(self, fragment_id: str) -> TieredFragment | None: if fragment_id in self._hot: frag = self._hot.pop(fragment_id) demoted = frag.model_copy(update={"tier": ContextTier.WARM, "access_count": 0}) self._warm[fragment_id] = demoted ... ``` --- ## MEDIUM (2) ### T-MED-1: No test for staleness demotion followed by immediate re-promotion (B-HIGH-1 scenario) **File**: `features/context_tier_runtime.feature` No Behave scenario exercises the following sequence: 1. Fragment accumulates many accesses in a tier (access_count >> threshold) 2. Fragment is demoted via `enforce_staleness()` 3. Fragment is accessed via `get()` immediately after demotion 4. Assert: fragment should NOT be immediately re-promoted (or assert current behavior if by design) The existing scenario "Promoted fragment requires fresh accesses for next promotion" only tests the cold→warm→hot chain with counter reset on **upward** transitions. It does not test what happens to `access_count` across a **downward** transition (demotion). **Suggested fix**: Add a scenario such as: ```gherkin Scenario: Demoted fragment requires fresh accesses before re-promotion Given a context tier service with default settings And a fragment "frag-pop" stored in the hot tier with 50 tokens When I access "frag-pop" via get 10 times And I invoke enforce_staleness on the service with a very short hot TTL And I access "frag-pop" via get 1 times Then fragment "frag-pop" should be in the warm tier ``` --- ### T-MED-2: No test for minimum promotion threshold edge case (threshold=1) **File**: `src/cleveragents/config/settings.py:281-285`, `features/context_tier_runtime.feature` The `context_tier_promotion_threshold` setting has `ge=1`, meaning a threshold of 1 is valid. With threshold=1, every single `get()` call on a warm/cold fragment would trigger auto-promotion. No test verifies this edge case: - Does threshold=1 correctly promote on the very first access? - Does the counter reset still work correctly (preventing chain promotion in a single get)? **Suggested fix**: Add a scenario with a custom threshold of 1 that verifies promotion on first access without chain-promoting through both tiers. --- ## LOW (1) ### B-LOW-1: Tier transition events lack `project_name` from the fragment **File**: `src/cleveragents/application/services/context_tiers.py:626-657` `_emit_tier_event()` creates a `DomainEvent` with only `fragment_id`, `from_tier`, and `to_tier` in the `details` dict. The `DomainEvent` model has a `project_name` field (line 62 of `models.py`), and `TieredFragment` carries `project_name` (used for scope enforcement). However, tier events don't populate `project_name` on the event itself. For multi-project deployments, event consumers (dashboards, alerting, audit logs) cannot filter tier events by project without looking up the fragment separately. **Suggested fix**: Pass `project_name` from the fragment to the `DomainEvent` constructor: ```python event = DomainEvent( event_type=event_type, project_name=fragment.project_name if fragment else None, details=details, ) ``` This would require `_emit_tier_event()` to accept the fragment (or project_name) as a parameter. --- ## Positive Observations - The B-CRIT-1 fix (promote fallback to warm on budget eviction) is well-designed and correctly handles the edge case. - The double-demotion prevention via `warm_before` snapshot in `enforce_staleness()` is a subtle correctness detail handled well. - Best-effort event emission with `try/except` properly isolates event bus failures from core operations. - The `_resolve()` helper correctly re-fetches the fragment after possible auto-promotion to return the latest state. - Comprehensive test coverage across three frameworks (22 Behave scenarios, 4 Robot tests, 4 ASV benchmarks). - Clean static analysis: the code passes Pyright and Ruff with zero violations. --- *Review performed on commit `275641d` — 4 global review cycles across bugs, test coverage, test flaws, performance, security, and specification alignment. Previous review findings cross-referenced to report only genuinely new issues.*
Author
Member

Review Fixes Applied — Resolution Report (Fourth Review Round)

All findings from the fourth code review were validated against docs/specification.md, issue #821 acceptance criteria, and CONTRIBUTING.md guidelines. Fixes were applied, tested, and amended into the feature commit (6522a76f).


Fixes Applied (3 items)

ID Finding Fix Applied
B-HIGH-1 access_count preserved through demotion enables immediate re-promotion after staleness enforcement Reset access_count to 0 in both demote() code paths (hot→warm and warm→cold) via model_copy(update={"tier": ..., "access_count": 0}). Added docstring explaining the rationale: without this reset, a previously popular fragment whose access_count already exceeds the promotion threshold would be re-promoted on the very next get() call, making staleness enforcement ineffective. This mirrors the counter-reset principle already established by the C1 fix for auto-promotion.
T-MED-1 No test for staleness demotion followed by immediate re-promotion Added Behave scenario "Demoted fragment requires fresh accesses before re-promotion" that: (1) stores a hot fragment, (2) accesses it 10 times to accumulate access_count, (3) enforces staleness with a 0-hour TTL, (4) accesses it once more, (5) asserts it remains in warm tier (not re-promoted). Added step definition step_enforce_staleness_custom_ttl for the custom-TTL staleness enforcement.
T-MED-2 No test for minimum promotion threshold edge case (threshold=1) Added Behave scenario "Promotion threshold of 1 promotes on first access without chain promotion" that: (1) creates a service with threshold=1, (2) stores a cold fragment, (3) accesses it once, (4) asserts it is in warm (not chain-promoted to hot). Added step definition step_service_custom_threshold.

Findings Not Applied (1 item) — with justification

ID Finding Reason Not Applied
B-LOW-1 Tier transition events lack project_name from the fragment Out of scope for #821. Issue #821's acceptance criterion is "Add tier transition event emission for observability" — the events are emitted correctly with fragment_id, from_tier, and to_tier. Adding project_name would require changing the _emit_tier_event() method signature (and all call sites) to accept either the fragment or its project_name. For eviction events, the fragment has already been deleted from the store before _emit_tier_event() is called, making lookup impossible — the method would need to receive the project_name as a parameter from each caller. This is a feature enhancement that expands the change scope beyond the bug fix and should be tracked as a separate issue.

Quality Gates (post-fix)

Session Result
lint PASS
typecheck PASS — 0 errors, 1 pre-existing warning
unit_tests PASS — 12,319 scenarios, 0 failures
integration_tests PASS — 1,706 tests, 0 failures
# Review Fixes Applied — Resolution Report (Fourth Review Round) All findings from the fourth code review were validated against `docs/specification.md`, issue #821 acceptance criteria, and `CONTRIBUTING.md` guidelines. Fixes were applied, tested, and amended into the feature commit (`6522a76f`). --- ## Fixes Applied (3 items) | ID | Finding | Fix Applied | |----|---------|-------------| | **B-HIGH-1** | `access_count` preserved through demotion enables immediate re-promotion after staleness enforcement | Reset `access_count` to `0` in both `demote()` code paths (hot→warm and warm→cold) via `model_copy(update={"tier": ..., "access_count": 0})`. Added docstring explaining the rationale: without this reset, a previously popular fragment whose `access_count` already exceeds the promotion threshold would be re-promoted on the very next `get()` call, making staleness enforcement ineffective. This mirrors the counter-reset principle already established by the C1 fix for auto-promotion. | | **T-MED-1** | No test for staleness demotion followed by immediate re-promotion | Added Behave scenario "Demoted fragment requires fresh accesses before re-promotion" that: (1) stores a hot fragment, (2) accesses it 10 times to accumulate access_count, (3) enforces staleness with a 0-hour TTL, (4) accesses it once more, (5) asserts it remains in warm tier (not re-promoted). Added step definition `step_enforce_staleness_custom_ttl` for the custom-TTL staleness enforcement. | | **T-MED-2** | No test for minimum promotion threshold edge case (threshold=1) | Added Behave scenario "Promotion threshold of 1 promotes on first access without chain promotion" that: (1) creates a service with threshold=1, (2) stores a cold fragment, (3) accesses it once, (4) asserts it is in warm (not chain-promoted to hot). Added step definition `step_service_custom_threshold`. | --- ## Findings Not Applied (1 item) — with justification | ID | Finding | Reason Not Applied | |----|---------|-------------------| | **B-LOW-1** | Tier transition events lack `project_name` from the fragment | **Out of scope for #821.** Issue #821's acceptance criterion is "Add tier transition event emission for observability" — the events are emitted correctly with `fragment_id`, `from_tier`, and `to_tier`. Adding `project_name` would require changing the `_emit_tier_event()` method signature (and all call sites) to accept either the fragment or its project_name. For eviction events, the fragment has already been deleted from the store before `_emit_tier_event()` is called, making lookup impossible — the method would need to receive the project_name as a parameter from each caller. This is a feature enhancement that expands the change scope beyond the bug fix and should be tracked as a separate issue. | --- ## Quality Gates (post-fix) | Session | Result | |---------|--------| | `lint` | PASS | | `typecheck` | PASS — 0 errors, 1 pre-existing warning | | `unit_tests` | PASS — 12,319 scenarios, 0 failures | | `integration_tests` | PASS — 1,706 tests, 0 failures |
CoreRasurae force-pushed bugfix/m5-context-tier-runtime from 275641d642
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Failing after 12s
CI / lint (pull_request) Successful in 3m18s
CI / quality (pull_request) Successful in 3m40s
CI / typecheck (pull_request) Successful in 3m52s
CI / security (pull_request) Successful in 4m2s
CI / e2e_tests (pull_request) Failing after 4m41s
CI / integration_tests (pull_request) Successful in 6m2s
CI / unit_tests (pull_request) Successful in 7m10s
CI / docker (pull_request) Successful in 56s
CI / coverage (pull_request) Successful in 11m10s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Successful in 1h1m51s
to d90b1b8a68
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 23s
CI / lint (pull_request) Successful in 3m22s
CI / typecheck (pull_request) Successful in 3m46s
CI / security (pull_request) Successful in 4m6s
CI / quality (pull_request) Successful in 4m11s
CI / e2e_tests (pull_request) Failing after 5m13s
CI / integration_tests (pull_request) Successful in 7m1s
CI / unit_tests (pull_request) Successful in 7m3s
CI / docker (pull_request) Successful in 1m14s
CI / coverage (pull_request) Successful in 9m49s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-25 17:08:18 +00:00
Compare
CoreRasurae force-pushed bugfix/m5-context-tier-runtime from d90b1b8a68
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 23s
CI / lint (pull_request) Successful in 3m22s
CI / typecheck (pull_request) Successful in 3m46s
CI / security (pull_request) Successful in 4m6s
CI / quality (pull_request) Successful in 4m11s
CI / e2e_tests (pull_request) Failing after 5m13s
CI / integration_tests (pull_request) Successful in 7m1s
CI / unit_tests (pull_request) Successful in 7m3s
CI / docker (pull_request) Successful in 1m14s
CI / coverage (pull_request) Successful in 9m49s
CI / status-check (pull_request) Successful in 1s
CI / benchmark-regression (pull_request) Has been cancelled
to 34c2acc354
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 3m18s
CI / typecheck (pull_request) Successful in 4m3s
CI / security (pull_request) Successful in 4m13s
CI / build (pull_request) Successful in 21s
CI / quality (pull_request) Successful in 3m42s
CI / integration_tests (pull_request) Successful in 8m50s
CI / unit_tests (pull_request) Successful in 9m14s
CI / e2e_tests (pull_request) Successful in 9m28s
CI / docker (pull_request) Successful in 1m9s
CI / coverage (pull_request) Successful in 10m24s
CI / status-check (pull_request) Successful in 3s
CI / benchmark-regression (pull_request) Successful in 48m36s
CI / build (push) Successful in 22s
CI / lint (push) Successful in 3m43s
CI / typecheck (push) Successful in 4m19s
CI / benchmark-regression (push) Has been skipped
CI / security (push) Successful in 4m28s
CI / quality (push) Successful in 4m21s
CI / integration_tests (push) Successful in 9m12s
CI / e2e_tests (push) Successful in 9m29s
CI / unit_tests (push) Successful in 9m36s
CI / docker (push) Successful in 1m8s
CI / coverage (push) Successful in 10m33s
CI / status-check (push) Successful in 1s
CI / benchmark-publish (push) Successful in 25m40s
2026-03-25 17:49:25 +00:00
Compare
CoreRasurae deleted branch bugfix/m5-context-tier-runtime 2026-03-25 19:04:12 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core!1150
No description provided.