bug(acms): ContextTierService does not enforce TierBudget.max_decisions_warm and max_decisions_cold — warm and cold tiers grow unboundedly #2389

Open
opened 2026-04-03 17:30:13 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/acms-tier-capacity-enforcement
  • Commit Message: fix(acms): enforce warm and cold tier capacity limits in ContextTierService
  • Milestone: v3.4.0
  • Parent Epic: #396

Description

The ContextTierService enforces the hot tier token budget (TierBudget.max_tokens_hot) via _enforce_hot_budget(). The warm and cold tier capacity limits (max_decisions_warm and max_decisions_cold) are defined in TierBudget but never enforced.

From context_tiers.py, store() method:

def store(self, fragment: TieredFragment) -> None:
    ...
    if fragment.tier == ContextTier.HOT:
        ...
        self._hot[fragment.fragment_id] = fragment
        self._enforce_hot_budget()  # ← hot tier enforced
    elif fragment.tier == ContextTier.WARM:
        self._warm[fragment.fragment_id] = fragment  # ← NO capacity check!
    else:
        self._cold[fragment.fragment_id] = fragment  # ← NO capacity check!

From tiers.py, TierBudget:

class TierBudget(BaseModel):
    max_tokens_hot: int = Field(default=8000, ...)  # enforced
    max_decisions_warm: int = Field(default=500, ...)  # NOT enforced
    max_decisions_cold: int = Field(default=5000, ...)  # NOT enforced

Impact:

  1. The warm tier can grow beyond max_decisions_warm (default 500) without any eviction. In a long-running session with many context fragments, this causes unbounded memory growth.
  2. The cold tier can grow beyond max_decisions_cold (default 5000) without any eviction or archival.
  3. The spec requires all three tiers to have enforced capacity limits as part of the tiered storage lifecycle (spec ACMS tier sections, issue #208).
  4. TierBudget.max_decisions_warm and max_decisions_cold are exposed in the CLI (agents project context set --warm-max-decisions, --cold-max-decisions) and stored in the project config, but the values have no effect on actual tier behavior.
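
The growth described above can be reproduced with a small stand-in. This is only a sketch: Budget, UnboundedTiers, and overflow are hypothetical names that mirror the store() excerpt, not the real ContextTierService or TierBudget classes.

```python
from dataclasses import dataclass, field


@dataclass
class Budget:
    """Hypothetical stand-in for TierBudget (defaults taken from the issue)."""
    max_decisions_warm: int = 500
    max_decisions_cold: int = 5000


@dataclass
class UnboundedTiers:
    """Hypothetical stand-in that mirrors the unchecked store() branches."""
    budget: Budget = field(default_factory=Budget)
    warm: dict = field(default_factory=dict)
    cold: dict = field(default_factory=dict)

    def store(self, fragment_id: str, tier: str) -> None:
        if tier == "warm":
            self.warm[fragment_id] = fragment_id  # no capacity check
        else:
            self.cold[fragment_id] = fragment_id  # no capacity check


def overflow(limit: int, stored: int) -> int:
    """Return how many warm entries exceed the configured limit."""
    tiers = UnboundedTiers(budget=Budget(max_decisions_warm=limit))
    for i in range(stored):
        tiers.store(f"frag-{i}", "warm")
    return len(tiers.warm) - tiers.budget.max_decisions_warm
```

With the default limit of 500, storing 750 warm fragments leaves the stand-in 250 entries over budget, because nothing ever reads the limit.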

Code Locations:

  • src/cleveragents/application/services/context_tiers.py: store() method (lines ~140-175)
  • src/cleveragents/application/services/tier_runtime.py: _enforce_hot_budget() (hot tier only)
  • src/cleveragents/domain/models/acms/tiers.py: TierBudget definition
  • src/cleveragents/cli/commands/project_context.py: CLI exposes --warm-max-decisions and --cold-max-decisions

Expected Behavior (per spec ACMS tier sections):
When a fragment is stored in the warm tier and len(self._warm) > max_decisions_warm, the least-recently-used warm fragment should be demoted to cold (or evicted). Likewise, when the cold tier exceeds max_decisions_cold, the least-recently-used cold fragment should be evicted.
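
One possible shape for this behavior is sketched below, assuming each tier is kept in an OrderedDict whose insertion order tracks recency. TierStore, store_warm, store_cold, and touch_warm are hypothetical names for illustration; only _enforce_warm_budget and _enforce_cold_budget come from the subtasks in this issue, and the real signatures in tier_runtime.py may differ.

```python
from collections import OrderedDict


class TierStore:
    """Hypothetical stand-in for the warm/cold part of ContextTierService."""

    def __init__(self, max_warm: int, max_cold: int) -> None:
        self.max_warm = max_warm
        self.max_cold = max_cold
        # Oldest entry first; the last entry is the most recently used.
        self.warm = OrderedDict()
        self.cold = OrderedDict()

    def store_warm(self, fragment_id: str, fragment: object) -> None:
        self.warm[fragment_id] = fragment
        self.warm.move_to_end(fragment_id)
        self._enforce_warm_budget()

    def store_cold(self, fragment_id: str, fragment: object) -> None:
        self.cold[fragment_id] = fragment
        self.cold.move_to_end(fragment_id)
        self._enforce_cold_budget()

    def touch_warm(self, fragment_id: str) -> None:
        """Mark a warm fragment as recently used."""
        self.warm.move_to_end(fragment_id)

    def _enforce_warm_budget(self) -> None:
        # Demote least-recently-used warm fragments to cold until within budget.
        while len(self.warm) > self.max_warm:
            fragment_id, fragment = self.warm.popitem(last=False)
            self.store_cold(fragment_id, fragment)

    def _enforce_cold_budget(self) -> None:
        # Evict least-recently-used cold fragments until within budget.
        while len(self.cold) > self.max_cold:
            self.cold.popitem(last=False)
```

In this sketch a demotion into cold triggers the cold check as well, so a warm overflow cannot push the cold tier past its own limit. Whether the real fix should demote or evict outright is left open by the issue text.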

Actual Behavior:
Warm and cold tiers grow without bound. max_decisions_warm and max_decisions_cold are stored in config and displayed in CLI but have no effect on tier behavior.

Subtasks

  • Write a TDD issue-capture Behave scenario (tagged @tdd_expected_fail) in features/ demonstrating that storing more than max_decisions_warm fragments in the warm tier does NOT currently trigger eviction
  • Write a TDD issue-capture Behave scenario (tagged @tdd_expected_fail) demonstrating that storing more than max_decisions_cold fragments in the cold tier does NOT currently trigger eviction
  • Implement _enforce_warm_budget() in tier_runtime.py that evicts/demotes LRU warm fragments when len(self._warm) > budget.max_decisions_warm
  • Implement _enforce_cold_budget() in tier_runtime.py that evicts LRU cold fragments when len(self._cold) > budget.max_decisions_cold
  • Update store() in context_tiers.py to call _enforce_warm_budget() after inserting into self._warm
  • Update store() in context_tiers.py to call _enforce_cold_budget() after inserting into self._cold
  • Remove @tdd_expected_fail tags from the TDD capture scenarios and verify they now pass
  • Add full Behave unit test coverage for _enforce_warm_budget() and _enforce_cold_budget() (boundary conditions: at limit, one over, many over)
  • Add Robot Framework integration test verifying warm/cold tier capacity enforcement in a live session
  • Verify all type annotations are correct and nox -e typecheck passes
  • Verify nox -e lint passes
  • Verify nox -e unit_tests passes
  • Verify nox -e integration_tests passes
  • Verify nox -e coverage_report shows coverage >= 97%
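
The boundary conditions named in the coverage subtask (at limit, one over, many over) could be expressed as a small table of cases. The sketch below uses plain Python rather than Behave steps, and enforce_capacity, fill, and run_boundary_cases are hypothetical helpers, not functions from the codebase.

```python
from collections import OrderedDict


def enforce_capacity(tier: OrderedDict, limit: int) -> list:
    """Hypothetical helper: evict oldest entries until len(tier) <= limit."""
    evicted = []
    while len(tier) > limit:
        fragment_id, _ = tier.popitem(last=False)
        evicted.append(fragment_id)
    return evicted


def fill(count: int) -> OrderedDict:
    """Build a tier holding `count` fragments, oldest first."""
    return OrderedDict((f"frag-{i}", i) for i in range(count))


# (case name, fragments stored, limit, ids expected to be evicted)
BOUNDARY_CASES = [
    ("at limit", 3, 3, []),
    ("one over", 4, 3, ["frag-0"]),
    ("many over", 7, 3, ["frag-0", "frag-1", "frag-2", "frag-3"]),
]


def run_boundary_cases() -> dict:
    """Return a pass/fail flag per boundary case."""
    results = {}
    for name, stored, limit, expected in BOUNDARY_CASES:
        tier = fill(stored)
        evicted = enforce_capacity(tier, limit)
        results[name] = evicted == expected and len(tier) == min(stored, limit)
    return results
```

Each case checks both which fragments were evicted and that the tier ends at or below its limit, which is the property the @tdd_expected_fail scenarios are meant to capture once enforcement exists.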

Definition of Done

  • All subtasks above are checked off
  • _enforce_warm_budget() and _enforce_cold_budget() are implemented in tier_runtime.py with correct LRU eviction logic
  • store() in context_tiers.py calls the new enforcement methods for warm and cold tiers
  • TierBudget.max_decisions_warm and max_decisions_cold are now functionally enforced (not just stored in config)
  • Commit created with exact message: fix(acms): enforce warm and cold tier capacity limits in ContextTierService
  • Commit pushed to branch fix/acms-tier-capacity-enforcement
  • PR submitted, reviewed by at least 2 non-author contributors, and merged
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

freemo added this to the v3.4.0 milestone 2026-04-03 17:30:18 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: Medium — Warm and cold tier capacity limits not enforced means these tiers grow unboundedly, potentially consuming excessive memory.
  • Milestone: v3.4.0
  • MoSCoW: Should Have — While the hot tier budget is enforced, unbounded warm/cold tiers are a resource management issue that should be fixed for production readiness.
  • Parent Epic: #396 (ACMS Context Pipeline)

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

freemo removed this from the v3.4.0 milestone 2026-04-06 21:01:27 +00:00
Blocks
#396 Epic: ACMS Context Pipeline
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2389