feat(acms): add ContextFragment and ScoredFragment data models #538

Closed
opened 2026-03-04 00:59:06 +00:00 by freemo · 1 comment
Owner

Metadata

  • Commit Message: feat(acms): add ContextFragment and ScoredFragment data models
  • Branch: feature/m5-context-fragment-models
| Field | Value |
|-------|-------|
| Type | Feature |
| Priority | Critical |
| MoSCoW | Must Have |
| Points | 5 |
| Milestone | v3.4.0 |
| Assignee | freemo |
| Parent Epic | #396 (Epic: ACMS Context Pipeline) |
| Depends On | #188 (ACMS v1 context pipeline), #189 (UKO ontology scaffolding) |
| Blocks | Pipeline component issues (FragmentDeduplicator, FragmentScorer, BudgetPacker, etc.) |

Background

The specification (§ Core Concepts > ContextFragment; § Architecture > ACMS, lines 42827-42850) defines ContextFragment as the atomic unit of context flowing through the ACMS pipeline. Every pipeline component (Deduplicator, Scorer, Packer, Orderer) consumes or produces these fragments. Without this data model, no pipeline component can be implemented.

The spec also defines ScoredFragment as a wrapper that pairs a ContextFragment with a composite relevance score computed by the FragmentScorer.

Currently the codebase has no ContextFragment or ScoredFragment model — the ACMS code works with raw dicts or ad-hoc structures.

Acceptance Criteria

  1. ContextFragment dataclass/model exists with fields: uko_uri, detail_depth, content, token_count, relevance_score, provenance, strategy_source, timestamp.
  2. ScoredFragment dataclass wraps a ContextFragment with composite_score, score_breakdown (dict of component scores), and rank.
  3. Both models are immutable (frozen dataclass or similar).
  4. Fragment equality is based on uko_uri + detail_depth (for deduplication).
  5. Models integrate with the existing ACMS module structure under domain/contexts/ or application/acms/.
  6. Full type hints and docstrings per CONTRIBUTING.md standards.
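Criteria 1, 3, and 4 can be illustrated with a minimal sketch using a stdlib frozen dataclass. The field types, the defaults, and the `uko://example/node` URI are assumptions made for illustration; the spec's actual definitions are not reproduced in this issue.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContextFragment:
    """Atomic unit of context (field types here are illustrative guesses)."""

    # Identity fields: the only ones that take part in == and hash().
    uko_uri: str
    detail_depth: int
    # Payload fields: excluded from comparison and hashing via compare=False.
    content: str = field(compare=False)
    token_count: int = field(compare=False)
    relevance_score: float = field(default=0.0, compare=False)
    provenance: str = field(default="", compare=False)
    strategy_source: str = field(default="", compare=False)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc), compare=False
    )


# Same URI and depth: equal even though content and score differ.
a = ContextFragment("uko://example/node", 1, "short text", 2)
b = ContextFragment("uko://example/node", 1, "different text", 5, relevance_score=0.9)
# Same URI, different depth: a distinct fragment.
c = ContextFragment("uko://example/node", 2, "short text", 2)
```

With `frozen=True` and the default `eq=True`, the dataclass machinery generates `__eq__` and `__hash__` from the compared fields only, so `a == b` holds while `a != c`, and a set built from all three keeps two entries.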

Subtasks

1. Design

  • Review spec lines 42827-42850 and 25077 for field definitions
  • Determine module placement (domain/contexts/fragment.py or application/acms/models.py)
  • Design serialization for SQLite persistence if needed

2. Implementation

  • Create ContextFragment frozen dataclass with all spec fields
  • Create ScoredFragment frozen dataclass
  • Add factory methods for common construction patterns
  • Add __eq__ and __hash__ based on uko_uri + detail_depth

3. Testing

  • Unit tests for construction, equality, hashing
  • Unit tests for immutability guarantees
  • Property-based tests for fragment collections (dedup behavior)

4. Documentation

  • Docstrings on all public classes and methods
  • Update module __init__.py exports

5. Integration

  • Verify compatibility with existing ACMS pipeline code (#188, #192)
  • Verify UKO URI format compatibility with #189

6. Observability

  • Add __repr__ for debugging
  • Token count validation (non-negative)
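One possible shape for the two observability items, sketched on a trimmed-down stand-in class. The class name `Fragment`, the error message, and the 20-character preview limit are hypothetical choices, not taken from the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class Fragment:
    """Trimmed-down stand-in for ContextFragment (hypothetical)."""

    uko_uri: str
    detail_depth: int
    content: str
    token_count: int

    def __post_init__(self) -> None:
        # Reject impossible token counts at construction time.
        if self.token_count < 0:
            raise ValueError(
                f"token_count must be non-negative, got {self.token_count}"
            )

    def __repr__(self) -> str:
        # Truncate long content so debug output stays on one short line.
        if len(self.content) <= 20:
            preview = self.content
        else:
            preview = self.content[:17] + "..."
        return (
            f"Fragment(uko_uri={self.uko_uri!r}, "
            f"detail_depth={self.detail_depth}, "
            f"token_count={self.token_count}, content={preview!r})"
        )
```

Validating in `__post_init__` works on a frozen dataclass because it only reads attributes; it never assigns to them.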

7. Security

  • No security implications for data models

Definition of Done

  • All acceptance criteria met
  • All subtask checkboxes checked
  • Tests pass in CI
  • Code reviewed and approved
  • No regressions in existing ACMS tests
freemo added this to the v3.2.0 milestone 2026-03-04 00:59:50 +00:00
freemo modified the milestone from v3.2.0 to v3.4.0 2026-03-04 01:09:35 +00:00
freemo self-assigned this 2026-03-04 01:41:11 +00:00
Author
Owner

Implementation Notes

What was done

  1. ScoredFragment model (src/cleveragents/domain/contexts/fragment.py) — Frozen Pydantic v2 model wrapping ContextFragment with:

    • composite_score: float (0.0-1.0) — overall ranking score
    • score_breakdown: dict[str, float] — component-level breakdown (e.g. relevance, hierarchy, recency)
    • rank: int — ordinal rank assigned after scoring
    • Custom __eq__/__hash__ based on fragment.uko_node + fragment.detail_depth for deduplication
    • Factory methods: from_fragment() and from_relevance()
    • Convenience properties: uko_node, detail_depth, token_count
  2. strategy_source field added to ContextFragment (src/cleveragents/domain/models/core/context_fragment.py) — Tracks which strategy produced the fragment. Defaults to empty string.

  3. domain/contexts/__init__.py updated to export ScoredFragment.
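The wrapper described above might look roughly like the following. This sketch swaps in stdlib frozen dataclasses for the Pydantic v2 model so that it runs without dependencies; the `from_fragment()` signature, the range check, and the reduced `ContextFragment` stand-in are assumptions, since the comment names them without showing their definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextFragment:
    """Minimal stand-in; the real model carries more fields."""

    uko_node: str
    detail_depth: int
    token_count: int = field(default=0, compare=False)


@dataclass(frozen=True, eq=False)
class ScoredFragment:
    """Pairs a fragment with scoring data via composition."""

    fragment: ContextFragment
    composite_score: float
    score_breakdown: dict[str, float] = field(default_factory=dict)
    rank: int = 0

    def __post_init__(self) -> None:
        # Assumed validation, based on the documented 0.0-1.0 range.
        if not 0.0 <= self.composite_score <= 1.0:
            raise ValueError("composite_score must be within 0.0-1.0")

    @classmethod
    def from_fragment(
        cls, fragment: ContextFragment, composite_score: float, rank: int = 0
    ) -> "ScoredFragment":
        # Hypothetical signature: the source names this factory only.
        return cls(fragment=fragment, composite_score=composite_score, rank=rank)

    # Convenience properties that delegate to the wrapped fragment.
    @property
    def uko_node(self) -> str:
        return self.fragment.uko_node

    @property
    def detail_depth(self) -> int:
        return self.fragment.detail_depth

    @property
    def token_count(self) -> int:
        return self.fragment.token_count

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ScoredFragment):
            return NotImplemented
        return (self.uko_node, self.detail_depth) == (
            other.uko_node,
            other.detail_depth,
        )

    def __hash__(self) -> int:
        # Hash only the identity pair; scores and rank are ignored.
        return hash((self.uko_node, self.detail_depth))
```

Passing `eq=False` keeps the dataclass decorator from generating its own field-by-field comparison, so the hand-written `__eq__` and `__hash__` are the ones in effect.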

Design Decisions

  • Wrapper pattern (not inheritance): ScoredFragment wraps a ContextFragment via composition (fragment field) rather than extending it. This matches the spec's @dataclass(frozen=True) design and keeps scoring concerns separate from the base fragment model.
  • Frozen model: Both models are immutable, safe for sharing across pipeline stages without defensive copies.
  • Equality semantics: Based on uko_node + detail_depth per spec requirements for deduplication. Two scored fragments wrapping the same logical content are considered equal regardless of scoring differences.
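One consequence of these equality semantics is worth spelling out: because scores are ignored, a plain `set()` keeps whichever duplicate was inserted first, not the best-scored one. A deduplication step that should keep the highest score therefore has to compare scores explicitly. The sketch below illustrates this with a hypothetical `Scored` stand-in and a hypothetical `dedupe_keep_best()` helper; neither is part of the codebase described here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scored:
    """Hypothetical stand-in: identity is (uko_node, detail_depth) only."""

    uko_node: str
    detail_depth: int
    composite_score: float = field(compare=False)


def dedupe_keep_best(items: list[Scored]) -> list[Scored]:
    """Keep the highest-scored entry per identity, best score first."""
    best: dict[Scored, Scored] = {}
    for item in items:
        current = best.get(item)  # Lookup uses the identity-based hash/eq.
        if current is None or item.composite_score > current.composite_score:
            best[item] = item
    return sorted(best.values(), key=lambda s: s.composite_score, reverse=True)


low = Scored("uko://example/a", 1, 0.2)
high = Scored("uko://example/a", 1, 0.8)
other = Scored("uko://example/b", 1, 0.5)
result = dedupe_keep_best([low, other, high])
```

Here `low == high` is true, so `result` contains two entries: `high` followed by `other`.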

Testing

  • Behave: 16 scenarios in features/context_fragment_models.feature covering construction, equality, hashing, immutability, validation, and strategy_source
  • Robot: 7 integration test cases in robot/context_fragment_models.robot
  • ASV benchmarks: benchmarks/context_fragment_models_bench.py with creation, equality, and deduplication benchmarks

Verification

All nox sessions pass:

  • lint — passed
  • format --check — passed
  • typecheck (pyright strict) — 0 errors, 0 warnings
  • unit_tests — 8516 scenarios passed, 0 failed
  • integration_tests — 1201 tests passed, 0 failed
  • coverage_report — 97% (threshold met)
  • build — wheel built successfully

PR: #599

freemo 2026-03-09 21:42:12 +00:00
Reference
cleveragents/cleveragents-core#538