UAT: UKOLoader.get_layer_nodes() is O(n) with no precomputed index — performance risk for 10,000+ file projects #5599

Open
opened 2026-04-09 07:44:20 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature Area: Context Assembly Pipeline — UKO Ontology / Large Project Indexing

Severity: Backlog (performance risk: not yet confirmed to cause a timeout, but the spec requires 10k-file indexing to complete within 5 minutes)

What Was Tested

Code-level analysis of src/cleveragents/application/services/uko_loader.py against the specification's scalability requirement (spec §46853: "Projects with 10,000+ files index without timeout — Indexing a 10k-file project completes within 5 minutes").

Expected Behavior (from spec §46853)

v3.4.0 deliverable #8: "Projects with 10,000+ files index without timeout. Indexing a 10k-file project completes within 5 minutes."

Actual Behavior (from code)

The UKOLoader.get_layer_nodes() method (line ~327 in uko_loader.py) has an explicit TODO comment acknowledging the performance issue:

def get_layer_nodes(
    self,
    ontology: UKOOntology,
    layer: int,
) -> list[UKONode]:
    """Return all nodes at a specific layer.

    .. note::
        This is O(n) per call.  For projects with 10,000+ indexed files
        (milestone v3.4.0), consider precomputing a ``dict[int,
        list[UKONode]]`` index during load.
    """
    # TODO(perf): precompute layer index when scaling demands it.
    return [n for n in ontology.nodes if n.layer == layer]

This method is called during UKO graph traversal. Each call to get_layer_nodes() scans every ontology node, so the cost of a single call grows linearly with project size. If it is called frequently during context assembly (e.g., once for each file's UKO node lookup), the total work becomes O(n²), which for a project with 10,000+ files could cause a timeout.
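To put a rough number on the O(n²) concern, the sketch below counts how many `n.layer == layer` checks repeated linear scans would perform. It is an illustration only: `count_layer_checks` is a hypothetical helper, and the call pattern (one `get_layer_nodes()` call per file, one node per file, four layers) is an assumption, not something confirmed from the code.

```python
def count_layer_checks(num_nodes: int, num_calls: int, num_layers: int = 4) -> int:
    """Count layer comparisons made by `num_calls` linear scans over `num_nodes` nodes.

    Hypothetical model: nodes are spread evenly over `num_layers` layers and
    every call performs a full scan, as the list comprehension in
    get_layer_nodes() does.
    """
    node_layers = [i % num_layers for i in range(num_nodes)]
    checks = 0
    for call in range(num_calls):
        target = call % num_layers
        for node_layer in node_layers:
            checks += 1  # one `n.layer == layer` evaluation
            _ = node_layer == target
    return checks


if __name__ == "__main__":
    # Small sizes keep the demo fast; the count is always num_nodes * num_calls.
    print(count_layer_checks(num_nodes=200, num_calls=200))
```

Under these assumptions the count is simply `num_nodes * num_calls`, so 10,000 nodes scanned 10,000 times would mean 10^8 comparisons. Whether that breaches the 5-minute budget depends on the real call frequency, which this issue has not measured.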

Code Location

  • src/cleveragents/application/services/uko_loader.py lines ~327–340 (get_layer_nodes)

Impact

  • Potential timeout when indexing projects with 10,000+ files
  • v3.4.0 deliverable #8 may not be met without the precomputed index
  • The TODO comment acknowledges this is a known issue for v3.4.0

Fix Direction

Precompute a dict[int, list[UKONode]] layer index during load() or load_from_string() and use it in get_layer_nodes() instead of the linear scan.
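A minimal sketch of the fix direction, assuming nothing about the real classes beyond what the snippet above shows (`ontology.nodes` and `node.layer`). The `UKONode` and `UKOOntology` dataclasses here are stand-ins, and `build_layer_index` and `get_layer_nodes_indexed` are hypothetical names. Where the index should live (on the loader, on the ontology, or in a cache) is left open.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UKONode:
    # Stand-in for the real UKONode; only `layer` comes from the issue.
    node_id: str
    layer: int


@dataclass
class UKOOntology:
    # Stand-in for the real UKOOntology; only `nodes` comes from the issue.
    nodes: list[UKONode] = field(default_factory=list)


def build_layer_index(ontology: UKOOntology) -> dict[int, list[UKONode]]:
    """Group nodes by layer in a single O(n) pass (would run once at load time)."""
    index: dict[int, list[UKONode]] = defaultdict(list)
    for node in ontology.nodes:
        index[node.layer].append(node)
    return dict(index)  # plain dict so later lookups never create empty entries


def get_layer_nodes_indexed(
    index: dict[int, list[UKONode]],
    layer: int,
) -> list[UKONode]:
    """Return the nodes at `layer` via a dict lookup instead of a full scan."""
    # Copy so callers cannot mutate the shared index by accident.
    return list(index.get(layer, []))
```

The index preserves the original node order within each layer, so results should match the current list comprehension. If the ontology can be mutated after loading, the index would also need to be rebuilt or updated at that point.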


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

Reference
cleveragents/cleveragents-core#5599