BUG-HUNT: [performance] N+1 query pattern in ActionRepository and LifecyclePlanRepository list operations — child relationships loaded lazily #6459

Open
opened 2026-04-09 21:05:22 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: Performance — N+1 queries in repository list methods due to default lazy loading

Severity Assessment

  • Impact: Every call to ActionRepository.list_all(), ActionRepository.list_available(), ActionRepository.get_by_namespace(), ActionRepository.get_by_state(), and LifecyclePlanRepository.list_all() / list_plans() triggers N+1 SQL queries: 1 query to fetch the main rows, then one additional query per row for each relationship accessed in to_domain().
    • For an action list of 50 actions (each with arguments and invariants): 1 + 50*2 = 101 queries.
    • For a plan list of 100 plans (with project_links, arguments, invariants): 1 + 100*3 = 301 queries.
  • Likelihood: Triggered on every agents action list, agents plan list, and any dashboard/CLI command that enumerates actions or plans. Worsens with scale.
  • Priority: Medium (performance degradation, not a data correctness issue)

Location

  • File: src/cleveragents/infrastructure/database/models.py

  • Class: LifecycleActionModel (lines ~280–520) — arguments_rel and invariants_rel relationships

  • Class: LifecyclePlanModel (lines ~590–800) — project_links_rel, arguments_rel, invariants_rel relationships

  • File: src/cleveragents/infrastructure/database/repositories.py

  • Methods: ActionRepository.list_all(), list_available(), get_by_namespace(), get_by_state(), LifecyclePlanRepository.list_all(), list_plans()


Description

LifecycleActionModel defines its child relationships without a lazy="joined" or lazy="selectin" strategy:

# models.py — LifecycleActionModel
invariants_rel = relationship(
    "ActionInvariantModel",
    back_populates="action",
    cascade="all, delete-orphan",
    order_by="ActionInvariantModel.position",
    # ← NO lazy="joined" or lazy="selectin"
)
arguments_rel = relationship(
    "ActionArgumentModel",
    back_populates="action",
    cascade="all, delete-orphan",
    order_by="ActionArgumentModel.position",
    # ← NO lazy="joined" or lazy="selectin"
)

Similarly, LifecyclePlanModel:

# models.py — LifecyclePlanModel
project_links_rel = relationship(
    "PlanProjectModel",
    back_populates="plan",
    cascade="all, delete-orphan",
    # ← NO lazy loading strategy
)
arguments_rel = relationship(
    "PlanArgumentModel",
    ...
    # ← NO lazy loading strategy
)
invariants_rel = relationship(
    "PlanInvariantModel",
    ...
    # ← NO lazy loading strategy
)

SQLAlchemy's default is lazy="select" — each access to the relationship triggers a separate SELECT statement. The to_domain() methods for both models access all child relationships, so every row returned by a list query causes additional DB round-trips.

Contrast with SkillModel

SkillModel already uses lazy="joined" correctly:

# models.py — SkillModel (line 2395-2402)
# Relationships (lazy="joined" prevents N+1 queries on list_all)
items_rel = relationship(
    "SkillItemModel",
    back_populates="skill",
    cascade="all, delete-orphan",
    order_by="SkillItemModel.item_order",
    lazy="joined",    # ← CORRECT
)

The pattern is established and understood — it just wasn't applied to the newer LifecycleActionModel and LifecyclePlanModel relationships.


Evidence

LifecycleActionModel in models.py — relationships at approximately lines 335–354:

invariants_rel = relationship(
    "ActionInvariantModel",
    back_populates="action",
    cascade="all, delete-orphan",
    order_by="ActionInvariantModel.position",
    # No lazy= — defaults to lazy="select" (triggers N+1)
)
arguments_rel = relationship(
    "ActionArgumentModel",
    back_populates="action",
    cascade="all, delete-orphan",
    order_by="ActionArgumentModel.position",
    # No lazy= — defaults to lazy="select" (triggers N+1)
)

ActionRepository.list_all() in repositories.py:

rows = (
    session.query(LifecycleActionModel)
    .order_by(LifecycleActionModel.namespace, LifecycleActionModel.name)
    .all()
)
return [row.to_domain() for row in rows]  # to_domain() accesses arguments_rel and invariants_rel

Each call to row.to_domain() accesses self.arguments_rel and self.invariants_rel, triggering 2 additional SQL queries per action.


Expected Behavior

list_all() and related bulk-read methods should use eager loading (either lazy="joined" on the model or options(selectinload(...)) on the query) to load all child rows in a bounded number of queries regardless of how many actions/plans are returned.

Actual Behavior

For N actions returned by list_all(), the total query count is 1 + 2*N (1 for main rows + 2 per action for arguments + invariants). For plans with 3 child tables, it is 1 + 3*N.


Suggested Fix

Option A (preferred): Add lazy="selectin" to the child relationships in the ORM models. selectin is generally better than joined for one-to-many because it avoids Cartesian product inflation on the parent query:

# LifecycleActionModel
invariants_rel = relationship(
    "ActionInvariantModel",
    ...
    lazy="selectin",  # ← 1 extra query for ALL invariants, not N
)
arguments_rel = relationship(
    "ActionArgumentModel",
    ...
    lazy="selectin",  # ← 1 extra query for ALL arguments, not N
)

# LifecyclePlanModel
project_links_rel = relationship("PlanProjectModel", ..., lazy="selectin")
arguments_rel = relationship("PlanArgumentModel", ..., lazy="selectin")
invariants_rel = relationship("PlanInvariantModel", ..., lazy="selectin")

Option B: Add options(selectinload(...)) to the specific list queries that need to return domain objects.

Note: lazy="joined" as used by SkillModel is also acceptable but can cause Cartesian product bloat for multiple one-to-many relationships on the same parent.


Category

performance / boundary


TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

## Bug Report: Performance — N+1 queries in repository list methods due to default lazy loading ### Severity Assessment - **Impact**: Every call to `ActionRepository.list_all()`, `ActionRepository.list_available()`, `ActionRepository.get_by_namespace()`, `ActionRepository.get_by_state()`, and `LifecyclePlanRepository.list_all()` / `list_plans()` triggers **N+1 SQL queries**: 1 query to fetch the main rows, then one additional query **per row** for each relationship accessed in `to_domain()`. - For an action list of 50 actions (each with arguments and invariants): `1 + 50*2 = 101 queries`. - For a plan list of 100 plans (with project_links, arguments, invariants): `1 + 100*3 = 301 queries`. - **Likelihood**: Triggered on every `agents action list`, `agents plan list`, and any dashboard/CLI command that enumerates actions or plans. Worsens with scale. - **Priority**: Medium (performance degradation, not a data correctness issue) ### Location - **File**: `src/cleveragents/infrastructure/database/models.py` - **Class**: `LifecycleActionModel` (lines ~280–520) — `arguments_rel` and `invariants_rel` relationships - **Class**: `LifecyclePlanModel` (lines ~590–800) — `project_links_rel`, `arguments_rel`, `invariants_rel` relationships - **File**: `src/cleveragents/infrastructure/database/repositories.py` - **Methods**: `ActionRepository.list_all()`, `list_available()`, `get_by_namespace()`, `get_by_state()`, `LifecyclePlanRepository.list_all()`, `list_plans()` --- ### Description `LifecycleActionModel` defines its child relationships **without** a `lazy="joined"` or `lazy="selectin"` strategy: ```python # models.py — LifecycleActionModel invariants_rel = relationship( "ActionInvariantModel", back_populates="action", cascade="all, delete-orphan", order_by="ActionInvariantModel.position", # ← NO lazy="joined" or lazy="selectin" ) arguments_rel = relationship( "ActionArgumentModel", back_populates="action", cascade="all, delete-orphan", order_by="ActionArgumentModel.position", # ← NO lazy="joined" or lazy="selectin" ) ``` Similarly, `LifecyclePlanModel`: ```python # models.py — LifecyclePlanModel project_links_rel = relationship( "PlanProjectModel", back_populates="plan", cascade="all, delete-orphan", # ← NO lazy loading strategy ) arguments_rel = relationship( "PlanArgumentModel", ... # ← NO lazy loading strategy ) invariants_rel = relationship( "PlanInvariantModel", ... # ← NO lazy loading strategy ) ``` SQLAlchemy's default is `lazy="select"` — each access to the relationship triggers a separate `SELECT` statement. The `to_domain()` methods for both models access **all child relationships**, so every row returned by a list query causes additional DB round-trips. ### Contrast with SkillModel `SkillModel` already uses `lazy="joined"` correctly: ```python # models.py — SkillModel (line 2395-2402) # Relationships (lazy="joined" prevents N+1 queries on list_all) items_rel = relationship( "SkillItemModel", back_populates="skill", cascade="all, delete-orphan", order_by="SkillItemModel.item_order", lazy="joined", # ← CORRECT ) ``` The pattern is established and understood — it just wasn't applied to the newer `LifecycleActionModel` and `LifecyclePlanModel` relationships. --- ### Evidence `LifecycleActionModel` in `models.py` — relationships at approximately lines 335–354: ```python invariants_rel = relationship( "ActionInvariantModel", back_populates="action", cascade="all, delete-orphan", order_by="ActionInvariantModel.position", # No lazy= — defaults to lazy="select" (triggers N+1) ) arguments_rel = relationship( "ActionArgumentModel", back_populates="action", cascade="all, delete-orphan", order_by="ActionArgumentModel.position", # No lazy= — defaults to lazy="select" (triggers N+1) ) ``` `ActionRepository.list_all()` in `repositories.py`: ```python rows = ( session.query(LifecycleActionModel) .order_by(LifecycleActionModel.namespace, LifecycleActionModel.name) .all() ) return [row.to_domain() for row in rows] # to_domain() accesses arguments_rel and invariants_rel ``` Each call to `row.to_domain()` accesses `self.arguments_rel` and `self.invariants_rel`, triggering 2 additional SQL queries per action. --- ### Expected Behavior `list_all()` and related bulk-read methods should use **eager loading** (either `lazy="joined"` on the model or `options(selectinload(...))` on the query) to load all child rows in a bounded number of queries regardless of how many actions/plans are returned. ### Actual Behavior For `N` actions returned by `list_all()`, the total query count is `1 + 2*N` (1 for main rows + 2 per action for arguments + invariants). For plans with 3 child tables, it is `1 + 3*N`. --- ### Suggested Fix **Option A (preferred)**: Add `lazy="selectin"` to the child relationships in the ORM models. `selectin` is generally better than `joined` for one-to-many because it avoids Cartesian product inflation on the parent query: ```python # LifecycleActionModel invariants_rel = relationship( "ActionInvariantModel", ... lazy="selectin", # ← 1 extra query for ALL invariants, not N ) arguments_rel = relationship( "ActionArgumentModel", ... lazy="selectin", # ← 1 extra query for ALL arguments, not N ) # LifecyclePlanModel project_links_rel = relationship("PlanProjectModel", ..., lazy="selectin") arguments_rel = relationship("PlanArgumentModel", ..., lazy="selectin") invariants_rel = relationship("PlanInvariantModel", ..., lazy="selectin") ``` **Option B**: Add `options(selectinload(...))` to the specific list queries that need to return domain objects. Note: `lazy="joined"` as used by `SkillModel` is also acceptable but can cause Cartesian product bloat for multiple one-to-many relationships on the same parent. --- ### Category `performance` / `boundary` --- ### TDD Note After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: `@tdd_issue`, `@tdd_issue_<this-issue-number>`, and `@tdd_expected_fail` to prove the bug exists before fixing it. --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: bug-hunter
Author
Owner

Verified — Valid performance bug. N+1 query pattern causes performance degradation with large datasets. MoSCoW: Could Have — performance optimization, not blocking correctness.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Valid performance bug. N+1 query pattern causes performance degradation with large datasets. **MoSCoW: Could Have** — performance optimization, not blocking correctness. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
HAL9000 added this to the v3.5.0 milestone 2026-04-17 08:36:56 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#6459
No description provided.