UAT: Devcontainer activation_state not persisted to database — lifecycle state is lost on process restart, making CLI stop/rebuild non-functional across sessions #2144

Open
opened 2026-04-03 04:25:16 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: feat/persist-devcontainer-activation-state
  • Commit Message: feat(resource): persist devcontainer activation_state to database
  • Milestone: v3.7.0
  • Parent Epic: #825

Bug Report

What Was Tested

The persistence of devcontainer lifecycle state (activation_state) across process restarts.

Expected Behavior (from spec)

Per docs/specification.md lines 35101-35105, the devcontainer-instance resource type defines:

activation_state:
  type: "enum(detected, building, running, stopping, stopped, failed)"
  required: true
  default: "detected"
  description: "Lifecycle state. Starts as 'detected' (lazy); transitions to 'building' then 'running' on first access."

The spec defines activation_state as a required field on the devcontainer-instance resource type. This implies it must be persisted with the resource in the database, just like all other resource fields.
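As a rough illustration of the spec excerpt above, the six lifecycle values and the "detected" default could be modeled as a Python enum. The names ActivationState, DEFAULT_ACTIVATION_STATE, and parse_activation_state are hypothetical and are not taken from the codebase; this is a sketch under those assumptions, not the project's model.

```python
from enum import Enum
from typing import Optional


class ActivationState(str, Enum):
    """Lifecycle values copied from the spec's enum(...) declaration."""

    DETECTED = "detected"
    BUILDING = "building"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    FAILED = "failed"


# The spec declares default: "detected".
DEFAULT_ACTIVATION_STATE = ActivationState.DETECTED


def parse_activation_state(raw: Optional[str]) -> ActivationState:
    """Convert a stored string to the enum.

    None (no stored value) falls back to the spec default; this fallback
    is an assumption of the sketch. Unknown strings raise ValueError.
    """
    if raw is None:
        return DEFAULT_ACTIVATION_STATE
    return ActivationState(raw)
```

Subclassing str keeps the values directly comparable with the strings that would be written to a database column.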

Additionally, the spec's CLI commands (agents resource stop, agents resource rebuild) must work correctly across process restarts — a user should be able to start a container in one session and stop it in another.

Actual Behavior (from code)

Per docs/reference/devcontainer_resources.md lines 249-250:

| In-memory lifecycle state (F20) | Lifecycle state is tracked in an in-memory registry (`_lifecycle_registry` dict), **not** persisted on the resource model or database. State is lost on process restart. CLI `stop`/`rebuild` in a new process cannot see trackers from a previous process. |

The _lifecycle_registry in src/cleveragents/resource/handlers/_devcontainer_internals.py is a plain Python dict that lives only in memory. When the process restarts:

  1. All lifecycle state is lost
  2. agents resource stop in a new process cannot find the running container
  3. agents resource rebuild cannot determine the container's current state
  4. The activation_state field on the resource model is never populated

Root Cause

The lifecycle tracker (ContainerLifecycleTracker) was implemented as an in-memory registry rather than being persisted to the database as a field on the Resource model. The activation_state field defined in the spec is not stored in the database schema.

Code Location

  • src/cleveragents/resource/handlers/_devcontainer_internals.py, line 44: _lifecycle_registry: dict[str, ContainerLifecycleTracker] = {}
  • src/cleveragents/domain/models/core/resource.py: Resource model (missing activation_state field)
  • alembic/versions/: No migration adding activation_state to the resources table

Steps to Reproduce

  1. Activate a devcontainer (for example, local/my-dc) and wait until it reaches the running state
  2. Restart the agents process
  3. Attempt to stop: agents resource stop local/my-dc
  4. Observe: The CLI cannot determine the container's state (defaults to detected), and may incorrectly report "Cannot stop: container is in 'detected' state"

Severity

Medium — The feature works within a single process session but breaks across restarts. This is a significant usability issue for long-running deployments and server mode.

Subtasks

  • Add activation_state field to the Resource domain model
  • Create Alembic migration to add activation_state column to the resources table
  • Update ResourceRepository to persist and hydrate activation_state
  • Update DevcontainerHandler to hydrate ContainerLifecycleTracker from the persisted activation_state on startup
  • Update stop_container() and rebuild_container() to work correctly when the tracker is hydrated from the database
  • Add integration test for stop/rebuild across process restarts
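The persistence subtasks could be prototyped roughly as follows. This sketch uses the standard-library sqlite3 module in place of Alembic and ResourceRepository so that it runs on its own. The id column and the function names are assumptions made for illustration; only the resources table name and the activation_state column come from this issue.

```python
import sqlite3

VALID_STATES = ("detected", "building", "running", "stopping", "stopped", "failed")


def add_activation_state_column(conn: sqlite3.Connection) -> None:
    """Add the column if it is missing (stand-in for an Alembic migration)."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(resources)")}
    if "activation_state" not in columns:
        conn.execute(
            "ALTER TABLE resources "
            "ADD COLUMN activation_state TEXT NOT NULL DEFAULT 'detected'"
        )
    conn.commit()


def persist_state(conn: sqlite3.Connection, resource_id: str, state: str) -> None:
    """Write a lifecycle transition so that it survives a restart."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown activation_state: {state!r}")
    cursor = conn.execute(
        "UPDATE resources SET activation_state = ? WHERE id = ?",
        (state, resource_id),
    )
    if cursor.rowcount == 0:
        raise KeyError(resource_id)
    conn.commit()


def hydrate_state(conn: sqlite3.Connection, resource_id: str) -> str:
    """Read the persisted state, e.g. when rebuilding a tracker on startup."""
    row = conn.execute(
        "SELECT activation_state FROM resources WHERE id = ?",
        (resource_id,),
    ).fetchone()
    if row is None:
        raise KeyError(resource_id)
    return row[0]
```

An integration test along these lines would persist "running" through one connection, close it, open a fresh connection to the same file, and check that hydrate_state still returns "running" before issuing the stop.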

Definition of Done

  • activation_state is persisted to the database for devcontainer-instance resources
  • agents resource stop works correctly after a process restart
  • agents resource rebuild works correctly after a process restart
  • All existing tests continue to pass
  • All nox stages pass
  • Coverage >= 97%
  • The associated PR is merged

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

freemo added this to the v3.7.0 milestone 2026-04-03 04:25:20 +00:00
freemo self-assigned this 2026-04-03 16:58:01 +00:00
Author
Owner

MoSCoW classification: Should Have

Rationale: This issue addresses an important spec requirement or quality improvement. The project should include this fix but it is not strictly essential for the milestone.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Reference
cleveragents/cleveragents-core#2144