[Bug Hunt][Cycle 2][Resource] Resource Handler Partial Creation Leak on Resolution Failure #7145

Open
opened 2026-04-10 08:10:53 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Branch: bugfix/m3-resource-handler-partial-creation-leak
  • Commit Message: fix(resource): add transactional cleanup for partial resource creation in resolve_binding
  • Milestone: backlog
  • Parent Epic: #7023

Bug Report: Resource Management — Resource Handler Partial Creation Leak on Resolution Failure

Severity Assessment

  • Impact: Resource leaks when handler resolution fails after partial resource creation
  • Likelihood: Low during normal operation, higher during system errors or misconfigurations
  • Priority: Medium

Location

  • File: src/cleveragents/application/services/resource_handler_service.py
  • Method: resolve_binding
  • Lines: 85-120 (resolve_binding method)

Description

The ResourceHandlerService.resolve_binding() method calls handler.resolve() to create resources but provides no cleanup mechanism if the resolution process fails after partially creating resources (sandboxes, connections, file handles, etc.).

Evidence

def resolve_binding(self, binding: BindingResult, plan_id: str, access: str = "read_only") -> BoundResource:
    # ... lookup resource and type ...

    # Step 4: Delegate to handler
    bound = handler.resolve(
        resource=resource,
        plan_id=plan_id,
        slot_name=binding.slot_name,
        sandbox_manager=self._sandbox_manager,
        access=access,
    )
    # If handler.resolve() fails here after partial creation, no cleanup occurs

    # Emit RESOURCE_ACCESSED event
    if self._event_bus is not None:
        try:
            self._event_bus.emit(...)  # This try/except only handles event failures
        except Exception:
            logger.warning("event_bus_emit_failed...")
    return bound

Issue: handler.resolve() can fail after any of the following steps:

  • Creating a sandbox directory
  • Opening database connections
  • Acquiring file locks
  • Initializing temporary resources

In each case the partially created resources are not cleaned up, leading to resource leaks.
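The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `FailingHandler`, its simplified `resolve()` signature, and `leaked_entries` are hypothetical stand-ins that only imitate a handler creating a sandbox directory and then failing.

```python
import os
import tempfile


class FailingHandler:
    """Hypothetical handler: creates a sandbox directory, then fails."""

    def __init__(self, root: str) -> None:
        self.root = root

    def resolve(self, plan_id: str, slot_name: str) -> str:
        # Partial creation: the sandbox directory now exists on disk.
        sandbox = tempfile.mkdtemp(prefix=f"{plan_id}-{slot_name}-", dir=self.root)
        # A later step fails (for example, a connection that cannot be opened).
        raise RuntimeError(f"resolution failed after creating {sandbox}")


def leaked_entries(root: str) -> list[str]:
    """Return whatever is left under root after a failed resolve()."""
    handler = FailingHandler(root)
    try:
        handler.resolve(plan_id="plan-1", slot_name="workspace")
    except RuntimeError:
        pass  # the caller sees the error, but nothing removed the sandbox
    return os.listdir(root)


with tempfile.TemporaryDirectory() as root:
    leftovers = leaked_entries(root)
    print(f"{len(leftovers)} leaked sandbox entries")
```

The caller only ever sees the exception, so it has no reference to the directory that was created before the failure.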

Expected Behavior

Resource creation should be transactional:

  • Success: all resources properly created and returned
  • Failure: any partially created resources are cleaned up automatically
  • No resource leaks during error conditions
  • Proper exception propagation with cleanup

Actual Behavior

When handler.resolve() raises after partial resource creation:

  • Sandbox directories may remain on the filesystem
  • Database connections may stay open
  • File descriptors may leak
  • Temporary files may not be cleaned up
  • Resource registry entries may be inconsistent

Suggested Fix

Implement proper exception handling with resource cleanup:

def resolve_binding(self, binding: BindingResult, plan_id: str, access: str = "read_only") -> BoundResource:
    # ... lookup resource and type ...

    created_resources = []
    try:
        # Step 4: Delegate to handler with tracking
        bound = handler.resolve(
            resource=resource,
            plan_id=plan_id,
            slot_name=binding.slot_name,
            sandbox_manager=self._sandbox_manager,
            access=access,
        )
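        # handler.resolve() returned, so `bound` is fully created; track it in
        # case a later step (such as event emission) raises. Note that this list
        # is still empty if handler.resolve() itself raised part-way through.
        created_resources.append(bound)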

        # Emit event only after successful creation
        self._emit_resource_accessed_event(bound, plan_id)
        return bound

    except Exception:
        # Cleanup any partially created resources
        self._cleanup_partial_resources(created_resources, plan_id)
        raise
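Because the caller only receives `bound` once handler.resolve() returns, state created inside a handler before it raises would probably have to be undone by the handler itself, or through some hook it exposes. A minimal sketch of the first option follows, using `contextlib.ExitStack` from the standard library. The function `resolve_with_rollback` and its `fail_after_sandbox` flag are hypothetical and exist only to illustrate the pattern; they are not part of the existing handler interface.

```python
import os
import shutil
import tempfile
from contextlib import ExitStack


def resolve_with_rollback(root: str, fail_after_sandbox: bool = False) -> str:
    """Hypothetical handler body: each step registers its own undo action."""
    with ExitStack() as stack:
        sandbox = tempfile.mkdtemp(dir=root)
        # Register the undo immediately after the step succeeds.
        stack.callback(shutil.rmtree, sandbox, ignore_errors=True)

        if fail_after_sandbox:
            # Simulated failure in a later step; ExitStack runs the callbacks
            # in reverse order while the exception propagates.
            raise RuntimeError("later step failed")

        # Every step succeeded: discard the callbacks so nothing is undone.
        stack.pop_all()
        return sandbox


with tempfile.TemporaryDirectory() as root:
    try:
        resolve_with_rollback(root, fail_after_sandbox=True)
    except RuntimeError:
        pass
    remaining = os.listdir(root)  # empty: the sandbox was rolled back
```

With this shape the exception still reaches resolve_binding unchanged, and the caller-side try/except only needs to cover failures that happen after `bound` has been returned.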

Category

resource

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.

Subtasks

  • Reproduce the partial creation leak by triggering a failure mid-way through handler.resolve()
  • Implement _cleanup_partial_resources() helper method on ResourceHandlerService
  • Wrap handler.resolve() call in try/except with cleanup on failure
  • Ensure event emission only occurs after successful resource creation
  • Add argument validation guards at method entry per project standards
  • Write BDD Behave scenarios covering: successful resolution, failure with cleanup, partial creation rollback
  • Verify no resource leaks remain after exception paths (sandbox dirs, DB connections, file handles)
  • Update integration tests to cover transactional resource creation behaviour
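One possible shape for the `_cleanup_partial_resources()` helper named in the subtasks is sketched below as a standalone function. It assumes each tracked resource exposes a `release()` method; that interface (`Releasable`) is an assumption made for illustration and may not match what `BoundResource` actually provides.

```python
import logging
from typing import Iterable, Protocol

logger = logging.getLogger(__name__)


class Releasable(Protocol):
    """Assumed interface: anything created during resolution can release itself."""

    def release(self) -> None: ...


def cleanup_partial_resources(resources: Iterable[Releasable], plan_id: str) -> int:
    """Release resources in reverse creation order; return how many releases failed."""
    failures = 0
    for resource in reversed(list(resources)):
        try:
            resource.release()
        except Exception:
            # Log and continue so one failed release does not strand the rest,
            # and so the caller's original exception is the one that propagates.
            failures += 1
            logger.warning(
                "partial_resource_cleanup_failed plan_id=%s", plan_id, exc_info=True
            )
    return failures
```

Swallowing cleanup errors here is a deliberate choice in this sketch: the bare `raise` in resolve_binding then re-raises the resolution failure rather than a secondary cleanup error.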

Definition of Done

  • resolve_binding cleans up all partially created resources when handler.resolve() raises
  • Exception propagates correctly after cleanup (no swallowing)
  • BDD unit scenarios pass for all success and failure paths
  • Integration tests confirm no leaked resources after resolution failure
  • All nox stages pass
  • Coverage >= 97%
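For the "no leaked resources" criteria, a test could compare the sandbox root before and after a failed resolution. The helpers below are a hypothetical sketch (`snapshot` and `leaked_since` are not existing project utilities), and the failing call is a stand-in rather than a real resolve_binding() invocation.

```python
import os
import tempfile


def snapshot(path: str) -> set[str]:
    """Record the directory entries that exist before the operation under test."""
    return set(os.listdir(path))


def leaked_since(path: str, before: set[str]) -> list[str]:
    """Return entries that appeared after the snapshot and were not removed."""
    return sorted(set(os.listdir(path)) - before)


with tempfile.TemporaryDirectory() as sandbox_root:
    before = snapshot(sandbox_root)
    try:
        # Stand-in for a resolution that fails after creating a sandbox.
        os.mkdir(os.path.join(sandbox_root, "plan-1-workspace"))
        raise RuntimeError("resolution failed")
    except RuntimeError:
        pass
    # ['plan-1-workspace'] here, because nothing cleaned up after the failure.
    leaked = leaked_since(sandbox_root, before)
```

Once cleanup is in place, the same check around the real call would be expected to return an empty list.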

Backlog note: This issue was discovered during autonomous operation
on milestone v3.5.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: Bug Hunt | Agent: new-issue-creator

Author
Owner

Verified — Resource leak: partial creation leak on resolution failure. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7145