BUG-HUNT: [resource] AsyncResourceTracker.close_all() closes resources sequentially causing O(n*timeout) shutdown time #7322

Open
opened 2026-04-10 16:48:03 +00:00 by HAL9000 · 0 comments
Owner

Bug Report: Resource Management — Sequential Resource Closure in AsyncResourceTracker

Severity Assessment

  • Impact: With N registered resources and a timeout of T seconds each, close_all() takes up to N×T seconds in the worst case. For example, 50 connections × 30s timeout = 1500 seconds (25 minutes) for shutdown. This causes application hangs on shutdown.
  • Likelihood: Medium. Any service with many registered async resources (connections, MCP clients, LLM sessions) can hit this whenever several of them are slow to close.
  • Priority: Medium

Location

  • File: src/cleveragents/core/async_cleanup.py
  • Function/Class: AsyncResourceTracker.close_all()
  • Lines: 85–126

Description

close_all() iterates over resources with a simple for loop, calling await asyncio.wait_for(resource.close(), timeout=timeout) for each one in sequence. This means:

  1. Resource 1 is closed (waits up to timeout seconds)
  2. Resource 2 is closed (waits up to timeout seconds)
  3. ...
  4. Resource N is closed (waits up to timeout seconds)

Total worst-case shutdown time = N × timeout seconds.

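All resources could instead be closed concurrently under a single global deadline. Note that asyncio.gather() takes no timeout of its own, so it would need to be wrapped in asyncio.wait_for(), or replaced by asyncio.wait(..., timeout=timeout). Either way, shutdown completes in roughly timeout seconds regardless of N.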
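The difference in scaling can be checked with a small timing experiment. The sketch below is not taken from the project: `SlowResource`, `close_sequentially`, `close_concurrently`, and `measure` are hypothetical names, and the 0.05 s timeout is chosen only to keep the run short. It assumes every `close()` hangs past the timeout, which is the worst case described above.

```python
import asyncio
import time


class SlowResource:
    """Hypothetical stand-in for a resource whose close() outlives the timeout."""

    async def close(self) -> None:
        await asyncio.sleep(10)


async def close_sequentially(resources, timeout):
    # Same shape as the reported loop: one wait_for() per resource, in order.
    for resource in resources:
        try:
            await asyncio.wait_for(resource.close(), timeout=timeout)
        except asyncio.TimeoutError:
            pass


async def close_concurrently(resources, timeout):
    # All close() calls start together and share a single deadline.
    tasks = [asyncio.create_task(r.close()) for r in resources]
    _done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()


async def measure(closer, n=5, timeout=0.05):
    # Time one shutdown of n hanging resources with the given strategy.
    start = time.perf_counter()
    await closer([SlowResource() for _ in range(n)], timeout)
    return time.perf_counter() - start


sequential_s = asyncio.run(measure(close_sequentially))
concurrent_s = asyncio.run(measure(close_concurrently))
print(f"sequential: {sequential_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

With 5 resources the sequential run should take about 5 × 0.05 s, while the concurrent run should take about one timeout, independent of the resource count.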

Evidence

# src/cleveragents/core/async_cleanup.py lines 107-126
for name, resource in snapshot.items():
    try:
        await asyncio.wait_for(resource.close(), timeout=timeout)  # ← sequential!
        logger.info("Closed async resource '%s'", name)
    except TimeoutError:
        self.timed_out_resources.append(name)
        logger.warning(...)
    except (Exception, asyncio.CancelledError):
        logger.exception(...)

With 50 resources and timeout=30.0, worst-case shutdown time is 50 × 30 = 1500 seconds.

Expected Behavior

All resources should be closed concurrently, with the total timeout bounded by a single deadline:

# Map each task back to its resource name so lookups are O(1).
tasks = {
    asyncio.create_task(resource.close()): name
    for name, resource in snapshot.items()
}
# asyncio.wait() raises ValueError on an empty collection, so guard that case.
done, pending = (
    await asyncio.wait(tasks, timeout=timeout) if tasks else (set(), set())
)
for task in pending:
    task.cancel()
    self.timed_out_resources.append(tasks[task])
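For reference, here is a self-contained version of the same idea that can be run outside the project. It is a sketch under assumptions, not the actual `AsyncResourceTracker` code: `close_all_concurrently` and `FakeResource` are hypothetical names, the log messages are illustrative, and the function returns the timed-out names rather than storing them on `self`. Unlike the fragment above, it also retrieves exceptions from finished tasks so that a failing `close()` is logged rather than silently dropped.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def close_all_concurrently(resources, timeout):
    """Close every resource in parallel; return the names that missed the deadline."""
    if not resources:
        return []  # asyncio.wait() raises ValueError on an empty collection

    # Map task -> name so a timed-out task can be named without a linear search.
    tasks = {
        asyncio.create_task(resource.close()): name
        for name, resource in resources.items()
    }
    done, pending = await asyncio.wait(list(tasks), timeout=timeout)

    timed_out = []
    for task in pending:
        task.cancel()
        timed_out.append(tasks[task])
        logger.warning("Timed out closing async resource '%s'", tasks[task])

    for task in done:
        name = tasks[task]
        if task.cancelled():
            logger.warning("Close of async resource '%s' was cancelled", name)
        elif task.exception() is not None:
            logger.error(
                "Error closing async resource '%s'", name, exc_info=task.exception()
            )
        else:
            logger.info("Closed async resource '%s'", name)
    return sorted(timed_out)


class FakeResource:
    """Hypothetical resource whose close() takes `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay
        self.closed = False

    async def close(self):
        await asyncio.sleep(self.delay)
        self.closed = True


demo = {"fast": FakeResource(0.0), "slow": FakeResource(10.0)}
timed_out_names = asyncio.run(close_all_concurrently(demo, timeout=0.1))
print(timed_out_names)  # ['slow']
```

One caveat: `task.cancel()` only requests cancellation. A `close()` that suppresses `CancelledError` can keep running after the deadline, so the real fix may need to decide whether to await the cancelled tasks or abandon them.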

Actual Behavior

Resources are closed one by one, each with its own full timeout. Worst-case shutdown time therefore scales linearly with the number of resources.

Suggested Fix

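Replace the sequential loop with a concurrent close under a shared deadline, using either asyncio.wait(..., timeout=timeout) or asyncio.gather() wrapped in asyncio.wait_for(), so all resources are closed in parallel and total shutdown time is bounded by a single timeout. Whichever is chosen, the fix should still record which resources timed out and log per-resource failures, as the current loop does.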
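The asyncio.gather() option might look like the sketch below. It rests on assumptions: `close_all_with_gather` and `FakeResource` are hypothetical names, and the function identifies timed-out resources by checking which tasks ended up cancelled after `asyncio.wait_for()` hit the deadline. It catches `asyncio.TimeoutError`, which is an alias of the built-in `TimeoutError` from Python 3.11 onward and a separate class before that.

```python
import asyncio


async def close_all_with_gather(resources, timeout):
    """Variant using gather(): return names whose close() was cancelled at the deadline."""
    tasks = {
        name: asyncio.ensure_future(resource.close())
        for name, resource in resources.items()
    }
    try:
        # return_exceptions=True keeps one failing close() from aborting the rest.
        await asyncio.wait_for(
            asyncio.gather(*tasks.values(), return_exceptions=True),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        # wait_for() cancelled the gather, which cancelled every unfinished task.
        pass
    return [name for name, task in tasks.items() if task.cancelled()]


class FakeResource:
    """Hypothetical resource whose close() takes `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay

    async def close(self):
        await asyncio.sleep(self.delay)


demo = {"fast": FakeResource(0.0), "slow": FakeResource(10.0)}
cancelled_names = asyncio.run(close_all_with_gather(demo, timeout=0.1))
print(cancelled_names)  # ['slow']
```

Compared with asyncio.wait(), this form reports timeouts only indirectly, through task cancellation, so asyncio.wait() may be the simpler fit when the tracker has to populate timed_out_resources.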

Category

resource

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.5.0 milestone 2026-04-10 19:05:43 +00:00
Reference
cleveragents/cleveragents-core#7322