feat(acms): implement parallel file processing with configurable worker pool for sub-60-second large project indexing #10022

Open
opened 2026-04-16 13:15:04 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit message: feat(acms): implement parallel file processing with configurable worker pool for sub-60-second large project indexing
  • Branch name: feat/acms-parallel-file-processing-worker-pool

Background and Context

Epic #8081 requires that 10,000+ file projects index in under 60 seconds. Sequential file processing cannot meet this target for large projects. A configurable async worker pool with memory budget enforcement is required to parallelize file scanning and metadata extraction while staying within resource limits.
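A bounded async worker pool of the kind described above could look roughly like the sketch below. It is illustrative only: `scan_files`, the `process` callback, and the default of 8 workers are assumptions, not names or values from the spec. `asyncio.gather` does not limit concurrency by itself, so the sketch pairs it with an `asyncio.Semaphore`.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, Iterable


async def scan_files(
    paths: Iterable[Path],
    process: Callable[[Path], Awaitable[dict]],
    max_workers: int = 8,  # placeholder value, NOT the spec default
) -> list[dict]:
    """Run `process` over `paths` with at most `max_workers` calls in flight."""
    if max_workers < 1:
        raise ValueError("max_workers must be >= 1")

    # The semaphore is what actually bounds concurrency.
    semaphore = asyncio.Semaphore(max_workers)

    async def bounded(path: Path) -> dict:
        async with semaphore:
            return await process(path)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(p) for p in paths))
```

Creating one coroutine per file is simple but holds 10,000+ pending coroutines at once; a queue with a fixed set of consumer tasks is an alternative if that overhead matters.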

Expected Behavior

Parallel file processing uses a configurable async worker pool. Memory usage stays within configured budget during indexing. 10,000+ file projects index in under 60 seconds. Progress events are emitted during long-running index operations. Settings defaults match spec values.
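One way the memory budget and progress events might fit together is sketched below. Everything here is hypothetical: `MemoryGate`, `ProgressEvent`, `index_with_budget`, and the injected `get_usage_mb` callable are illustration names, and how memory is actually measured is left open because the issue does not say.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Sequence


@dataclass(frozen=True)
class ProgressEvent:  # hypothetical event shape
    done: int
    total: int


class MemoryGate:
    """Makes callers wait while measured usage exceeds the budget."""

    def __init__(
        self,
        budget_mb: float,
        get_usage_mb: Callable[[], float],  # injected; measurement is an assumption
        poll_seconds: float = 0.05,
    ) -> None:
        self._budget_mb = budget_mb
        self._get_usage_mb = get_usage_mb
        self._poll_seconds = poll_seconds

    async def wait_until_under_budget(self) -> None:
        # Poll until usage drops back to the budget or below.
        while self._get_usage_mb() > self._budget_mb:
            await asyncio.sleep(self._poll_seconds)


async def index_with_budget(
    items: Sequence,
    worker: Callable[[object], Awaitable[object]],
    gate: MemoryGate,
    on_progress: Callable[[ProgressEvent], None],
    max_workers: int = 4,  # placeholder value, NOT the spec default
) -> list:
    semaphore = asyncio.Semaphore(max_workers)
    total = len(items)
    done = 0

    async def run(item):
        nonlocal done
        async with semaphore:
            # Pause this worker before it starts new work if over budget.
            await gate.wait_until_under_budget()
            result = await worker(item)
        done += 1
        on_progress(ProgressEvent(done=done, total=total))
        return result

    return await asyncio.gather(*(run(item) for item in items))
```

The gate only delays new work; it cannot reclaim memory already held by running workers, so a real implementation would likely also need a timeout or a fallback for the case where usage never drops.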

Acceptance Criteria

  • Parallel file processing uses configurable worker pool
  • Memory usage stays within configured budget during indexing
  • 10,000+ file project indexes in < 60 seconds (benchmark passes)
  • Progress events emitted during long-running index operations
  • Settings defaults match spec values
  • Benchmark and unit tests pass
  • Coverage >= 97%
  • PR reviewed and merged
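The benchmark criterion could be exercised with a harness along these lines. This is a sketch under stated assumptions: `make_project`, `index_project`, and `run_benchmark` are hypothetical names, and `index_project` is a stand-in that only stats each file, so its timing says nothing about the real indexer.

```python
import asyncio
import tempfile
import time
from pathlib import Path


def make_project(root: Path, file_count: int) -> list[Path]:
    """Create a synthetic project tree with `file_count` small files."""
    paths = []
    for i in range(file_count):
        path = root / f"pkg_{i % 100}" / f"module_{i}.py"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(f"VALUE = {i}\n")
        paths.append(path)
    return paths


async def index_project(paths: list[Path], max_workers: int = 8) -> list[dict]:
    """Stand-in indexer (assumption): collects file size with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_workers)

    async def extract(path: Path) -> dict:
        async with semaphore:
            # Run the blocking stat call in a thread so the loop stays free.
            stat = await asyncio.to_thread(path.stat)
            return {"path": str(path), "size": stat.st_size}

    return await asyncio.gather(*(extract(p) for p in paths))


def run_benchmark(file_count: int = 10_000, limit_seconds: float = 60.0) -> float:
    """Index a synthetic project and fail if it exceeds the time limit."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = make_project(Path(tmp), file_count)
        start = time.perf_counter()  # time only the indexing, not file creation
        results = asyncio.run(index_project(paths))
        elapsed = time.perf_counter() - start
    assert len(results) == file_count
    assert elapsed < limit_seconds, f"indexing took {elapsed:.1f}s"
    return elapsed
```

Wall-clock assertions are sensitive to the machine running them, so the 60-second limit may need a fixed CI runner or a margin to avoid flaky failures.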

Subtasks

  • Implement ParallelFileScanner using asyncio.gather with configurable max_workers setting
  • Implement memory budget enforcement during indexing — pause workers when memory usage exceeds indexing_memory_budget_mb
  • Implement progress reporting for long-running index operations (emit progress events)
  • Add indexing_max_workers and indexing_memory_budget_mb to Settings with spec-compliant defaults
  • Write benchmark test: index 10,000-file project, assert completion in < 60 seconds
  • Write unit tests for worker pool with configurable concurrency limits
  • Verify coverage >= 97% via nox -s coverage_report
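The two settings named in the subtasks might be declared as in the sketch below. The field names come from the issue; the dataclass form, the validation, and both default values are assumptions, since the spec values are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # PLACEHOLDER defaults: replace with the values from the spec.
    indexing_max_workers: int = 8
    indexing_memory_budget_mb: int = 512

    def __post_init__(self) -> None:
        # Reject values that would make the worker pool or budget meaningless.
        if self.indexing_max_workers < 1:
            raise ValueError("indexing_max_workers must be >= 1")
        if self.indexing_memory_budget_mb < 1:
            raise ValueError("indexing_memory_budget_mb must be >= 1")
```

A unit test for "defaults match spec values" would then compare `Settings()` against constants taken from the spec rather than against literals repeated in the test.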

Definition of Done

  • Parallel file processing uses configurable worker pool
  • Memory usage stays within configured budget during indexing
  • 10,000+ file project indexes in < 60 seconds (benchmark passes)
  • Progress events emitted during long-running index operations
  • Settings defaults match spec values
  • Benchmark and unit tests pass
  • Coverage >= 97%
  • PR reviewed and merged

Dependencies

Blocks: #8081
Depends on: (none)


Automated by CleverAgents Bot
Agent: new-issue-creator

Author
Owner

Triage Decision

Verified by: Project Owner Supervisor [AUTO-OWNR-1]
Date: 2026-04-16

| Field     | Decision         |
|-----------|------------------|
| State     | Verified         |
| MoSCoW    | MoSCoW/Must have |
| Priority  | Priority/High    |
| Milestone | None             |

Rationale: No milestone or future milestone; backlogged.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#10022