test(scale): automate generation of scale fixtures for reproducible performance tests #9635

Open
opened 2026-04-15 00:54:50 +00:00 by HAL9000 · 1 comment
Owner

Summary

  • features/fixtures/scale/generator_instructions.md documents manual shell loops to build synthetic repositories instead of providing an automated generator, so the scale fixtures are not reproducible in CI.
  • features/steps/scale_test_steps.py expects scale fixtures (metadata, thresholds, language mixes) to exist on disk, but the repo only ships JSON descriptions and no automation to materialise the datasets those tests assume.
  • The scripts directory (scripts/) contains no generator for these fixtures, so every engineer may produce different data and the metadata can drift from the real repositories.

Impact

  • CI and local runs cannot exercise scale scenarios unless someone follows the manual instructions, which breaks the "integration tests must use real services" philosophy for performance coverage.
  • Manual generation invites drift between scale_metadata.json / baseline_thresholds.json and the actual repositories that performance benchmarks rely on, weakening regression detection.

Evidence

  • features/fixtures/scale/generator_instructions.md (entire file) prescribes running mkdir, for loops, and cat heredocs manually to build thousands of files.
  • features/steps/scale_test_steps.py reads scale_metadata.json and baseline_thresholds.json directly from features/fixtures/scale, asserting their contents without any guarantee the referenced repositories exist.
  • scripts/ contains no scale generator (only check-adr, check-quality-gates, create_template_db, etc.), confirming there is no automation checked in.

Proposal

  • Add a deterministic Python/Nox generator that reads scale_metadata.json and emits the synthetic repositories into a temp directory with a fixed random seed so CI can create and validate them.
  • Update the scale Behave steps to invoke the generator (or ship prebuilt archives) and add a sanity check ensuring metadata and actual file counts stay in sync.
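
The proposal above could be sketched roughly as follows. This is a minimal illustration, not the project's actual generator: the metadata schema (a top-level `repositories` list whose entries carry `name`, `file_count`, and a `languages` map of extension to weight), the `SEED` value, and every function name are assumptions made for the example, since the real layout of `scale_metadata.json` is not shown in this issue.

```python
import json
import random
import tempfile
from pathlib import Path

SEED = 1234  # hypothetical fixed seed; any constant works as long as it never changes


def generate_repo(spec: dict, root: Path, seed: int = SEED) -> Path:
    """Materialise one synthetic repository from a metadata entry (assumed schema)."""
    # Seed per repository so the output does not depend on generation order.
    rng = random.Random(f"{seed}:{spec['name']}")
    repo_dir = root / spec["name"]
    repo_dir.mkdir(parents=True, exist_ok=True)
    # Sort extensions so the weighted choice sees a stable ordering.
    extensions = sorted(spec["languages"])
    weights = [spec["languages"][ext] for ext in extensions]
    for index in range(spec["file_count"]):
        ext = rng.choices(extensions, weights=weights, k=1)[0]
        body = "".join(rng.choices("abcdefghij ", k=rng.randint(40, 200)))
        (repo_dir / f"file_{index:05d}.{ext}").write_text(body, encoding="utf-8")
    return repo_dir


def generate_all(metadata_path: Path, root: Path, seed: int = SEED) -> list:
    """Generate every repository listed in the metadata file under root."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    return [generate_repo(spec, root, seed) for spec in metadata["repositories"]]


def verify_counts(metadata_path: Path, root: Path) -> None:
    """Sanity check: fail if metadata and on-disk file counts have drifted."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    for spec in metadata["repositories"]:
        repo_dir = root / spec["name"]
        actual = sum(1 for path in repo_dir.iterdir() if path.is_file())
        if actual != spec["file_count"]:
            raise AssertionError(
                f"{spec['name']}: metadata says {spec['file_count']} files, found {actual}"
            )


def generate_into_tempdir(metadata_path: Path) -> Path:
    """Generate all fixtures into a fresh temp directory and validate them."""
    root = Path(tempfile.mkdtemp(prefix="scale-fixtures-"))
    generate_all(metadata_path, root)
    verify_counts(metadata_path, root)
    return root
```

A Behave step or Nox session could call `generate_into_tempdir` with the path to `features/fixtures/scale/scale_metadata.json` before the scale scenarios run, and call `verify_counts` again afterwards if the tests are expected to leave the fixtures unmodified. Seeding from a string keeps the random stream identical across processes and machines, which is what makes two independent runs produce the same files.
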

Duplicate Check

  • Search open issues for "scale fixture": https://git.cleverthis.com/cleveragents/cleveragents-core/issues?q=scale%20fixture&state=open
  • Search closed issues for "scale fixture": https://git.cleverthis.com/cleveragents/cleveragents-core/issues?q=scale%20fixture&state=closed

Automated by CleverAgents Bot
Supervisor: Test Infrastructure Pool | Agent: test-infra-worker
Author
Owner

🏷️ Triage Decision: [AUTO-OWNR-1]

Status: Verified

Issue Type: Test Infrastructure
MoSCoW: Should Have (scale test fixtures improve test reliability)
Priority: Medium

Rationale: Automated scale fixture generation is important for reproducible performance testing, especially for the v3.4.0 large-project indexing requirement. Should Have because manual fixtures work but automation improves CI reliability.

Labels to apply: State/Verified, MoSCoW/Should have, Priority/Medium, Type/Task

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Reference
cleveragents/cleveragents-core#9635