UAT: benchmark nox session included in default sessions list, causing failures in developer environments #3946

Open
opened 2026-04-06 07:42:25 +00:00 by freemo · 0 comments
Owner

Bug Report

Feature Area: Benchmarks and Performance
Severity: Medium
Found by: UAT automated testing (instance: uat-benchmarks-perf-001)

What Was Tested

The noxfile.py nox.options.sessions list was inspected to verify that the default nox invocation is appropriate for developer workflows.

Expected Behavior

Running nox without arguments should execute sessions that work in any developer environment without special prerequisites. Sessions that require external infrastructure (machine configuration, dedicated benchmark runners) should be excluded from the default session list and run only explicitly or in CI.

Actual Behavior

The benchmark session is included in nox.options.sessions (the default sessions list):

# noxfile.py, lines 846-858
nox.options.sessions = [
    "lint",
    "format",
    "typecheck",
    "security_scan",
    "dead_code",
    "unit_tests",
    "integration_tests",
    "docs",
    "build",
    "benchmark",  # ← ASV benchmarks for performance tracking
    "coverage_report",
]

The benchmark session runs asv machine --machine=forgejo-runner ... which:

  1. Requires a machine named forgejo-runner to be pre-configured in ASV's machine database
  2. Requires the asv tool to be installed (only in .[tests] extras)
  3. Runs a full ASV benchmark suite that can take many minutes
  4. Is designed to run on a dedicated docker-benchmark runner (as specified in ci.yml)

When a developer runs nox without arguments in their local environment, the benchmark session is expected to fail, or at best to produce unreliable timings, because:

  • The machine forgejo-runner is not configured locally
  • The benchmark run requires a stable, isolated environment for reproducible timing
  • The session is designed for CI infrastructure, not developer laptops
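To confirm the first cause on a particular machine, a developer could check whether ASV already has a machine entry named forgejo-runner. The sketch below is not part of the repository: the helper name is hypothetical, and it assumes ASV keeps its machine database in ~/.asv-machine.json as a JSON object keyed by machine name, which should be verified against the installed ASV version.

```python
import json
from pathlib import Path

# Assumption: ASV's default machine database location.
DEFAULT_MACHINE_FILE = Path.home() / ".asv-machine.json"


def asv_machine_configured(name: str, machine_file: Path = DEFAULT_MACHINE_FILE) -> bool:
    """Return True if `name` appears as a machine entry in the ASV machine file.

    Hypothetical helper for local diagnosis only.
    """
    if not machine_file.is_file():
        return False
    try:
        data = json.loads(machine_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Unreadable or malformed file: treat the machine as not configured.
        return False
    # Machine entries are assumed to be JSON objects; other top-level keys
    # (such as a format version number) are ignored.
    return isinstance(data, dict) and isinstance(data.get(name), dict)


if __name__ == "__main__":
    print(asv_machine_configured("forgejo-runner"))
```

If this prints False on a developer laptop, the machine profile that the session hardcodes is not present locally.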

Evidence

From noxfile.py:

@nox.session(python=DEFAULT_PYTHON, reuse_venv=True, venv_backend="uv")
def benchmark(session: nox.Session):
    """Run Airspeed Velocity benchmarks and publish results."""
    session.install("-e", ".[tests]")
    config_path = "asv.conf.json"
    session.run(
        "asv",
        "machine",
        "--machine=forgejo-runner",  # ← hardcoded CI machine name
        "--os=Linux_6.x",
        "--arch=x86_64",
        "--num_cpu=32",
        "--ram=32GB",
        "--cpu=AMD",
        f"--config={config_path}",
    )
    ...

From ci.yml:

benchmark-publish:
    runs-on: docker-benchmark  # ← dedicated benchmark runner

The benchmark session is correctly configured for CI (dedicated runner, specific machine profile), but it should NOT be in the default nox.options.sessions list.

Contrast with e2e_tests

The e2e_tests session correctly documents that it is NOT in the default sessions:

@nox.session(...)
def e2e_tests(session: nox.Session):
    """...
    This session is NOT included in the default ``nox`` run because it
    requires real API keys.  Run explicitly via ``nox -s e2e_tests``.
    """

The benchmark session should follow the same pattern.
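As an illustration of that pattern, the benchmark docstring could carry the same kind of note. This is only a suggested wording, not text from the repository. The @nox.session(...) decorator and the session body are omitted so the sketch runs without nox installed.

```python
def benchmark(session) -> None:
    """Run Airspeed Velocity benchmarks and publish results.

    This session is NOT included in the default ``nox`` run because it
    requires a dedicated benchmark runner and ASV machine configuration.
    Run explicitly via ``nox -s benchmark``.
    """
    # Body omitted: the existing session logic would stay unchanged.


# The note is then visible wherever the docstring is shown.
print(benchmark.__doc__)
```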

Impact

  • Developers running nox without arguments will encounter a failing benchmark session
  • The failure message from ASV about an unconfigured machine is confusing and not actionable for developers
  • The default nox run takes significantly longer than necessary due to the benchmark session
  • CI's status-check job does NOT include benchmark-regression or benchmark-publish in its required jobs, confirming benchmarks are not expected to be part of the standard CI gate

Recommended Fix

Remove "benchmark" from nox.options.sessions in noxfile.py:

nox.options.sessions = [
    "lint",
    "format",
    "typecheck",
    "security_scan",
    "dead_code",
    "unit_tests",
    "integration_tests",
    "docs",
    "build",
    # "benchmark" removed — run explicitly with: nox -s benchmark
    # Benchmarks require a dedicated runner and ASV machine configuration.
    # They are run automatically in CI via the benchmark-publish and
    # benchmark-regression jobs.
    "coverage_report",
]
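To keep the session from drifting back into the defaults, a small regression check could parse noxfile.py without importing it and assert on the list. The following is a minimal sketch under stated assumptions: the helper name default_sessions is hypothetical, the inline sample stands in for the real noxfile.py, and the list is assumed to be a plain literal of strings.

```python
import ast


def default_sessions(noxfile_source: str) -> list[str]:
    """Return the entries assigned to ``nox.options.sessions`` in the source.

    Hypothetical helper. Returns an empty list if no such assignment exists.
    Raises ValueError if the assigned value is not a plain literal.
    """
    tree = ast.parse(noxfile_source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            # ast.unparse renders the assignment target back to dotted text.
            if ast.unparse(target) == "nox.options.sessions":
                return list(ast.literal_eval(node.value))
    return []


# Stand-in for the contents of noxfile.py after the fix (shortened).
SAMPLE_NOXFILE = '''
import nox

nox.options.sessions = [
    "lint",
    "unit_tests",
    # "benchmark" removed; run explicitly with: nox -s benchmark
    "coverage_report",
]
'''

sessions = default_sessions(SAMPLE_NOXFILE)
assert "benchmark" not in sessions, "benchmark must not be a default session"
```

In a real test the source would be read from noxfile.py instead of the inline sample. Parsing with ast avoids executing the noxfile or requiring nox in the test environment.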

Code Location

  • noxfile.py, line 856: "benchmark", # ASV benchmarks for performance tracking

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

Reference
cleveragents/cleveragents-core#3946