Implementation Notes: Quality Automation Setup (Q0-Q2)

2026-02-09: Task Q0.6b Complete - README.md Setup Instructions

Updated README.md Quick Start: added dev extras to pip install, added scripts/setup-dev.sh step.
Updated README.md Developing section: fixed oxt typo -> nox, added quality/security nox sessions, linked to docs/development/quality-automation.md.

2026-02-09: Ruff Cleanup in src/cleveragents/ - StrEnum Migration

Fixed all 26 ruff findings (all UP042: str, Enum -> StrEnum) plus the 12 F401 findings that resulted from the now-unused Enum imports.
Migrated 26 enum classes across 15 files from class Foo(str, Enum) to class Foo(StrEnum).
StrEnum (Python 3.11+) is the modern replacement; project targets Python 3.13.
Semantic difference: for a StrEnum, str(Foo.MEMBER) returns the member's value (e.g., "foo"), whereas for a (str, Enum) class it returns "Foo.MEMBER". Returning the value is the intended behavior for config/JSON string enums.
Verified: no code relies on the old "ClassName.MEMBER" output of str() on enum members; all tests pass (304 scenarios, 0 failures).

2026-02-09: Task Q0.9 Complete - Ruff Lint Findings in features/

Fixed all 200 ruff lint findings in features/ directory -> 0 findings.
Config-level suppressions (168 findings): Added per-file-ignores in pyproject.toml for Behave-specific patterns (F811 for redefined step_impl, E501 for long step decorator strings).
Manual fixes (31 findings across 18 files): SIM115, UP028, SIM117, RUF005, B904, RUF012, SIM105, B007, B018, F821, SIM102, I001.
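
For readers unfamiliar with the rule codes, here is a minimal sketch of what three of the manual fixes (B904, SIM105, SIM117) typically look like after the change. The function names are hypothetical and are not taken from features/.

```python
import contextlib
import os


def parse_port(raw: str) -> int:
    # B904: inside an except block, chain the original error with `from`.
    try:
        return int(raw)
    except ValueError as err:
        raise RuntimeError(f"invalid port: {raw!r}") from err


def remove_if_present(path: str) -> None:
    # SIM105: contextlib.suppress replaces a try/except/pass block.
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)


def copy_text(src: str, dst: str) -> None:
    # SIM117: a single `with` statement replaces two nested ones.
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        fout.write(fin.read())
```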

2026-02-09: Task Q0.8 Complete - Bandit Security Findings Remediation

Fixed all 16 pre-existing bandit findings (2 HIGH, 3 MEDIUM, 11 LOW) -> 0 findings.
Security hardening (HIGH+MEDIUM): Replaced jinja2.Environment with jinja2.sandbox.SandboxedEnvironment, added _validate_code_ast() helper for AST-based pre-validation, added _validate_lambda_ast() helper restricting eval() to lambda-only expressions.
Code quality (LOW): Replaced 6 assert statements with proper if/raise, replaced try/except/pass with contextlib.suppress(Exception), added logging to silent exception handlers.
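
The lambda-only restriction can be illustrated with the standard ast module. This is a sketch of the general technique, not the project's _validate_lambda_ast() helper, whose implementation is not shown in these notes; validate_lambda_source and compile_lambda are hypothetical names.

```python
import ast


def validate_lambda_source(source: str) -> ast.Expression:
    """Parse `source` and raise ValueError unless it is a single lambda expression."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError as err:
        raise ValueError(f"not a valid expression: {source!r}") from err
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("only lambda expressions are allowed")
    return tree


def compile_lambda(source: str):
    """Validate first, then evaluate with an empty builtins namespace."""
    tree = validate_lambda_source(source)
    code = compile(tree, filename="<lambda-source>", mode="eval")
    # Assumption for this sketch: callers need no builtins inside the lambda.
    return eval(code, {"__builtins__": {}}, {})


increment = compile_lambda("lambda x: x + 1")
print(increment(2))  # 3
```

Checking the top-level node only narrows what reaches eval(); it is not a sandbox on its own, so a real validator would probably also inspect the nodes inside the lambda body.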

2026-02-09: Stages Q0, Q1, Q2 Complete - Full Quality Automation Setup

Stage Q0 - Pre-commit Hooks: Created .pre-commit-config.yaml with 12 hooks across 5 categories, covering branch protection, general checks, Ruff, Pyright, Bandit, Vulture, Semgrep, and Commitizen.
Discovery: CI platform is Forgejo (.forgejo/), NOT GitHub. Stage Q1 must use Forgejo Actions.
Discovery: 37 source files have formatting issues, ~27 files have trailing whitespace - pre-existing debt.
Discovery: features/steps/actor_cli_steps.py has many F811 (redefined step_impl) violations - Behave pattern.
Discovery: Average code complexity is A (3.56) across 979 blocks - good baseline.
Discovery: High complexity methods identified: LegacyDataMigrator.migrate_project_data E(37), ProviderRegistry._create_provider_llm C(20), ProviderRegistry.create_ai_provider C(18).
Stage Q1 - CI/CD Pipeline: Extended .forgejo/workflows/ci.yml with security, quality, coverage jobs. Created scripts/check-quality-gates.py.
Stage Q2 - Advanced Automation: Created nightly quality monitoring workflow, ADR compliance checker, PR template. New nox sessions: pre_commit, security_scan, dead_code, complexity, adr_compliance.
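
A quality-gate check of the kind run in CI can be sketched as a comparison of reported metrics against limits. This is not the content of scripts/check-quality-gates.py; the Gate class, the metric names, and every threshold below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    metric: str
    limit: float
    higher_is_better: bool


def failed_gates(metrics: dict[str, float], gates: list[Gate]) -> list[str]:
    """Return one message per gate that is violated or has no reported value."""
    failures = []
    for gate in gates:
        value = metrics.get(gate.metric)
        if value is None:
            failures.append(f"{gate.metric}: no value reported")
        elif gate.higher_is_better and value < gate.limit:
            failures.append(f"{gate.metric}: {value} is below minimum {gate.limit}")
        elif not gate.higher_is_better and value > gate.limit:
            failures.append(f"{gate.metric}: {value} is above maximum {gate.limit}")
    return failures


# Hypothetical thresholds, chosen only for illustration.
GATES = [
    Gate("coverage_percent", 90.0, higher_is_better=True),
    Gate("bandit_findings", 0, higher_is_better=False),
    Gate("ruff_findings", 0, higher_is_better=False),
]

report = {"coverage_percent": 85.0, "bandit_findings": 0}
for message in failed_gates(report, GATES):
    print(message)
```

A CI job could call sys.exit(1) when the returned list is non-empty, so that any violated gate fails the pipeline.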

2026-02-10: Task 10B.4 - Quality Metrics Baseline Established [Brent]

Ran full quality suite via nox to establish current baseline:
- Unit Tests: 105 features, 1613 scenarios, 7555 steps - ALL PASS
- Lint (ruff): 0 findings
- Typecheck (pyright): 0 errors, 0 warnings
- Security (bandit): 0 findings
- Dead Code (vulture): 0 findings
- Complexity (radon): Average A (3.56), 981 blocks analyzed, no grade-F methods
- Coverage: 96% (9860 statements, 269 missing, 2852 branches, 213 branch-miss)
Fixed pre-existing test failure caused by Rich table columns wrapping at narrow terminal widths; the test setup now patches the console width to 200.
Fixed missing dependency: added langchain-anthropic>=0.2.0 to pyproject.toml.
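
The 96% coverage figure in the baseline is consistent with combining statement and branch counts, as the following arithmetic sketch shows. It assumes that "branch-miss" counts branches that were never taken and that the total is computed the way coverage.py combines statements and branches; neither assumption is confirmed by these notes.

```python
statements, missed_statements = 9860, 269
branches, missed_branches = 2852, 213

# Combined coverage: executed statements plus taken branches over all of both.
covered = (statements - missed_statements) + (branches - missed_branches)
total = statements + branches
combined = 100 * covered / total

# Statement-only coverage, for comparison.
statement_only = 100 * (statements - missed_statements) / statements

print(f"{combined:.2f}")        # 96.21
print(f"{statement_only:.2f}")  # 97.27
```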

(Migrated from docs/implementation-notes.md)
