cleveragents/cleveragents-core

Fork 3

refactor(a2a): update test files for ACP to A2A rename #737

Merged

hamza.khyari merged 1 commit from refactor/m7-acp-to-a2a-tests into master

2026-03-14 01:35:34 +00:00

hamza.khyari commented

2026-03-12 16:23:45 +00:00

Member

Summary

Fixes 2 stale "ACP" comment references in vulture_whitelist.py (lines 633, 957) that were missed by the source rename in PR #705
Exhaustive audit confirmed all other ACP→A2A renames (13 test file renames, ~626 content replacements across ~27 files) were already completed by PR #705

Audit Results

All 6 acceptance criteria from Issue #689 are satisfied on origin/master:

#	Criterion	Status
1	All 13 test files renamed	PASS
2	All imports updated (cleveragents.acp → cleveragents.a2a)	PASS
3	Gherkin scenario names updated	PASS
4	Robot test names updated	PASS
5	JSON fixture content updated	PASS
6	Zero ACP references in test files	PASS

Only remaining gaps were 2 stale comments in vulture_whitelist.py, fixed in this PR.

Closes #689

## Summary - Fixes 2 stale "ACP" comment references in `vulture_whitelist.py` (lines 633, 957) that were missed by the source rename in PR #705 - Exhaustive audit confirmed all other ACP→A2A renames (13 test file renames, ~626 content replacements across ~27 files) were already completed by PR #705 ## Audit Results All 6 acceptance criteria from Issue #689 are satisfied on `origin/master`: | # | Criterion | Status | |---|---|---| | 1 | All 13 test files renamed | PASS | | 2 | All imports updated (cleveragents.acp → cleveragents.a2a) | PASS | | 3 | Gherkin scenario names updated | PASS | | 4 | Robot test names updated | PASS | | 5 | JSON fixture content updated | PASS | | 6 | Zero ACP references in test files | PASS | Only remaining gaps were 2 stale comments in `vulture_whitelist.py`, fixed in this PR. Closes #689

CoreRasurae reviewed

2026-03-12 17:12:41 +00:00

CoreRasurae left a comment

Code Review Report -- PR #737 / Branch `refactor/m7-acp-to-a2a-tests`

Scope: Full review of the ACP-to-A2A rename (PR #705 commit ec0b7631, PR #737 commit 54e52b57), cross-referenced against docs/specification.md and issue #689 acceptance criteria.

Review cycles: 3 global passes covering bugs, security, specification compliance, test flaws, test coverage gaps, test isolation, and performance.

Rename status: The ACP-to-A2A rename in source code, test files, feature files, Robot files, and JSON fixtures is complete. Zero stale Acp* class names, acp.* structlog events, or acp_version field names remain in code. The vulture whitelist comment fix in this PR is correct.

Summary Table

Category	High	Medium	Low	Total
Bugs	--	2	2	4
Specification Compliance	2	3	1	6
Test Flaws	--	1	5	6
Test Coverage Gaps	1	5	7	13
Test Isolation	--	2	1	3
Performance	--	1	1	2
Security	--	--	4	4
Total	3	14	21	38

HIGH Severity

SC-1. Error codes are strings, not JSON-RPC 2.0 integers

File: src/cleveragents/a2a/errors.py:32-39, src/cleveragents/a2a/models.py:53

The specification (specification.md:43211-43229) requires JSON-RPC 2.0 integer error codes:

-32001 Entity not found       -32005 Already exists
-32002 Auth required           -32006 Budget exceeded
-32003 Forbidden               -32007 Version mismatch
-32004 Invalid state            -32008 Plan error

The implementation uses string codes ("NOT_FOUND", "PLAN_ERROR", etc.) and A2aErrorDetail.code is typed as str. Clients dispatching on integer codes per the spec will break.

SC-2. `map_domain_error` missing 2 spec-required exception mappings

File: src/cleveragents/a2a/errors.py:100-139

The specification maps DuplicateEntityError to -32005 and BudgetExceededError to -32006. Neither exception type appears in map_domain_error. These will silently fall through to the generic INTERNAL_ERROR catch-all.

TC-1. `map_domain_error` -- 3 of 8 exception branches untested

File: src/cleveragents/a2a/errors.py:131-136

No test exercises:

AuthenticationError -> AUTH_ERROR
AuthorizationError -> FORBIDDEN
ConfigurationError -> CONFIGURATION_ERROR

These are production error paths with zero coverage.

MEDIUM Severity

B-1. `publish()` iterates subscription dict directly -- re-entrancy crash

File: src/cleveragents/a2a/events.py:56

for sub_id, callback in self._subscriptions.items():

If any callback calls subscribe_local(), unsubscribe(), or close(), the dict is mutated during iteration, raising RuntimeError: dictionary changed size during iteration. This is a single-threaded re-entrancy bug, not just a threading concern.

Fix: for sub_id, callback in list(self._subscriptions.items()):

B-2. `subscribe_local()` succeeds on a closed queue

File: src/cleveragents/a2a/events.py:65-72

After close() sets _is_closed = True and clears _subscriptions, calling subscribe_local() silently adds a new entry to the cleared dict. The subscription will never receive events. No guard checks _is_closed.

SC-3. Operation names don't match specification convention

File: src/cleveragents/a2a/facade.py:57-69

The specification (specification.md:43091-43116) defines standard operations as session/new, session/load, session/prompt, etc. and extensions as _cleveragents/plan/use, _cleveragents/plan/execute, etc. The implementation uses a different naming convention (session.create, plan.create, plan.execute, etc.) that doesn't align with either.

SC-4. No Agent Card model or `/.well-known/agent.json` discovery

The specification (specification.md:43187-43207) requires Agent Card discovery. No AgentCard model or discovery endpoint exists anywhere in the codebase.

SC-5. `A2A-Version` HTTP header not implemented

File: src/cleveragents/a2a/versioning.py

Version negotiation logic exists but is not wired to the HTTP transport layer. The specification (specification.md:43234) requires A2A-Version header on requests/responses.

TC-2. 5 `A2aEventQueue` input validation guards untested

File: src/cleveragents/a2a/events.py:48-49, 67-68, 76-77, 85-86, 111-112

Guards for publish() with non-A2aEvent, subscribe_local() with non-callable, unsubscribe() with empty string, get_events() with invalid limit, and subscribe_remote() with empty endpoint are all untested.

TC-3. `_cleanup_session_devcontainers` not tested via facade

File: src/cleveragents/a2a/facade.py:252-279

The session.close handler calls _cleanup_session_devcontainers() which dynamically imports CleanupService. This entire code path (deferred import, try/except, logging) is never exercised by any A2A test.

TC-4. `A2aHttpTransport` and `A2aVersionNegotiator` input guards untested

Files: src/cleveragents/a2a/transport.py:30-31, 47-48, src/cleveragents/a2a/versioning.py:31-32, 44-45

Validation guards for send() with non-A2aRequest, connect() with empty URL, negotiate() with empty string, and is_supported() with empty string have zero test coverage.

TC-5. `map_domain_error` catch-all paths untested

File: src/cleveragents/a2a/errors.py:120-121, 137-139

The TypeError guard for non-Exception input (line 120), the generic CleverAgentsError fallback (line 137), and the plain Exception catch-all (line 139) are untested.

TC-6. Weak error assertions -- no error code validation

File: features/a2a_facade_wiring.feature:23-26, 39-42

session.close and plan.create validation failure tests only assert status == "error" without checking the error code or message. A bug that returns the wrong error code would pass.

P-1. Unbounded `_events` list growth -- memory leak

File: src/cleveragents/a2a/events.py:31, 50

self._events grows without limit. No max size, TTL, or eviction policy. In a long-running process, this accumulates all events forever.

TF-1. No-op assertion -- `step_service_registered` always passes

File: features/steps/a2a_facade_steps.py:57-59

def step_service_registered(context: Context) -> None:
    pass  # verifies nothing

Should assert the service is actually registered in the facade's service dict.

TI-1. Benchmark queue grows unboundedly across iterations

File: benchmarks/a2a_facade_bench.py:107-118

time_publish_single and time_publish_100 append to the queue across ASV iterations without clearing. Memory bloats and get_events() becomes progressively slower, producing non-representative benchmark results.

TI-2. Benchmark subscriber accumulation

File: benchmarks/a2a_facade_bench.py:117-118

time_subscribe_local adds a new subscriber per ASV iteration. After thousands of iterations, every publish() call fans out to thousands of callbacks, completely skewing timing results.

LOW Severity

B-3. 6 facade stub handlers skip validation

File: src/cleveragents/a2a/facade.py:242, 288, 303, 313, 328, 342

When services are None, stub handlers for session.close, plan.create, plan.execute, plan.status, plan.diff, and plan.apply return success responses without validating required parameters. The service-present paths validate. This creates behavioral divergence between modes -- tests written against stubs accept empty plan_id/session_id that would fail with real services.

B-4. `A2aOperationNotFoundError` bypasses error response wrapping

File: src/cleveragents/a2a/facade.py:138-139

All other exceptions get wrapped in A2aResponse(status="error"). This error is re-raised directly, forcing callers to handle two different error shapes.

S-1. No length/charset constraint on operation name

File: src/cleveragents/a2a/models.py:76-81

The _operation_non_empty validator only checks non-empty. An operation string with \n characters could inject fake log lines via structlog output. Recommend: re.match(r'^[a-z][a-z0-9_.]{0,63}$', value).

S-2. `params` and `auth` fields accept arbitrary unvalidated data

File: src/cleveragents/a2a/models.py:73-74

auth: dict[str, Any] | None has no schema or key constraints. Low risk in local mode but a deserialization concern if the model handles external HTTP input.

S-3. No SSRF protection on `server_url`

File: src/cleveragents/a2a/server_config.py:34-42

Validates http/https scheme but accepts http://169.254.169.254/, http://localhost:6379/, etc. Design gap to fix before server mode goes live.

S-4. `str(exc)` in error mapping could leak sensitive info

File: src/cleveragents/a2a/errors.py:124-139

If domain exceptions include tokens/credentials in their message strings, str(exc) propagates them into A2aResponse.error.message and into logs at facade.py:155.

docs/adr/ADR-047-acp-standard-adoption.md should be ADR-047-a2a-standard-adoption.md
docs/adr/ADR-026-agent-client-protocol.md should be ADR-026-agent-to-agent-protocol.md
Referenced in mkdocs.yml:70 and specification.md at lines 55, 23304, 23373, 42992, 43253

TF-2. Trivially true assertions in facade steps

File: features/steps/a2a_facade_steps.py:399-401, 425-427

assert context.response is not None and assert context.error_detail is not None can never fail once the preceding step assigns them.

TF-3. Type-coercing assertion hides type mismatch

File: features/a2a_facade_wiring.feature:93, step at a2a_facade_wiring_steps.py:213-215

Uses str(actual) == value which converts True (bool) to "True" (string), masking type-correctness issues. Compare to robot/helper_a2a_facade_wiring.py:196 which correctly uses is True.

TF-4. Tautological protocol-default-body tests

File: features/steps/a2a_clients_coverage_boost_steps.py:133-137

Protocol default bodies are ... (Ellipsis) which is a no-op. Testing "no exception was thrown" is trivially guaranteed for 7 scenarios.

TF-5. Hardcoded magic number 11 for operation count

File: robot/helper_a2a_facade.py:95-96

len(ops) == 11 breaks silently if _SUPPORTED_OPERATIONS changes. Should derive from source of truth.

TF-6. Mock `delete` not verified with correct argument

File: features/steps/a2a_facade_wiring_steps.py:77-81

session.close test doesn't assert svc.delete.assert_called_once_with("MOCK-SESSION-001"). A bug that passes the wrong session_id would be undetected.

TC-7 through TC-13. Additional coverage gaps (low priority)

Plan stub-mode operations (plan.execute/status/diff/apply without service) untested
Registry filter parameters (namespace, type_name) untested
EventQueue callback error handling (failing callback not preventing delivery) untested
Weak stub-fallback assertions (missing plan_id, tools list checks)
close() double-call idempotency untested
Publish-after-close with active subscribers untested
map_domain_error with generic RuntimeError untested

TI-3. Triplicated mock infrastructure

Files: a2a_facade_coverage_boost_steps.py:34-54, a2a_facade_wiring_steps.py:42-90, robot/helper_a2a_facade_wiring.py:36-95

Three files define near-identical _MockPlanIdentity, _MockPlan, and builder functions. Divergence risk on service contract changes.

P-2. `disconnect()` raises in teardown contexts

File: src/cleveragents/a2a/transport.py:41-47

A2aHttpTransport.disconnect() raises A2aNotAvailableError. If called in a finally block, this masks the original exception. Consider returning silently like is_connected() returns False.

Positive Findings

Rename completeness: All Python, Feature, Robot, JSON, and config files use correct A2A naming. Zero stale ACP references in code.
CLI integration: server.py, lsp.py, and lsp/server.py all have correct A2A imports and no stale references.
No import cycles: The deferred CleanupService import in facade.py is safe.
Handler map caching: The PERF-1 fix with invalidation on register_service is correctly implemented.
Error hierarchy: Exception class hierarchy mirrors the project-wide CleverAgentsError pattern correctly.

Reviewed by automated code review agent. 3 global review cycles performed across all categories (bugs, security, spec compliance, test flaws, test coverage, test isolation, performance). Cross-referenced against docs/specification.md (sections 43087-43248 for A2A architecture) and issue #689 acceptance criteria.

# Code Review Report -- PR #737 / Branch `refactor/m7-acp-to-a2a-tests` **Scope:** Full review of the ACP-to-A2A rename (PR #705 commit `ec0b7631`, PR #737 commit `54e52b57`), cross-referenced against `docs/specification.md` and issue #689 acceptance criteria. **Review cycles:** 3 global passes covering bugs, security, specification compliance, test flaws, test coverage gaps, test isolation, and performance. **Rename status:** The ACP-to-A2A rename in source code, test files, feature files, Robot files, and JSON fixtures is **complete**. Zero stale `Acp*` class names, `acp.*` structlog events, or `acp_version` field names remain in code. The vulture whitelist comment fix in this PR is correct. --- ## Summary Table | Category | High | Medium | Low | Total | |---|---|---|---|---| | Bugs | -- | 2 | 2 | 4 | | Specification Compliance | 2 | 3 | 1 | 6 | | Test Flaws | -- | 1 | 5 | 6 | | Test Coverage Gaps | 1 | 5 | 7 | 13 | | Test Isolation | -- | 2 | 1 | 3 | | Performance | -- | 1 | 1 | 2 | | Security | -- | -- | 4 | 4 | | **Total** | **3** | **14** | **21** | **38** | --- ## HIGH Severity ### SC-1. Error codes are strings, not JSON-RPC 2.0 integers **File:** `src/cleveragents/a2a/errors.py:32-39`, `src/cleveragents/a2a/models.py:53` The specification (`specification.md:43211-43229`) requires JSON-RPC 2.0 **integer** error codes: ``` -32001 Entity not found -32005 Already exists -32002 Auth required -32006 Budget exceeded -32003 Forbidden -32007 Version mismatch -32004 Invalid state -32008 Plan error ``` The implementation uses **string** codes (`"NOT_FOUND"`, `"PLAN_ERROR"`, etc.) and `A2aErrorDetail.code` is typed as `str`. Clients dispatching on integer codes per the spec will break. ### SC-2. `map_domain_error` missing 2 spec-required exception mappings **File:** `src/cleveragents/a2a/errors.py:100-139` The specification maps `DuplicateEntityError` to `-32005` and `BudgetExceededError` to `-32006`. Neither exception type appears in `map_domain_error`. These will silently fall through to the generic `INTERNAL_ERROR` catch-all. ### TC-1. `map_domain_error` -- 3 of 8 exception branches untested **File:** `src/cleveragents/a2a/errors.py:131-136` No test exercises: - `AuthenticationError` -> `AUTH_ERROR` - `AuthorizationError` -> `FORBIDDEN` - `ConfigurationError` -> `CONFIGURATION_ERROR` These are production error paths with zero coverage. --- ## MEDIUM Severity ### B-1. `publish()` iterates subscription dict directly -- re-entrancy crash **File:** `src/cleveragents/a2a/events.py:56` ```python for sub_id, callback in self._subscriptions.items(): ``` If any callback calls `subscribe_local()`, `unsubscribe()`, or `close()`, the dict is mutated during iteration, raising `RuntimeError: dictionary changed size during iteration`. This is a **single-threaded re-entrancy bug**, not just a threading concern. **Fix:** `for sub_id, callback in list(self._subscriptions.items()):` ### B-2. `subscribe_local()` succeeds on a closed queue **File:** `src/cleveragents/a2a/events.py:65-72` After `close()` sets `_is_closed = True` and clears `_subscriptions`, calling `subscribe_local()` silently adds a new entry to the cleared dict. The subscription will never receive events. No guard checks `_is_closed`. ### SC-3. Operation names don't match specification convention **File:** `src/cleveragents/a2a/facade.py:57-69` The specification (`specification.md:43091-43116`) defines standard operations as `session/new`, `session/load`, `session/prompt`, etc. and extensions as `_cleveragents/plan/use`, `_cleveragents/plan/execute`, etc. The implementation uses a different naming convention (`session.create`, `plan.create`, `plan.execute`, etc.) that doesn't align with either. ### SC-4. No Agent Card model or `/.well-known/agent.json` discovery The specification (`specification.md:43187-43207`) requires Agent Card discovery. No `AgentCard` model or discovery endpoint exists anywhere in the codebase. ### SC-5. `A2A-Version` HTTP header not implemented **File:** `src/cleveragents/a2a/versioning.py` Version negotiation logic exists but is not wired to the HTTP transport layer. The specification (`specification.md:43234`) requires `A2A-Version` header on requests/responses. ### TC-2. 5 `A2aEventQueue` input validation guards untested **File:** `src/cleveragents/a2a/events.py:48-49, 67-68, 76-77, 85-86, 111-112` Guards for `publish()` with non-A2aEvent, `subscribe_local()` with non-callable, `unsubscribe()` with empty string, `get_events()` with invalid limit, and `subscribe_remote()` with empty endpoint are all untested. ### TC-3. `_cleanup_session_devcontainers` not tested via facade **File:** `src/cleveragents/a2a/facade.py:252-279` The `session.close` handler calls `_cleanup_session_devcontainers()` which dynamically imports `CleanupService`. This entire code path (deferred import, try/except, logging) is never exercised by any A2A test. ### TC-4. `A2aHttpTransport` and `A2aVersionNegotiator` input guards untested **Files:** `src/cleveragents/a2a/transport.py:30-31, 47-48`, `src/cleveragents/a2a/versioning.py:31-32, 44-45` Validation guards for `send()` with non-A2aRequest, `connect()` with empty URL, `negotiate()` with empty string, and `is_supported()` with empty string have zero test coverage. ### TC-5. `map_domain_error` catch-all paths untested **File:** `src/cleveragents/a2a/errors.py:120-121, 137-139` The `TypeError` guard for non-Exception input (line 120), the generic `CleverAgentsError` fallback (line 137), and the plain `Exception` catch-all (line 139) are untested. ### TC-6. Weak error assertions -- no error code validation **File:** `features/a2a_facade_wiring.feature:23-26, 39-42` `session.close` and `plan.create` validation failure tests only assert `status == "error"` without checking the error code or message. A bug that returns the wrong error code would pass. ### P-1. Unbounded `_events` list growth -- memory leak **File:** `src/cleveragents/a2a/events.py:31, 50` `self._events` grows without limit. No max size, TTL, or eviction policy. In a long-running process, this accumulates all events forever. ### TF-1. No-op assertion -- `step_service_registered` always passes **File:** `features/steps/a2a_facade_steps.py:57-59` ```python def step_service_registered(context: Context) -> None: pass # verifies nothing ``` Should assert the service is actually registered in the facade's service dict. ### TI-1. Benchmark queue grows unboundedly across iterations **File:** `benchmarks/a2a_facade_bench.py:107-118` `time_publish_single` and `time_publish_100` append to the queue across ASV iterations without clearing. Memory bloats and `get_events()` becomes progressively slower, producing non-representative benchmark results. ### TI-2. Benchmark subscriber accumulation **File:** `benchmarks/a2a_facade_bench.py:117-118` `time_subscribe_local` adds a new subscriber per ASV iteration. After thousands of iterations, every `publish()` call fans out to thousands of callbacks, completely skewing timing results. --- ## LOW Severity ### B-3. 6 facade stub handlers skip validation **File:** `src/cleveragents/a2a/facade.py:242, 288, 303, 313, 328, 342` When services are `None`, stub handlers for `session.close`, `plan.create`, `plan.execute`, `plan.status`, `plan.diff`, and `plan.apply` return success responses without validating required parameters. The service-present paths validate. This creates behavioral divergence between modes -- tests written against stubs accept empty `plan_id`/`session_id` that would fail with real services. ### B-4. `A2aOperationNotFoundError` bypasses error response wrapping **File:** `src/cleveragents/a2a/facade.py:138-139` All other exceptions get wrapped in `A2aResponse(status="error")`. This error is re-raised directly, forcing callers to handle two different error shapes. ### S-1. No length/charset constraint on operation name **File:** `src/cleveragents/a2a/models.py:76-81` The `_operation_non_empty` validator only checks non-empty. An operation string with `\n` characters could inject fake log lines via structlog output. Recommend: `re.match(r'^[a-z][a-z0-9_.]{0,63}$', value)`. ### S-2. `params` and `auth` fields accept arbitrary unvalidated data **File:** `src/cleveragents/a2a/models.py:73-74` `auth: dict[str, Any] | None` has no schema or key constraints. Low risk in local mode but a deserialization concern if the model handles external HTTP input. ### S-3. No SSRF protection on `server_url` **File:** `src/cleveragents/a2a/server_config.py:34-42` Validates http/https scheme but accepts `http://169.254.169.254/`, `http://localhost:6379/`, etc. Design gap to fix before server mode goes live. ### S-4. `str(exc)` in error mapping could leak sensitive info **File:** `src/cleveragents/a2a/errors.py:124-139` If domain exceptions include tokens/credentials in their message strings, `str(exc)` propagates them into `A2aResponse.error.message` and into logs at `facade.py:155`. ### SC-6. 2 stale ADR filenames (outside PR scope but related) - `docs/adr/ADR-047-acp-standard-adoption.md` should be `ADR-047-a2a-standard-adoption.md` - `docs/adr/ADR-026-agent-client-protocol.md` should be `ADR-026-agent-to-agent-protocol.md` - Referenced in `mkdocs.yml:70` and `specification.md` at lines 55, 23304, 23373, 42992, 43253 ### TF-2. Trivially true assertions in facade steps **File:** `features/steps/a2a_facade_steps.py:399-401, 425-427` `assert context.response is not None` and `assert context.error_detail is not None` can never fail once the preceding step assigns them. ### TF-3. Type-coercing assertion hides type mismatch **File:** `features/a2a_facade_wiring.feature:93`, step at `a2a_facade_wiring_steps.py:213-215` Uses `str(actual) == value` which converts `True` (bool) to `"True"` (string), masking type-correctness issues. Compare to `robot/helper_a2a_facade_wiring.py:196` which correctly uses `is True`. ### TF-4. Tautological protocol-default-body tests **File:** `features/steps/a2a_clients_coverage_boost_steps.py:133-137` Protocol default bodies are `...` (Ellipsis) which is a no-op. Testing "no exception was thrown" is trivially guaranteed for 7 scenarios. ### TF-5. Hardcoded magic number 11 for operation count **File:** `robot/helper_a2a_facade.py:95-96` `len(ops) == 11` breaks silently if `_SUPPORTED_OPERATIONS` changes. Should derive from source of truth. ### TF-6. Mock `delete` not verified with correct argument **File:** `features/steps/a2a_facade_wiring_steps.py:77-81` `session.close` test doesn't assert `svc.delete.assert_called_once_with("MOCK-SESSION-001")`. A bug that passes the wrong `session_id` would be undetected. ### TC-7 through TC-13. Additional coverage gaps (low priority) - Plan stub-mode operations (`plan.execute/status/diff/apply` without service) untested - Registry filter parameters (`namespace`, `type_name`) untested - EventQueue callback error handling (failing callback not preventing delivery) untested - Weak stub-fallback assertions (missing plan_id, tools list checks) - `close()` double-call idempotency untested - Publish-after-close with active subscribers untested - `map_domain_error` with generic `RuntimeError` untested ### TI-3. Triplicated mock infrastructure **Files:** `a2a_facade_coverage_boost_steps.py:34-54`, `a2a_facade_wiring_steps.py:42-90`, `robot/helper_a2a_facade_wiring.py:36-95` Three files define near-identical `_MockPlanIdentity`, `_MockPlan`, and builder functions. Divergence risk on service contract changes. ### P-2. `disconnect()` raises in teardown contexts **File:** `src/cleveragents/a2a/transport.py:41-47` `A2aHttpTransport.disconnect()` raises `A2aNotAvailableError`. If called in a `finally` block, this masks the original exception. Consider returning silently like `is_connected()` returns `False`. --- ## Positive Findings - **Rename completeness:** All Python, Feature, Robot, JSON, and config files use correct A2A naming. Zero stale ACP references in code. - **CLI integration:** `server.py`, `lsp.py`, and `lsp/server.py` all have correct A2A imports and no stale references. - **No import cycles:** The deferred `CleanupService` import in `facade.py` is safe. - **Handler map caching:** The PERF-1 fix with invalidation on `register_service` is correctly implemented. - **Error hierarchy:** Exception class hierarchy mirrors the project-wide `CleverAgentsError` pattern correctly. --- *Reviewed by automated code review agent. 3 global review cycles performed across all categories (bugs, security, spec compliance, test flaws, test coverage, test isolation, performance). Cross-referenced against `docs/specification.md` (sections 43087-43248 for A2A architecture) and issue #689 acceptance criteria.*

hamza.khyari was assigned by freemo

2026-03-12 20:22:22 +00:00

freemo added this to the v3.6.0 milestone

2026-03-12 20:22:22 +00:00

freemo added the

Type

Task

label

2026-03-12 20:22:22 +00:00

freemo requested changes

2026-03-12 20:37:02 +00:00

freemo left a comment

Review — PR #737: refactor(a2a): update stale ACP comments in vulture whitelist

Diff Summary

Files changed: 1 (vulture_whitelist.py)
Lines: +2 / −2

Commit Message Compliance

Issue #689 metadata: refactor(a2a): update test files for ACP to A2A rename
PR commit first line: refactor(a2a): update stale ACP comments in vulture whitelist
Match: ❌ Mismatch. The issue metadata prescribes refactor(a2a): update test files for ACP to A2A rename but the PR commit uses a different message. Per CONTRIBUTING.md, the commit message first line must match the issue metadata's Commit Message field exactly.

I understand the PR only covers the final 2 stale comments (the bulk of the rename was done in PR #705), so the commit message is more descriptively accurate for this PR's scope. However, the project convention requires it to match the issue metadata. Options:

Amend the commit to use the prescribed message, or
Update issue #689's metadata to reflect a revised commit message for this remaining cleanup PR.

Closes Keyword

PR body uses ISSUES CLOSED: #689 ❌ — This is not a standard Forgejo auto-close keyword. Forgejo recognizes: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved (followed by #N). Please change to Closes #689.

Label Compliance

Required Label	Status
Type label	✅ `Type/Task`
Priority label	❌ Missing (issue has `Priority/Medium`)
MoSCoW label	❌ Missing (issue has `MoSCoW/Must have`)

Code Quality

The 2-line change itself is correct: replacing stale "ACP" with "A2A" in comment text at lines 633 and 957 of vulture_whitelist.py.
The audit table in the PR body documenting all 6 acceptance criteria is a nice touch.

Required Changes

Fix commit message to match issue #689 metadata, or coordinate with the issue author to update the metadata.
Replace ISSUES CLOSED: #689 with Closes #689 in the PR body.
Add labels: Priority/Medium and MoSCoW/Must have.

## Review — PR #737: refactor(a2a): update stale ACP comments in vulture whitelist ### Diff Summary - **Files changed:** 1 (`vulture_whitelist.py`) - **Lines:** +2 / −2 ### Commit Message Compliance - **Issue #689 metadata:** `refactor(a2a): update test files for ACP to A2A rename` - **PR commit first line:** `refactor(a2a): update stale ACP comments in vulture whitelist` - **Match:** ❌ **Mismatch.** The issue metadata prescribes `refactor(a2a): update test files for ACP to A2A rename` but the PR commit uses a different message. Per CONTRIBUTING.md, the commit message first line must match the issue metadata's Commit Message field exactly. I understand the PR only covers the final 2 stale comments (the bulk of the rename was done in PR #705), so the commit message is more descriptively accurate for *this* PR's scope. However, the project convention requires it to match the issue metadata. Options: 1. Amend the commit to use the prescribed message, or 2. Update issue #689's metadata to reflect a revised commit message for this remaining cleanup PR. ### Closes Keyword - PR body uses `ISSUES CLOSED: #689` ❌ — This is **not** a standard Forgejo auto-close keyword. Forgejo recognizes: `close`, `closes`, `closed`, `fix`, `fixes`, `fixed`, `resolve`, `resolves`, `resolved` (followed by `#N`). Please change to `Closes #689`. ### Label Compliance | Required Label | Status | |---|---| | Type label | ✅ `Type/Task` | | Priority label | ❌ Missing (issue has `Priority/Medium`) | | MoSCoW label | ❌ Missing (issue has `MoSCoW/Must have`) | ### Code Quality - The 2-line change itself is correct: replacing stale "ACP" with "A2A" in comment text at lines 633 and 957 of `vulture_whitelist.py`. - The audit table in the PR body documenting all 6 acceptance criteria is a nice touch. ### Required Changes 1. **Fix commit message** to match issue #689 metadata, or coordinate with the issue author to update the metadata. 2. **Replace `ISSUES CLOSED: #689`** with `Closes #689` in the PR body. 3. **Add labels:** `Priority/Medium` and `MoSCoW/Must have`.

hamza.khyari force-pushed refactor/m7-acp-to-a2a-tests from 54e52b574c

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 16s

Details

CI / build (pull_request) Successful in 18s

Details

CI / quality (pull_request) Successful in 21s

Details

CI / security (pull_request) Successful in 40s

Details

CI / typecheck (pull_request) Successful in 1m13s

Details

CI / unit_tests (pull_request) Successful in 3m6s

Details

CI / docker (pull_request) Successful in 42s

Details

CI / integration_tests (pull_request) Successful in 3m57s

Details

CI / coverage (pull_request) Successful in 5m21s

Details

CI / benchmark-regression (pull_request) Successful in 38m42s

Details

to 0038b737c1

CI / lint (pull_request) Successful in 14s

Details

CI / typecheck (pull_request) Successful in 36s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 14s

Details

CI / coverage (pull_request) Successful in 5m23s

Details

CI / security (pull_request) Successful in 33s

Details

CI / quality (pull_request) Successful in 15s

Details

CI / benchmark-regression (pull_request) Has been cancelled

Details

CI / integration_tests (pull_request) Has been cancelled

Details

CI / unit_tests (pull_request) Has been cancelled

Details

CI / docker (pull_request) Has been cancelled

Details

2026-03-13 14:52:23 +00:00

Compare

hamza.khyari changed title from ~~refactor(a2a): update stale ACP comments in vulture whitelist~~ to refactor(a2a): update test files for ACP to A2A rename

2026-03-13 14:52:47 +00:00

hamza.khyari added the

labels

2026-03-13 14:53:25 +00:00

hamza.khyari commented

2026-03-13 19:28:40 +00:00

Author

Member

Follow-up Issues from Code Review

The review by @CoreRasurae identified several pre-existing bugs and spec compliance gaps in the A2A module (src/cleveragents/a2a/). These are outside the scope of this PR (#689 — ACP→A2A test rename) and should be tracked as separate issues.

All findings below have been validated against the source code and docs/specification.md.

HIGH — Spec Compliance

1. Error codes are strings, not JSON-RPC 2.0 integers

errors.py:32-39 defines codes as "NOT_FOUND", "PLAN_ERROR", etc.
models.py:53 types A2aErrorDetail.code as str
Spec (specification.md:43211-43229) requires integer codes: -32001 (entity not found) through -32008 (plan error), plus reserved -32600 to -32603
Fix: change constants to int, change A2aErrorDetail.code: int, update map_domain_error return type to tuple[int, str]

2. map_domain_error missing spec-required exception mappings

Spec maps DuplicateEntityError → -32005 and BudgetExceededError → -32006
Neither exception class exists in src/cleveragents/core/exceptions.py
Neither mapping exists in errors.py:100-139
Fix: add both exception classes to core/exceptions.py, add branches in map_domain_error

3. 3 of 8 exception branches in map_domain_error untested

AuthenticationError → AUTH_ERROR, AuthorizationError → FORBIDDEN, ConfigurationError → CONFIGURATION_ERROR have zero test coverage
Only 4 branches tested (in robot/helper_a2a_facade_wiring.py:216-232)

HIGH — Bugs

4. publish() re-entrancy crash

events.py:56: for sub_id, callback in self._subscriptions.items() iterates dict directly
If any callback calls subscribe_local(), unsubscribe(), or close(), raises RuntimeError: dictionary changed size during iteration
Fix: list(self._subscriptions.items())

5. subscribe_local() succeeds on a closed queue

events.py:65-72: no _is_closed guard — silently adds subscriptions that will never fire
publish() has the guard (line 46), subscribe_local() doesn't
Fix: add if self._is_closed: raise RuntimeError(...) guard

MEDIUM — Spec Compliance

6. Operation names don't match spec convention

facade.py:57-69 uses dot-delimited names: session.create, plan.execute, etc.
Spec (specification.md:23422-23433, 43107-43114) uses slash-delimited with _cleveragents/ prefix: _cleveragents/plan/use, _cleveragents/plan/execute, _cleveragents/registry/{entity}/list, etc.
This is a larger change with cascade across all test files

MEDIUM — Performance

7. Unbounded _events list growth

events.py:31,50: _events grows without limit; get_events() never removes; no max size/TTL/eviction
Only close() clears — in a long-running process, memory grows indefinitely
Fix: add max_size parameter with oldest-event eviction on overflow

LOW — Test Quality

8. No-op step assertion

a2a_facade_steps.py:57-59: step_service_registered body is pass — asserts nothing
Should verify the service is actually registered in the facade

## Follow-up Issues from Code Review The review by @CoreRasurae identified several pre-existing bugs and spec compliance gaps in the A2A module (`src/cleveragents/a2a/`). These are outside the scope of this PR (#689 — ACP→A2A test rename) and should be tracked as separate issues. All findings below have been validated against the source code and `docs/specification.md`. ### HIGH — Spec Compliance **1. Error codes are strings, not JSON-RPC 2.0 integers** - `errors.py:32-39` defines codes as `"NOT_FOUND"`, `"PLAN_ERROR"`, etc. - `models.py:53` types `A2aErrorDetail.code` as `str` - Spec (`specification.md:43211-43229`) requires integer codes: `-32001` (entity not found) through `-32008` (plan error), plus reserved `-32600` to `-32603` - Fix: change constants to `int`, change `A2aErrorDetail.code: int`, update `map_domain_error` return type to `tuple[int, str]` **2. `map_domain_error` missing spec-required exception mappings** - Spec maps `DuplicateEntityError` → `-32005` and `BudgetExceededError` → `-32006` - Neither exception class exists in `src/cleveragents/core/exceptions.py` - Neither mapping exists in `errors.py:100-139` - Fix: add both exception classes to `core/exceptions.py`, add branches in `map_domain_error` **3. 3 of 8 exception branches in `map_domain_error` untested** - `AuthenticationError` → `AUTH_ERROR`, `AuthorizationError` → `FORBIDDEN`, `ConfigurationError` → `CONFIGURATION_ERROR` have zero test coverage - Only 4 branches tested (in `robot/helper_a2a_facade_wiring.py:216-232`) ### HIGH — Bugs **4. `publish()` re-entrancy crash** - `events.py:56`: `for sub_id, callback in self._subscriptions.items()` iterates dict directly - If any callback calls `subscribe_local()`, `unsubscribe()`, or `close()`, raises `RuntimeError: dictionary changed size during iteration` - Fix: `list(self._subscriptions.items())` **5. `subscribe_local()` succeeds on a closed queue** - `events.py:65-72`: no `_is_closed` guard — silently adds subscriptions that will never fire - `publish()` has the guard (line 46), `subscribe_local()` doesn't - Fix: add `if self._is_closed: raise RuntimeError(...)` guard ### MEDIUM — Spec Compliance **6. Operation names don't match spec convention** - `facade.py:57-69` uses dot-delimited names: `session.create`, `plan.execute`, etc. - Spec (`specification.md:23422-23433, 43107-43114`) uses slash-delimited with `_cleveragents/` prefix: `_cleveragents/plan/use`, `_cleveragents/plan/execute`, `_cleveragents/registry/{entity}/list`, etc. - This is a larger change with cascade across all test files ### MEDIUM — Performance **7. Unbounded `_events` list growth** - `events.py:31,50`: `_events` grows without limit; `get_events()` never removes; no max size/TTL/eviction - Only `close()` clears — in a long-running process, memory grows indefinitely - Fix: add `max_size` parameter with oldest-event eviction on overflow ### LOW — Test Quality **8. No-op step assertion** - `a2a_facade_steps.py:57-59`: `step_service_registered` body is `pass` — asserts nothing - Should verify the service is actually registered in the facade

hamza.khyari force-pushed refactor/m7-acp-to-a2a-tests from 0038b737c1

CI / lint (pull_request) Successful in 14s

Details

CI / typecheck (pull_request) Successful in 36s

Details

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / build (pull_request) Successful in 14s

Details

CI / coverage (pull_request) Successful in 5m23s

Details

CI / security (pull_request) Successful in 33s

Details

CI / quality (pull_request) Successful in 15s

Details

CI / benchmark-regression (pull_request) Has been cancelled

Details

CI / integration_tests (pull_request) Has been cancelled

Details

CI / unit_tests (pull_request) Has been cancelled

Details

CI / docker (pull_request) Has been cancelled

Details

to 8b1dfe8510

CI / benchmark-publish (pull_request) Has been skipped

Details

CI / lint (pull_request) Successful in 14s

Details

CI / build (pull_request) Successful in 15s

Details

CI / quality (pull_request) Successful in 17s

Details

CI / e2e_tests (pull_request) Successful in 26s

Details

CI / typecheck (pull_request) Successful in 32s

Details

CI / security (pull_request) Successful in 32s

Details

CI / unit_tests (pull_request) Successful in 2m21s

Details

CI / integration_tests (pull_request) Successful in 3m2s

Details

CI / docker (pull_request) Successful in 47s

Details

CI / coverage (pull_request) Successful in 4m56s

Details

CI / benchmark-regression (pull_request) Successful in 35m58s

Details

2026-03-13 19:30:29 +00:00

Compare

brent.edwards approved these changes

2026-03-13 22:22:25 +00:00

Dismissed

brent.edwards left a comment

Code Review — PR #737: refactor(a2a): update test files for ACP to A2A rename

Reviewer: @brent.edwards
Commit reviewed: 8b1dfe85
Review cycles: 2 full passes + 4 parallel verification threads (ACP reference audit, vulture whitelist accuracy, issue #689 acceptance criteria, CONTRIBUTING.md compliance)

Verification of Previous Requested Changes (@freemo)

All three items from @freemo's REQUEST_CHANGES review (commit 54e52b57) have been addressed in the force-pushed commit 8b1dfe85:

Requested Change	Status
Fix commit message to match issue #689 metadata	Done — first line is now `refactor(a2a): update test files for ACP to A2A rename`
Replace `ISSUES CLOSED: #689` with `Closes #689` in PR body	Done — PR body ends with `Closes #689`
Add `Priority/Medium` and `MoSCoW/Must have` labels	Done — both labels present on the PR

Code Correctness

Diff scope: 1 file changed (vulture_whitelist.py), +2/−2 lines. Both changes are comment-only.

Change 1 (line 633): `ACP` → `A2A` in facade wiring section header

Verified correct. The error codes and facade wiring symbols referenced by this section (NOT_FOUND, VALIDATION_ERROR, INVALID_STATE, PLAN_ERROR, AUTH_ERROR, FORBIDDEN, CONFIGURATION_ERROR, INTERNAL_ERROR, map_domain_error, _session_service, _plan_lifecycle_service, _tool_registry, _resource_registry_service, _event_queue, _cleanup_session_devcontainers) all live in src/cleveragents/a2a/errors.py and src/cleveragents/a2a/facade.py. All 15 whitelist entries below this comment are valid.

Change 2 (line 957): `ACP` → `A2A` in LspServer facade property comment

Verified correct. src/cleveragents/lsp/server.py defines LspServer.facade as a lazy property that creates an A2aLocalFacade instance (imported from cleveragents.a2a.facade). The module docstring and property docstring both reference "A2A". The old "ACP" comment was stale.

Rename Completeness (Issue #689 Acceptance Criteria)

Exhaustive audit across features/, robot/, benchmarks/, and features/fixtures/ confirms all 6 statically verifiable acceptance criteria are satisfied:

#	Criterion	Result
1	All 13 test files renamed	PASS — 13 `a2a_` files found, zero `acp_` files
2	All imports updated (`cleveragents.acp` → `cleveragents.a2a`)	PASS — 54 `a2a` import lines found, zero `acp` imports
3	Gherkin scenario names updated	PASS — all feature files reference A2A
4	Robot test names updated	PASS — all robot files reference A2A
5	JSON fixture content updated	PASS — `a2a_facade_flows.json` uses A2A naming
6	Zero ACP references in test files	PASS — case-insensitive sweep returned zero matches

The bulk of the rename was completed in PR #705 (already on master). This PR correctly fixes the only 2 remaining stale comments in vulture_whitelist.py. After this PR merges, zero acp/ACP references will remain in any code or test file (only legitimate historical references in ADR docs and specification.md).

PR Metadata Compliance

Check	Status
Commit first line matches issue #689 metadata	PASS
PR body has Forgejo closing keyword (`Closes #689`)	PASS
`Type/Task` label present	PASS
Milestone `v3.6.0` matches issue	PASS
Branch name matches issue metadata	PASS
Single commit, no merge commits	PASS
Branch rebased on current master HEAD (`232ef965`)	PASS
CONTRIBUTORS.md (hamza.khyari already listed)	PASS
Conventional Changelog format	PASS

Findings

P3:nit — No changelog entry

CONTRIBUTING.md PR requirement #6 requires a changelog update for every PR. This is a 2-line comment-only change with zero user-facing impact, so a changelog entry is arguably unnecessary. Noting for completeness — author's discretion.

P3:nit — Verify Forgejo dependency link direction

CONTRIBUTING.md requires the PR to be configured as "blocking" issue #689 (and issue #689 to "depend on" this PR). Please confirm this dependency link is set with the correct direction.

Verdict

APPROVED. The code change is trivially correct, the ACP→A2A rename is verified complete across all test infrastructure, and all of @freemo's requested changes have been addressed. Only P3 nits remain, which are at the author's discretion per the review playbook.

Note: @freemo's REQUEST_CHANGES review from the previous commit is still active and will need to be re-approved or dismissed before merge.

Reviewed by @brent.edwards. 2 review cycles + 4 parallel verification threads (ACP reference audit, vulture whitelist symbol verification, issue #689 acceptance criteria, CONTRIBUTING.md compliance).

# Code Review — PR #737: refactor(a2a): update test files for ACP to A2A rename **Reviewer:** @brent.edwards **Commit reviewed:** `8b1dfe85` **Review cycles:** 2 full passes + 4 parallel verification threads (ACP reference audit, vulture whitelist accuracy, issue #689 acceptance criteria, CONTRIBUTING.md compliance) --- ## Verification of Previous Requested Changes (@freemo) All three items from @freemo's REQUEST_CHANGES review (commit `54e52b57`) have been addressed in the force-pushed commit `8b1dfe85`: | Requested Change | Status | |---|---| | Fix commit message to match issue #689 metadata | **Done** — first line is now `refactor(a2a): update test files for ACP to A2A rename` | | Replace `ISSUES CLOSED: #689` with `Closes #689` in PR body | **Done** — PR body ends with `Closes #689` | | Add `Priority/Medium` and `MoSCoW/Must have` labels | **Done** — both labels present on the PR | --- ## Code Correctness **Diff scope:** 1 file changed (`vulture_whitelist.py`), +2/−2 lines. Both changes are comment-only. ### Change 1 (line 633): `ACP` → `A2A` in facade wiring section header Verified correct. The error codes and facade wiring symbols referenced by this section (`NOT_FOUND`, `VALIDATION_ERROR`, `INVALID_STATE`, `PLAN_ERROR`, `AUTH_ERROR`, `FORBIDDEN`, `CONFIGURATION_ERROR`, `INTERNAL_ERROR`, `map_domain_error`, `_session_service`, `_plan_lifecycle_service`, `_tool_registry`, `_resource_registry_service`, `_event_queue`, `_cleanup_session_devcontainers`) all live in `src/cleveragents/a2a/errors.py` and `src/cleveragents/a2a/facade.py`. All 15 whitelist entries below this comment are valid. ### Change 2 (line 957): `ACP` → `A2A` in LspServer facade property comment Verified correct. `src/cleveragents/lsp/server.py` defines `LspServer.facade` as a lazy property that creates an `A2aLocalFacade` instance (imported from `cleveragents.a2a.facade`). The module docstring and property docstring both reference "A2A". The old "ACP" comment was stale. --- ## Rename Completeness (Issue #689 Acceptance Criteria) Exhaustive audit across `features/`, `robot/`, `benchmarks/`, and `features/fixtures/` confirms all 6 statically verifiable acceptance criteria are satisfied: | # | Criterion | Result | |---|---|---| | 1 | All 13 test files renamed | **PASS** — 13 `a2a_*` files found, zero `acp_*` files | | 2 | All imports updated (`cleveragents.acp` → `cleveragents.a2a`) | **PASS** — 54 `a2a` import lines found, zero `acp` imports | | 3 | Gherkin scenario names updated | **PASS** — all feature files reference A2A | | 4 | Robot test names updated | **PASS** — all robot files reference A2A | | 5 | JSON fixture content updated | **PASS** — `a2a_facade_flows.json` uses A2A naming | | 6 | Zero ACP references in test files | **PASS** — case-insensitive sweep returned zero matches | The bulk of the rename was completed in PR #705 (already on master). This PR correctly fixes the only 2 remaining stale comments in `vulture_whitelist.py`. After this PR merges, zero `acp`/`ACP` references will remain in any code or test file (only legitimate historical references in ADR docs and `specification.md`). --- ## PR Metadata Compliance | Check | Status | |---|---| | Commit first line matches issue #689 metadata | **PASS** | | PR body has Forgejo closing keyword (`Closes #689`) | **PASS** | | `Type/Task` label present | **PASS** | | Milestone `v3.6.0` matches issue | **PASS** | | Branch name matches issue metadata | **PASS** | | Single commit, no merge commits | **PASS** | | Branch rebased on current master HEAD (`232ef965`) | **PASS** | | CONTRIBUTORS.md (hamza.khyari already listed) | **PASS** | | Conventional Changelog format | **PASS** | --- ## Findings ### P3:nit — No changelog entry CONTRIBUTING.md PR requirement #6 requires a changelog update for every PR. This is a 2-line comment-only change with zero user-facing impact, so a changelog entry is arguably unnecessary. Noting for completeness — author's discretion. ### P3:nit — Verify Forgejo dependency link direction CONTRIBUTING.md requires the PR to be configured as "blocking" issue #689 (and issue #689 to "depend on" this PR). Please confirm this dependency link is set with the correct direction. --- ## Verdict **APPROVED.** The code change is trivially correct, the ACP→A2A rename is verified complete across all test infrastructure, and all of @freemo's requested changes have been addressed. Only P3 nits remain, which are at the author's discretion per the review playbook. Note: @freemo's REQUEST_CHANGES review from the previous commit is still active and will need to be re-approved or dismissed before merge. *Reviewed by @brent.edwards. 2 review cycles + 4 parallel verification threads (ACP reference audit, vulture whitelist symbol verification, issue #689 acceptance criteria, CONTRIBUTING.md compliance).*

brent.edwards approved these changes

2026-03-13 22:22:29 +00:00

brent.edwards left a comment

Approve