feat(lsp): add LSP server stub #608
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!608
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/m6-lsp-stub"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Adds a minimal LSP server stub supporting the initialize/shutdown/exit handshake over JSON-RPC stdin/stdout transport with Content-Length framing. Unsupported methods return MethodNotFound error with descriptive message. LSP requests are wired through the ACP facade in local mode. Adds
agents lsp serveCLI command with--log-levelflag, PID output, and startup banner. Includes reference documentation, Behave BDD tests (48 scenarios), Robot smoke test, and ASV startup latency benchmark.Closes #203
Changes
LSP Server (
src/cleveragents/lsp/server.py)LspServerclass with JSON-RPC stdin/stdout transportinitialize/shutdown/exithandshake per LSP spec_discard_bodyfor oversized messages with EOF handlingCLI Commands (
src/cleveragents/cli/commands/lsp.py)agents lsp serve [--log-level]— launches the LSP stub serveragents lsp add --config <path>— registers an LSP server configurationDocumentation (
docs/reference/lsp_stub.md)Testing
Spec References
feat(lsp): add LSP server stubISSUES CLOSED: #203
Review: feat(lsp): add LSP server stub
Well-structured PR with thorough documentation, comprehensive BDD coverage (48 scenarios), Robot smoke tests, and ASV benchmarks. The LSP server stub is defensively coded with proper transport hardening (Content-Length caps, header flood guards, stream desync recovery). The PR description is detailed and follows conventions.
Five rounds of self-review have clearly paid off — the code quality is high. The findings below are the remaining items I'd like addressed.
Findings
P1:must-fix —
UnboundLocalErrorrisk inserve()commandFile:
src/cleveragents/cli/commands/lsp.py—serve()functionIf
LspServer()constructor orserver.run()raises an unexpected exception (e.g.,OSErroron stdin/stdout access),exit_codeis never assigned. Thefinallyblock restores the structlog config correctly, but thenraise typer.Exit(code=exit_code)triggers anUnboundLocalError— masking the original exception with a confusing traceback.Per CONTRIBUTING.md §Error and Exception Handling: "Do not suppress errors. Let exceptions propagate to top-level execution."
Fix: Initialize
exit_code = 1before thetryblock:P1:must-fix — Dead
exit_codecomputation in_handle_exit()creates divergence riskFile:
src/cleveragents/lsp/server.py—_handle_exit()method_handle_exit()computesexit_code = 0 if self._shutdown_requested else 1and logs it, but then only setsself._running = False— the computed value is discarded. Therun()method independently recomputes the same expression. The two are currently identical, so behavior is correct today, but if one is ever changed without updating the other the exit code will silently diverge.This is effectively dead code that creates a maintenance trap.
Fix: Either (a) remove the dead
exit_codevariable from_handle_exitand just log the expected code inline, or (b) have_handle_exitstore the exit code onselfand haverun()read it.P2:should-fix —
exists=Falseon--configoption is misleadingFile:
src/cleveragents/cli/commands/lsp.py—add()commandIn Typer,
exists=Falsemeans "do not validate that the file exists" (it's also the default forPathoptions). The function then immediately performs the check manually withif not config.exists(). A reader unfamiliar with Typer's API will misreadexists=Falseas "file must not exist." SinceFalseis already the default, this parameter adds confusion without value — remove it, or switch toexists=Trueand let Typer handle the validation (eliminating the manual check).P2:should-fix — PR is not mergeable (merge conflict with master)
The PR currently shows
mergeable: false. The branch needs to be rebased or merged with currentmasterbefore this can proceed.P2:should-fix — PR missing milestone and Type label (CONTRIBUTING.md §21.11, §21.12)
Per CONTRIBUTING.md:
Type/label." This PR should haveType/Feature.Additionally, issue #203 is still in
State/In Progress— per §After Submission it should be moved toState/In Review.P3:nit — Banner assertion relies on Typer's
mix_stderrdefaultFile:
features/steps/lsp_server_stub_steps.py—step_lsp_serve_output_contains()The serve command writes its startup banner to
_serve_console = Console(stderr=True), i.e., stderr. The test step assertstext in result.outputwhich is Typer's stdout capture. This works becauseCliRunner()defaults tomix_stderr=True, merging stdout and stderr. If that default ever changes (or if someone passesmix_stderr=False), the assertions will silently fail. Consider adding a comment noting this dependency onmix_stderr=True.Checklist (CLI Review per review_playbook.md)
UnboundLocalErrorcaseChecklist (Security — Server Stubs per review_playbook.md)
Summary
The two P1 findings should be addressed before merge. The P2 items can be addressed as part of this PR or in a quick follow-up. Overall this is solid work — the transport hardening and test coverage are particularly well done.
P2:should-fix —
exists=Falseis the default for TyperPathoptions and reads misleadingly as "file must not exist" to someone unfamiliar with the API. Since you immediately checkconfig.exists()manually below, either:exists=False(it's the default), orexists=Trueand remove the manualif not config.exists()checkP1:must-fix —
UnboundLocalErrorifLspServer()orserver.run()raises.exit_codeis first assigned inside thetryblock. If an exception propagates before that assignment (e.g.,OSErroron stream access), thefinallyblock completes normally butraise typer.Exit(code=exit_code)hitsUnboundLocalError— masking the real error.Fix: initialize
exit_code = 1beforetry.@ -0,0 +265,4 @@return _SKIPexcept RecursionError:logger.warning("lsp.transport.json_recursion_error",P1:must-fix — Dead
exit_codevariable creates maintenance divergence risk.exit_codeis computed here and logged, but never stored or returned. Therun()method independently recomputes the same expression at the end of the loop. Today they're identical, but if one is changed without the other, the exit code will silently diverge.Either remove the dead variable (log the expression inline), or store it on
self._exit_codeand read it inrun().Review Fixes Applied
All findings from Brent's review have been addressed. The commit on
feature/m6-lsp-stubhas been amended.Findings Fixed
UnboundLocalErrorrisk inserve()—exit_codeuninitialized beforetryexit_code = 1before thetryblock inlsp.py:373so the variable is always defined whenraise typer.Exit(code=exit_code)is reached.exit_codevariable in_handle_exit()creates divergence risk_handle_exit(). The exit code expression is now logged inline in_handle_exit(), andrun()is the single source of truth — it computesexit_code = 0 if self._shutdown_requested else 1once after the event loop ends.exists=Falseon--configoption is misleadingexists=Falseparameter (it's the Typer default). The manualconfig.exists()check is retained for its custom error message.State/In ProgressType/Featurelabel added. Issue #203 moved fromState/In ProgresstoState/In Review.mix_stderrdefaultstep_lsp_serve_output_contains()documenting theCliRunner(mix_stderr=True)dependency.Verification Results
nox -s lintnox -s typechecknox -s unit_tests -- features/lsp_server_stub.featureserver.pylsp.py(CLI)2931edd6e1cdf0f88f39Review Round 2: feat(lsp): add LSP server stub
Verification of Round 1 Fixes
All six findings from my initial review have been addressed:
UnboundLocalErrorrisk inserve()exit_code = 1initialized beforetryblock (lsp.py:373)exit_codein_handle_exit()run()is single source of truth (server.py:526)exists=Falsemisleading on--configmergeable: trueType/Featurelabel, issue #203 inState/In Reviewmix_stderrdependencylsp_server_stub_steps.py:650-654)Quality Gates
nox -s lintAll checks passed!nox -s typecheck0 errors, 0 warnings, 0 informationsnox -s unit_tests -- features/lsp_server_stub.featureNew Findings
P1:must-fix — PR body is empty (CONTRIBUTING.md §1)
The PR body is currently empty (
""). Per CONTRIBUTING.md:At minimum the PR body needs:
Closes #203closing keywordThe commit message itself is well-written and contains everything needed — this just needs to be reflected in the PR description.
P2:should-fix — Step docstring references code by line number
File:
features/steps/lsp_server_stub_steps.py—step_oversized_short_body_eof()docstringPer CONTRIBUTING.md §Documentation Standards: "Never reference code by line number (
file_path:line_number) as line numbers shift with every edit and become misleading quickly." Replace with a logical reference, e.g., "the early-exitbreakinside_discard_body's read loop".P3:nit — File sizes slightly exceed 500-line guideline
server.py: 537 lines (CONTRIBUTING.md §Code Style: "Keep files under 500 lines")lsp_server_stub_steps.py: 695 linesThe server file is marginal and cohesive — not worth splitting. The step file is larger but splitting it would conflict with the BDD organization guideline to "Name feature-specific step files after their feature." Noting for awareness; no action needed.
Checklists
CLI Review (per review_playbook.md)
Security — Server Stubs (per review_playbook.md)
Merge Checklist (CONTRIBUTING.md)
Closes #203Type/Feature)ISSUES CLOSED: #203)State/In ReviewSummary
The code is solid — all Round 1 P1/P2 findings are confirmed fixed, transport hardening and test coverage remain excellent (48 scenarios, all passing). The only blocking item is populating the PR description, which is a quick copy from the commit message body.
@ -0,0 +504,4 @@"""Build a message whose declared CL exceeds the limit but whose actualbody is much shorter than the discard cap. This forces``_discard_body`` to hit EOF mid-read (the ``break`` on line 184 of``server.py``), covering the early-exit path.P2:should-fix — This docstring references
server.pyby line number ("thebreakon line 184"). Per CONTRIBUTING.md §Documentation Standards: "Never reference code by line number as line numbers shift with every edit." Use a logical reference instead, e.g., "the early-exitbreakinsideLspServer._discard_body's read loop".cdf0f88f39358672435835867243582fc25df586PR #608 — feat(lsp): add LSP server stub (Issue #203)
Reviewed:
2fc25df5(single squashed commit), 14 files, +2570/−6 lines.Summary
Clean, well-structured LSP server stub. The JSON-RPC transport is correctly implemented with Content-Length framing, the lifecycle state machine (initialize → shutdown → exit) follows LSP 3.17, and the security hardening (MAX_CONTENT_LENGTH, MAX_HEADER_LINES,
_drain_headers,_discard_body, RecursionError guard) is thorough. Test coverage is excellent: 48 BDD scenarios, 5 Robot smoke tests, 5 ASV benchmarks. All edge cases I probed manually (negative Content-Length, zero-length body, extra whitespace in headers, pre-init requests, post-shutdown requests, double init, exit without shutdown, batch arrays) behaved correctly.The findings below are minor-to-medium severity. None are blocking bugs in the current stub, but several become relevant as the server evolves toward production.
BUG-1 (Low) —
structlog.is_configuredflag leaked afterserve()returnsFile:
src/cleveragents/cli/commands/lsp.py:372-397The save/restore pattern
previous_config = dict(structlog.get_config())/structlog.configure(**previous_config)correctly restores all 5 config values (processors, wrapper_class, context_class, logger_factory, cache_logger_on_first_use). However,structlog.configure()unconditionally sets_CONFIG.is_configured = True(structlog_config.py:244), andget_config()does not capture this flag.If structlog was not previously configured (using defaults), any subsequent call to
structlog.configure_once()will silently refuse to apply its configuration and emit aRuntimeWarninginstead of succeeding. There are currently zeroconfigure_once()calls in this codebase, so the impact is latent only.Suggested fix: Either (a) accept the limitation and add a code comment documenting it, or (b) call
structlog.reset_defaults()followed bystructlog.configure(**previous_config)in the finally block to fully restore state.BUG-2 (Low) —
_initializedset oninitializerequest, notinitializednotificationFile:
src/cleveragents/lsp/server.py:452-457The code sets
self._initialized = Trueupon receiving theinitializerequest rather than the subsequentinitializednotification. Per LSP 3.17 §3.1, the server should not fully activate until theinitializednotification arrives. The code comment at line 452-456 acknowledges this deviation explicitly.This is acceptable for a stub but should be tracked as a TODO for the production server (M7+). The existing comment is sufficient — noting here for completeness.
SPEC-1 (Low) —
exists=Falseremoved fromaddcommand's--configoptionFile:
src/cleveragents/cli/commands/lsp.py:116(diff: removes lineexists=False)The diff removes the explicit
exists=Falsefrom thetyper.Option()for--configin theaddcommand. Sinceexists=Falseis the Typer default, this is a no-op change and does not alter behavior. However, removing it obscures the original author's intent — the explicitexists=Falsewas a signal that theaddcommand deliberately does not rely on Typer's path-existence validation (it does its own check at line 140). Consider restoring the explicit default for clarity, or at minimum, keep the manualconfig.exists()check at line 140 (which is already present).SEC-1 (Low) —
_read_messagetrusts Content-Length for exactread()without timeoutFile:
src/cleveragents/lsp/server.py:249self._input.read(content_length)blocks until exactlycontent_lengthbytes are available or EOF. A malicious or buggy client that sends a valid Content-Length header but then stalls partway through the body will block the server indefinitely. TheMAX_CONTENT_LENGTHcap (10 MB) limits memory exposure but not the time dimension.For a stub this is acceptable (the server is single-threaded by design). When evolving to production, consider wrapping the read in
select()/asynciowith a per-message timeout, or documenting that the caller (IDE) is trusted.SEC-2 (Low) —
readline()in header parsing has no per-line size limitFile:
src/cleveragents/lsp/server.py:206self._input.readline()reads until\nwith no maximum line length. A client sending a multi-gigabyte header line without a newline will cause unbounded memory allocation. TheMAX_HEADER_LINESguard (line 210) only fires after a complete line has been read and counted.For a stub behind a trusted IDE this is fine. For hardened production use, pass a max-length argument to
readline()(e.g.,readline(8192)).PERF-1 (Info) — Eager import of
LspServerat CLI module levelFile:
src/cleveragents/cli/commands/lsp.py:65from cleveragents.lsp.server import LspServeris a top-level import. Theservermodule transitively importsAcpLocalFacade, which pulls in the full ACP dependency chain (~200ms cold). This cost is paid by everyagents lspsubcommand (add,remove,list,show), none of which needLspServer.A lazy import inside
serve()would avoid this cost for non-serve commands. Low priority since the ACP chain is likely already warm in typical CLI usage.TEST-1 (Low) — Robot PID smoke test is a sham assertion
File:
robot/lsp_stub_smoke.robot(PID test)The Robot test that verifies the process runs with a valid PID only checks that the process started and exited — it doesn't validate PID plausibility (e.g., > 0). This is a very minor gap; the BDD suite covers PID logging more thoroughly.
TEST-2 (Info) — No test for
readline()memory exhaustion vectorThere is no BDD scenario that sends a single header line exceeding a reasonable length (e.g., 1 MB of
Content-Length:followed by garbage). This would exercise thereadline()unbounded-read path noted in SEC-2. Low priority for a stub — worth adding when hardening for production.8 findings: 2 Low bugs, 1 Low spec, 2 Low security, 1 Info perf, 2 Low/Info test.
2fc25df58674345bb5af74345bb5af23803f14ecNew commits pushed, approval review dismissed automatically according to repository settings