[sentinel #11237] fix(lsp): wrap post-Popen init in cleanup guard to prevent orphaned processes #1

Open
HAL9000 wants to merge 2 commits from tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup into master
Owner

This is an auto-generated sentinel duplicate of upstream PR cleveragents/cleveragents-core#11237 for pipeline testing. It targets the fork's master and is isolated from the canonical pipeline.


Summary

Fixes a resource leak in StdioTransport.start() (#7044) where subprocess cleanup is not performed if an exception occurs after successful Popen. The unprotected logger.info("lsp.transport.started", ...) call could leave orphaned LSP server processes accumulating in the background.

Changes

  • src/cleveragents/lsp/transport.py: Wrapped post-Popen initialization (logger.info log) in a try/except cleanup guard that mirrors the existing stop() method pattern:

    • On any exception after successful subprocess spawn, terminate → wait(timeout) → kill fallback
    • Reset self._process = None to ensure clean state
    • Re-raise the original exception so callers still get proper error semantics
    • Updated docstring to document the cleanup contract and the constraint that all post-Popen init logic must be placed within the guarded block
  • features/lsp_transport_coverage.feature: Added TDD scenarios for issue #7044:

    • Verifies subprocess is terminated when post-spawn log fails (@tdd_issue_7044 @tdd_expected_fail)
    • Verifies process cleanup and state reset to None
    • Explicit is_alive() coverage (alive + not-alive states)
  • features/steps/_ltcov_helpers.py: Shared mock helper functions extracted from the main step file to keep individual step files under the 500-line limit per CONTRIBUTING.md guidelines.

  • features/steps/lsp_transport_coverage_steps.py: Refactored to import helpers from the shared module; reduced from 514→476 lines.

  • features/steps/lsp_transport_post_spawn_cleanup_steps.py: Dedicated step definitions for TDD and is_alive test scenarios.

Problem & Solution

Problem: The StdioTransport.start() method called logger.info() after a successful subprocess.Popen, but this call was unprotected. If logging failed (or any future post-spawn init code raised), the subprocess handle would never be cleaned up, leaving zombie processes.

Solution: Defensive cleanup by wrapping all post-spawn initialization in a try/except block. If any step raises an exception after Popen succeeds, the subprocess is terminated and waited on (with kill fallback for unresponsive processes) before re-raising the original error.

Testing

  • TDD scenario verifies subprocess termination when post-Popen init fails
  • Verifies self._process is reset to None after cleanup
  • Explicit is_alive() scenarios confirm behavior in both states
  • Normal initialization path is completely unaffected (no change to happy path)

Acceptance Criteria

  • Subprocess terminated on any exception after Popen spawn
  • Process reference cleared (self._process = None)
  • Original exception re-raised (not suppressed)
  • Normal initialization path unaffected
  • No resource leaks on failed initialization
  • Step files under 500-line limit

Closes #7044


Automated by CleverAgents Bot
Agent: task-implementor


Original upstream PR: cleveragents/cleveragents-core#11237

_This is an auto-generated sentinel duplicate of upstream PR `cleveragents/cleveragents-core#11237` for pipeline testing. It targets the fork's `master` and is **isolated** from the canonical pipeline._ --- ## Summary Fixes a resource leak in `StdioTransport.start()` (#7044) where subprocess cleanup is not performed if an exception occurs after successful `Popen`. The unprotected `logger.info("lsp.transport.started", ...)` call could leave orphaned LSP server processes accumulating in the background. ## Changes - **src/cleveragents/lsp/transport.py**: Wrapped post-Popen initialization (`logger.info` log) in a try/except cleanup guard that mirrors the existing `stop()` method pattern: - On any exception after successful subprocess spawn, terminate → wait(timeout) → kill fallback - Reset `self._process = None` to ensure clean state - Re-raise the original exception so callers still get proper error semantics - Updated docstring to document the cleanup contract and the constraint that all post-Popen init logic must be placed within the guarded block - **features/lsp_transport_coverage.feature**: Added TDD scenarios for issue #7044: - Verifies subprocess is terminated when post-spawn log fails (`@tdd_issue_7044 @tdd_expected_fail`) - Verifies process cleanup and state reset to None - Explicit `is_alive()` coverage (alive + not-alive states) - **features/steps/_ltcov_helpers.py**: Shared mock helper functions extracted from the main step file to keep individual step files under the 500-line limit per CONTRIBUTING.md guidelines. - **features/steps/lsp_transport_coverage_steps.py**: Refactored to import helpers from the shared module; reduced from 514→476 lines. - **features/steps/lsp_transport_post_spawn_cleanup_steps.py**: Dedicated step definitions for TDD and is_alive test scenarios. ## Problem & Solution **Problem:** The `StdioTransport.start()` method called `logger.info()` after a successful `subprocess.Popen`, but this call was unprotected. If logging failed (or any future post-spawn init code raised), the subprocess handle would never be cleaned up, leaving zombie processes. **Solution:** Defensive cleanup by wrapping all post-spawn initialization in a try/except block. If any step raises an exception after Popen succeeds, the subprocess is terminated and waited on (with kill fallback for unresponsive processes) before re-raising the original error. ## Testing - TDD scenario verifies subprocess termination when post-Popen init fails - Verifies `self._process` is reset to None after cleanup - Explicit `is_alive()` scenarios confirm behavior in both states - Normal initialization path is completely unaffected (no change to happy path) ## Acceptance Criteria - [x] Subprocess terminated on any exception after Popen spawn - [x] Process reference cleared (`self._process = None`) - [x] Original exception re-raised (not suppressed) - [x] Normal initialization path unaffected - [x] No resource leaks on failed initialization - [x] Step files under 500-line limit Closes #7044 --- **Automated by CleverAgents Bot** Agent: task-implementor --- _Original upstream PR: cleveragents/cleveragents-core#11237_
HAL9000 self-assigned this 2026-05-21 10:32:46 +00:00
Closes #7044

The StdioTransport.start() method had an unprotected logger.info() call
after successful Popen(). If that call raised, the subprocess would
leak as an orphaned process. Wrap all post-spawn initialization in a
try/except guard: on any exception after spawn, terminate and wait for
the process (with kill fallback), reset state to None, then re-raise
so callers still get proper error semantics.

The existing stop() method cleanup pattern (terminate → wait → kill) is
mirrored here for consistency across the transport lifecycle.

Tests added:
  - TDD scenario with @tdd_issue_7044 verifying subprocess cleanup on
    post-Popen exception and state reset to None
  - Explicit is_alive() scenarios covering both alive and not-alive states

Refactoring:
  - Extracted _make_mock_process and _build_lsp_frame helpers into a
    shared _ltcov_helpers module to keep step files under the 500-line
    CONTRIBUTING.md limit.

ISSUES CLOSED: #7044
This pull request can be merged automatically.
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup:tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup

Merge

Merge the changes and update on Forgejo.

Warning: The "Autodetect manual merge" setting is not enabled for this repository, you will have to mark this pull request as manually merged afterwards.

git switch master
git merge --no-ff tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git rebase master
git switch master
git merge --ff-only tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git rebase master
git switch master
git merge --no-ff tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch master
git merge --squash tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch master
git merge --ff-only tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git switch master
git merge tests/sentinel-11237-bugfix-m3.6.0-lsp-7044-subprocess-cleanup
git push origin master
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
HAL9000/cleveragents-core!1
No description provided.