BUG-HUNT: [resource] AsyncWorker._install_signal_handlers() installs SIGINT/SIGTERM handlers from non-main thread causing ValueError on Linux #7351

Open
opened 2026-04-10 17:59:03 +00:00 by HAL9000 · 1 comment
Owner

Bug Report: [resource] AsyncWorker._install_signal_handlers() silently fails to install signal handlers when called from non-main thread

Severity Assessment

  • Impact: The AsyncWorker silently swallows the ValueError from signal.signal() when called from a non-main thread, leaving no signal handlers installed. The worker will not gracefully shut down on SIGTERM in a production deployment (e.g., when deployed as a service)
  • Likelihood: High — AsyncWorker.start() is typically called from application startup code which may itself be in a non-main thread (e.g., when started by a web framework or in a test runner)
  • Priority: High

Location

  • File: src/cleveragents/application/services/async_worker.py
  • Function/Class: AsyncWorker._install_signal_handlers()
  • Lines: ~330-340

Description

AsyncWorker._install_signal_handlers() silently swallows ValueError and OSError exceptions when signal handler installation fails. In Python, signal.signal() raises ValueError when called from a non-main thread. This is documented behavior.

The current code:

def _install_signal_handlers(self) -> None:
    """Install signal handlers for graceful shutdown."""
    try:
        self._original_sigint = signal.getsignal(signal.SIGINT)
        self._original_sigterm = signal.getsignal(signal.SIGTERM)
        signal.signal(signal.SIGINT, self._signal_handler)
        signal.signal(signal.SIGTERM, self._signal_handler)
    except (OSError, ValueError):
        # Cannot set signal handlers from non-main thread
        pass

Problems:

  1. Silent failure: When signal handlers fail to install, there is no log message, no warning, no notification to the operator. The worker silently starts without any graceful shutdown capability
  2. SIGTERM not handled: In Docker/Kubernetes deployments, SIGTERM is the standard way to request graceful shutdown. If the handler isn't installed, SIGTERM keeps whatever disposition the process already has and the worker's shutdown logic never runs. The process is terminated without waiting for running jobs to complete, resulting in incomplete job execution and potential data corruption
  3. No fallback mechanism: When signal handlers can't be installed, the worker should at minimum use threading.Event to listen for shutdown requests through an alternative mechanism
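The threading.Event fallback mentioned in problem 3 could look roughly like the sketch below. StoppableWorker, run(), and the completed counter are hypothetical stand-ins, not the real AsyncWorker API; the only point is that a loop watching an Event can be stopped from any thread without signal handlers.

```python
import threading
import time


class StoppableWorker:
    """Hypothetical stand-in for a worker whose loop watches a shutdown event."""

    def __init__(self) -> None:
        self._shutdown = threading.Event()
        self.completed = 0

    def stop(self) -> None:
        """Request graceful shutdown; safe to call from any thread."""
        self._shutdown.set()

    def run(self) -> None:
        # Finish the current unit of work, then check the flag before the next.
        while not self._shutdown.is_set():
            self.completed += 1  # placeholder for one job
            self._shutdown.wait(0.01)  # idle wait that wakes early on stop()


worker = StoppableWorker()
worker_thread = threading.Thread(target=worker.run)
worker_thread.start()
time.sleep(0.05)
worker.stop()  # explicit shutdown, no signal handler involved
worker_thread.join(timeout=2)
print("stopped cleanly:", not worker_thread.is_alive())
```

Whatever owns the main thread (a service wrapper, a framework shutdown hook) would then be responsible for calling stop() when SIGTERM arrives.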

Evidence

def _install_signal_handlers(self) -> None:
    """Install signal handlers for graceful shutdown."""
    try:
        self._original_sigint = signal.getsignal(signal.SIGINT)
        self._original_sigterm = signal.getsignal(signal.SIGTERM)
        signal.signal(signal.SIGINT, self._signal_handler)
        signal.signal(signal.SIGTERM, self._signal_handler)
    except (OSError, ValueError):
        # Cannot set signal handlers from non-main thread
        pass  # BUG: Silent failure! No log, no warning, no alternative handling!

From the Python docs for signal.signal(): the function can only be called from the main thread of the main interpreter; attempting to call it from other threads raises a ValueError exception.
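The documented behavior can be confirmed with a small standalone reproducer. This is a sketch independent of AsyncWorker; try_install and results are illustrative names, and the exact exception message may differ between Python versions.

```python
import signal
import threading


def try_install(results: list) -> None:
    """Attempt to install a SIGTERM handler and record what happened."""
    try:
        previous = signal.signal(signal.SIGTERM, signal.SIG_DFL)
        signal.signal(signal.SIGTERM, previous)  # restore the old handler
        results.append("installed")
    except ValueError as exc:
        results.append(f"ValueError: {exc}")


results: list = []
thread = threading.Thread(target=try_install, args=(results,))
thread.start()
thread.join()
print(results[0])  # the attempt was made off the main thread
```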

Expected Behavior

When signal handlers cannot be installed, the worker should:

  1. Log a warning that it's running without signal handler support
  2. Document that stop() must be called explicitly for graceful shutdown
  3. Optionally, check if this is the main thread and use an alternative shutdown mechanism if not
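A minimal sketch of points 1 and 3, assuming a module-level logger. The function name install_handlers_if_main_thread and the logger name are hypothetical; the real method would keep its existing bookkeeping of the original handlers.

```python
import logging
import signal
import threading

logger = logging.getLogger("async_worker_sketch")  # assumed logger name


def install_handlers_if_main_thread(handler) -> bool:
    """Install SIGINT/SIGTERM handlers only when running on the main thread.

    Returns True if handlers were installed, False if the caller has to
    arrange shutdown another way (for example by calling stop()).
    """
    if threading.current_thread() is not threading.main_thread():
        logger.warning(
            "Signal handlers not installed: called from thread %r. "
            "Call stop() explicitly for graceful shutdown.",
            threading.current_thread().name,
        )
        return False
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
    return True
```

Checking the thread up front makes the non-main-thread case an explicit branch rather than an exception path, and the boolean return lets start() decide whether to enable an alternative shutdown mechanism.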

Actual Behavior

Signal handler installation silently fails with no indication to operators. The worker runs without any SIGTERM handling, which means:

  • docker stop (which sends SIGTERM, then SIGKILL once the grace period expires, 10 seconds by default) will end the process without waiting for jobs to complete
  • Running jobs are killed mid-execution, potentially corrupting partially-written files

Suggested Fix

def _install_signal_handlers(self) -> None:
    """Install signal handlers for graceful shutdown."""
    try:
        self._original_sigint = signal.getsignal(signal.SIGINT)
        self._original_sigterm = signal.getsignal(signal.SIGTERM)
        signal.signal(signal.SIGINT, self._signal_handler)
        signal.signal(signal.SIGTERM, self._signal_handler)
    except (OSError, ValueError) as exc:
        # signal.signal() raises ValueError off the main thread; OSError
        # covers other installation failures, so log the actual cause.
        logger.warning(
            "AsyncWorker %s: unable to install signal handlers (%s). "
            "Call stop() explicitly for graceful shutdown.",
            self._worker_id,
            exc,
        )

Category

resource

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.
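As a rough idea of what such a test might assert, here is a sketch using plain unittest rather than the project's tagged test setup, which is not shown in this issue. install_signal_handlers is a hypothetical stand-in for the fixed method, and the logger name is an assumption.

```python
import logging
import signal
import unittest
from unittest import mock

logger = logging.getLogger("async_worker_sketch")  # assumed logger name


def install_signal_handlers(handler) -> None:
    """Stand-in for the fixed method: warn instead of passing silently."""
    try:
        signal.signal(signal.SIGTERM, handler)
    except (OSError, ValueError) as exc:
        logger.warning("unable to install signal handlers: %s", exc)


class SignalHandlerWarningTest(unittest.TestCase):
    def test_warns_when_installation_fails(self) -> None:
        # Simulate the non-main-thread failure without touching real handlers.
        failure = ValueError("not main thread")
        with mock.patch("signal.signal", side_effect=failure):
            with self.assertLogs("async_worker_sketch", level="WARNING") as logs:
                install_signal_handlers(lambda signum, frame: None)
        self.assertIn("unable to install signal handlers", logs.output[0])
```

Against the current code, which passes silently, assertLogs would fail because no warning is emitted, which is the expected-fail state the TDD note describes.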


Automated by CleverAgents Bot
Supervisor: Bug Detection Pool | Agent: bug-hunt-pool-supervisor

HAL9000 added this to the v3.5.0 milestone 2026-04-10 18:43:52 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified — Silent signal handler failure is a real production reliability bug
  • Priority: Priority/Critical — silent SIGTERM handling failure causes data corruption in Docker/K8s deployments
  • Milestone: v3.5.0 — AsyncWorker graceful shutdown is required for parallel execution at scale
  • Type: Type/Bug
  • MoSCoW: Must Have — graceful shutdown is required for production deployments

The fix is minimal: add a warning log when signal handler installation fails.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7351