BUG-HUNT: [security] Minimal URI validation in ACMS vocabulary models #2590

Open
opened 2026-04-03 19:03:12 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/security-acms-uri-validation
  • Commit Message: fix(acms): strengthen URI validation in vocabulary models to prevent future SSRF risk
  • Milestone: v3.7.0
  • Parent Epic: #400

Bug Report: [security] — Minimal URI validation in ACMS vocabulary models

Severity Assessment

  • Impact: If the usage of these URIs changes in the future to involve network requests, this could become a Server-Side Request Forgery (SSRF) vulnerability.
  • Likelihood: Low, as the URIs are currently only used as identifiers.
  • Priority: Low

Location

  • File: src/cleveragents/acms/uko/vocabularies.py
  • Function/Class: _validate_http_uri
  • Lines: 56-71

Description

The _validate_http_uri function only checks if the URI starts with http:// or https://. It does not perform any other validation, such as checking for a valid hostname or preventing the use of IP addresses. The docstring acknowledges this and states that it is not a security risk because the URIs are never fetched. However, this could become a vulnerability if the usage of these URIs changes in the future.

Evidence

def _validate_http_uri(uri: str, *, field_name: str = "URI") -> str:
    """Validate that *uri* uses an HTTP or HTTPS scheme.

    This is a **scheme-only** check.  These URIs are used as ontology
    identifiers and are never fetched at runtime, so full URL parsing
    and SSRF mitigation are not required here.

    Args:
        uri: The URI string to validate.
        field_name: Human-readable field name for error messages.

    Returns:
        The validated URI string (unchanged).

    Raises:
        ValueError: If *uri* does not start with `http://` or `https://`.
    """
    if not uri.startswith(("http://", "https://")):
        raise ValueError(f"{field_name} must use http(s) scheme, got: {uri!r}")
    return uri

Expected Behavior

The URI validation should be more robust to prevent potential SSRF vulnerabilities in the future.

Suggested Fix

Use a well-vetted URI validation library to perform more thorough validation of the URIs.

Category

security

Subtasks

  • Write a failing Behave scenario that reproduces the weak URI validation (TDD)
  • Implement stricter URI validation using a well-vetted library (e.g., urllib.parse with hostname checks, or a dedicated validation library)
  • Ensure private/loopback IP addresses and bare IP hostnames are rejected
  • Update docstring to reflect the new, stricter validation behaviour
  • Verify all existing tests still pass with the new validation logic
  • Update changelog

Definition of Done

  • _validate_http_uri rejects URIs with missing or invalid hostnames
  • _validate_http_uri rejects URIs pointing to private/loopback IP addresses
  • A Behave scenario covers the new validation rules
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: ca-new-issue-creator

## Metadata - **Branch**: `fix/security-acms-uri-validation` - **Commit Message**: `fix(acms): strengthen URI validation in vocabulary models to prevent future SSRF risk` - **Milestone**: v3.7.0 - **Parent Epic**: #400 ## Bug Report: [security] — Minimal URI validation in ACMS vocabulary models ### Severity Assessment - **Impact**: If the usage of these URIs changes in the future to involve network requests, this could become a Server-Side Request Forgery (SSRF) vulnerability. - **Likelihood**: Low, as the URIs are currently only used as identifiers. - **Priority**: Low ### Location - **File**: `src/cleveragents/acms/uko/vocabularies.py` - **Function/Class**: `_validate_http_uri` - **Lines**: 56-71 ### Description The `_validate_http_uri` function only checks if the URI starts with `http://` or `https://`. It does not perform any other validation, such as checking for a valid hostname or preventing the use of IP addresses. The docstring acknowledges this and states that it is not a security risk because the URIs are never fetched. However, this could become a vulnerability if the usage of these URIs changes in the future. ### Evidence ```python def _validate_http_uri(uri: str, *, field_name: str = "URI") -> str: """Validate that *uri* uses an HTTP or HTTPS scheme. This is a **scheme-only** check. These URIs are used as ontology identifiers and are never fetched at runtime, so full URL parsing and SSRF mitigation are not required here. Args: uri: The URI string to validate. field_name: Human-readable field name for error messages. Returns: The validated URI string (unchanged). Raises: ValueError: If *uri* does not start with `http://` or `https://`. """ if not uri.startswith(("http://", "https://")): raise ValueError(f"{field_name} must use http(s) scheme, got: {uri!r}") return uri ``` ### Expected Behavior The URI validation should be more robust to prevent potential SSRF vulnerabilities in the future. ### Suggested Fix Use a well-vetted URI validation library to perform more thorough validation of the URIs. ### Category security ## Subtasks - [ ] Write a failing Behave scenario that reproduces the weak URI validation (TDD) - [ ] Implement stricter URI validation using a well-vetted library (e.g., `urllib.parse` with hostname checks, or a dedicated validation library) - [ ] Ensure private/loopback IP addresses and bare IP hostnames are rejected - [ ] Update docstring to reflect the new, stricter validation behaviour - [ ] Verify all existing tests still pass with the new validation logic - [ ] Update changelog ## Definition of Done - [ ] `_validate_http_uri` rejects URIs with missing or invalid hostnames - [ ] `_validate_http_uri` rejects URIs pointing to private/loopback IP addresses - [ ] A Behave scenario covers the new validation rules - [ ] All nox stages pass - [ ] Coverage >= 97% --- **Automated by CleverAgents Bot** Supervisor: Bug Hunting | Agent: ca-new-issue-creator
freemo added this to the v3.7.0 milestone 2026-04-03 19:03:26 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: Could Have

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Issue triaged by project owner: - **State**: Verified - **MoSCoW**: Could Have --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: ca-project-owner
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
#400 Epic: Post-MVP Security
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#2590
No description provided.