shared/redaction: mask_database_url() regex fails when password contains @ symbol — password leak #10582

Open
opened 2026-04-18 17:51:57 +00:00 by HAL9000 · 0 comments
Owner

Metadata

Commit Message: Fix mask_database_url() to handle passwords containing @ symbol

Branch: main


Background and Context

The mask_database_url() function in src/cleveragents/shared/redaction.py (lines 195-210) uses a regex pattern that assumes the password doesn't contain the @ symbol. This is a critical security vulnerability because:

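  1. Passwords can contain the @ symbol. RFC 3986 expects it to be percent-encoded as %40 inside the userinfo part of a URL, but unencoded connection strings are common in practice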
  2. When a database URL has a password containing @, the regex matches incorrectly
  3. This causes the function to fail to properly mask the password, potentially leaking sensitive credentials in logs and error messages

Current Implementation Issue:
The regex pattern at line 203, r"(://[^:]+:)([^@]+)(@)", captures the password with [^@]+, so it stops at the first @ and treats that @ as the separator between the credentials and the host. If the password itself contains @, only the portion before the first @ is masked and the rest is left exposed.

Example of the vulnerability:

url = "postgresql://user:pass@word@localhost:5432/mydb"
# Current behavior: postgresql://user:***@word@localhost:5432/mydb
# Expected behavior: postgresql://user:***@localhost:5432/mydb
# The password "pass@word" is not fully masked — "word" is exposed!
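
The leak in the example above can be reproduced in isolation with a short sketch. Only the pattern is quoted in this issue; the replacement string r"\1***\3" and the name mask_with_regex are assumptions made for illustration, so the real code at line 203 may differ in detail.

```python
import re

# Pattern quoted in this issue. The replacement string below is an assumption.
_PATTERN = re.compile(r"(://[^:]+:)([^@]+)(@)")


def mask_with_regex(url: str) -> str:
    """Hypothetical stand-in for the current regex-based masking."""
    # Group 2 is [^@]+, so it ends at the FIRST "@" in the URL.
    return _PATTERN.sub(r"\1***\3", url)


if __name__ == "__main__":
    # The tail of the password ("word") survives in the output.
    print(mask_with_regex("postgresql://user:pass@word@localhost:5432/mydb"))
```

A password without @ is masked correctly by the same sketch, which is probably why the problem was not caught by the existing tests.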

Expected Behavior

The mask_database_url() function should:

  1. Properly identify and mask the entire password, regardless of whether it contains @ symbols
  2. Preserve the URL structure (scheme, username, hostname, port, database)
  3. Return a safely redacted URL suitable for logging and error messages
  4. Handle edge cases: missing password, missing username, special characters in password
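
One possible shape for the fix, as a minimal sketch rather than the final implementation: it parses the URL with urllib.parse.urlsplit, which splits the network location on the last @, and rebuilds the URL with the password replaced. The "***" placeholder is taken from the example in this issue; the MASK constant and the choice to return URLs without a password unchanged are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

MASK = "***"  # placeholder used in this issue's example (constant name is assumed)


def mask_database_url(url: str) -> str:
    """Return url with the password, if any, replaced by MASK."""
    parts = urlsplit(url)
    if parts.password is None:
        # No password component (or no userinfo at all): nothing to mask.
        return url
    # Split on the LAST "@" so any "@" inside the password stays in userinfo.
    userinfo, _, hostinfo = parts.netloc.rpartition("@")
    # Everything before the first ":" in userinfo is the username.
    username = userinfo.partition(":")[0]
    netloc = f"{username}:{MASK}@{hostinfo}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(mask_database_url("postgresql://user:pass@word@localhost:5432/mydb"))
```

A caveat worth covering in tests: urlsplit ends the network location at the first /, ? or #, so a password containing one of those characters unencoded would not be recognized as a password by this sketch. Such passwords would need to be percent-encoded in the URL, or handled separately.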

Acceptance Criteria

  • Passwords containing @ symbols are fully masked
  • URL structure is preserved (scheme, host, port, database path)
  • No password characters leak into the output
  • Function handles edge cases: no password, no username, special characters
  • All existing tests continue to pass
  • New unit tests cover passwords with @ symbols
  • Code review approved
  • Merged to main branch

Subtasks

  • Replace regex-based approach with urllib.parse for robust URL parsing
  • Implement the fixed mask_database_url() function using proper URL parsing
  • Add unit tests for passwords containing @ symbol
  • Add unit tests for other special characters in passwords (%, &, #, etc.)
  • Add integration tests with real database URLs
  • Update docstring with examples of edge cases
  • Run full test suite to ensure no regressions
  • Code review and approval
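
For the unit-test subtasks, a small helper along these lines might be useful. It is a sketch only: find_masking_failures and SAMPLE_PASSWORDS are hypothetical names, the sample passwords are illustrative, and the masking function is passed in as an argument so the helper does not depend on any particular implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical sample passwords; "%40" is a percent-encoded "@".
SAMPLE_PASSWORDS = ("secret", "pass@word", "a@b@c", "p%40ss", "x&y=z")


def find_masking_failures(
    mask: Callable[[str], str],
    passwords: Sequence[str] = SAMPLE_PASSWORDS,
) -> List[Tuple[str, str]]:
    """Return (password, actual_output) pairs where masking went wrong."""
    expected = "postgresql://user:***@localhost:5432/mydb"
    failures = []
    for password in passwords:
        url = f"postgresql://user:{password}@localhost:5432/mydb"
        actual = mask(url)
        # Exact comparison also catches partial leaks such as "***@word@...",
        # which a plain "password not in output" check would miss.
        if actual != expected:
            failures.append((password, actual))
    return failures
```

Comparing against the exact expected URL checks two acceptance criteria at once: no password characters leak, and the scheme, host, port and database path are preserved.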

Definition of Done

The issue is complete when:

  1. The mask_database_url() function uses urllib.parse instead of regex
  2. All unit tests pass, including new tests for @ in passwords
  3. Code coverage remains >= 97%
  4. The fix has been reviewed and approved
  5. Changes are merged to the main branch
  6. No password leakage occurs in any test case

Automated by CleverAgents Bot
Agent: new-issue-creator

Reference
cleveragents/cleveragents-core#10582