UAT: Multiple yaml.dump calls missing allow_unicode=True causing non-ASCII data to be escaped as \uXXXX sequences #4084

Open
opened 2026-04-06 10:10:11 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/yaml-allow-unicode
  • Commit Message: fix(i18n): add allow_unicode=True to all yaml.dump calls to preserve non-ASCII characters
  • Milestone: (none — backlog)
  • Parent Epic: #399 (Post-MVP Server & Clients — nearest applicable CLI epic)

Bug Report

What was tested: Whether the CLI correctly handles and outputs non-ASCII content (e.g., project names, actor descriptions, persona names, or any user-supplied text containing Unicode characters like accented letters, CJK characters, or emoji).

Expected behavior:

When a user has non-ASCII content in their data (e.g., a project named "Café API" or an actor description in Japanese), YAML output should preserve those characters as-is:

name: Café API
description: カフェAPIプロジェクト

Actual behavior (from code analysis):

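Several yaml.safe_dump() and yaml.dump() calls are missing allow_unicode=True, so PyYAML falls back to its default and writes non-ASCII characters as escape sequences inside double-quoted scalars (\xXX for characters up to U+00FF, \uXXXX for the rest of the Basic Multilingual Plane):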

name: "Caf\xE9 API"
description: "\u30AB\u30D5\u30A7API\u30D7\u30ED\u30B8\u30A7\u30AF\u30C8"

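This makes YAML output unreadable and hard to hand-edit for non-ASCII content. PyYAML decodes these escapes again on load, so the data itself should survive a re-import, but the exported file no longer matches the text the user entered.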

Affected files and locations:

  1. src/cleveragents/cli/commands/_resolve_actor.py:106

    yaml_text = yaml.safe_dump(config_blob, default_flow_style=False)
    # Missing: allow_unicode=True
    
  2. src/cleveragents/cli/commands/repl.py:594

    yaml.safe_dump(payload, sort_keys=False, default_flow_style=False)
    # Missing: allow_unicode=True
    # Used for persona export — persona names/descriptions may contain non-ASCII
    
  3. src/cleveragents/cli/commands/actor_context.py:294

    yaml.dump(export_data, fh, default_flow_style=False, sort_keys=False)
    # Missing: allow_unicode=True
    # Used for actor context export
    
  4. src/cleveragents/actor/yaml_template_engine.py:226

    return yaml.dump(value, default_flow_style=False).strip()
    # Missing: allow_unicode=True
    # Used in template engine — templates may contain non-ASCII content
    
  5. src/cleveragents/actor/schema.py:901

    yaml.safe_dump(self.model_dump(mode="json", exclude_none=True), f, ...)
    # Missing: allow_unicode=True
    # Used for actor schema serialization to YAML files
    
  6. src/cleveragents/tui/persona/registry.py:70

    yaml.safe_dump(payload, handle, sort_keys=False, default_flow_style=False)
    # Missing: allow_unicode=True
    # Used for persona registry persistence
    

Contrast with correct usage: The output rendering framework correctly uses allow_unicode=True in src/cleveragents/cli/output/materializers.py:666,674 and src/cleveragents/cli/formatting.py:79. The affected files are inconsistent with this established pattern.
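One way to keep the call sites consistent with that pattern is to route them through a small wrapper. The following is a minimal sketch, not code from the repository: the name dump_yaml_unicode is hypothetical, and it assumes the affected call sites can all use yaml.safe_dump.

```python
from typing import IO, Any, Optional

import yaml


def dump_yaml_unicode(
    data: Any, stream: Optional[IO[str]] = None, **kwargs: Any
) -> Optional[str]:
    """Hypothetical helper: yaml.safe_dump with allow_unicode=True by default.

    Callers can still override either default by passing the keyword explicitly.
    """
    kwargs.setdefault("allow_unicode", True)
    kwargs.setdefault("default_flow_style", False)
    # Returns the YAML text when stream is None, otherwise writes to stream.
    return yaml.safe_dump(data, stream, **kwargs)


text = dump_yaml_unicode({"name": "Café API"})
print(text)
```

When a file handle is passed, the characters are written unescaped, so the file likely needs to be opened with an explicit encoding such as open(path, "w", encoding="utf-8"). Otherwise the platform default encoding applies and may not be able to represent them.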

Steps to reproduce:

import yaml
# Without allow_unicode (broken):
yaml.safe_dump({"name": "Café"})
# Output: "name: \"Caf\\xE9\"\n"  ← escaped

# With allow_unicode (correct):
yaml.safe_dump({"name": "Café"}, allow_unicode=True)
# Output: "name: Café\n"  ← preserved

i18n impact: Any user with non-ASCII content in actor configs, persona names, project names, or context data will see escape sequences instead of readable text in exported YAML. This affects users in non-English locales and any use of Unicode in identifiers.

Subtasks

  • Add allow_unicode=True to yaml.safe_dump() in _resolve_actor.py:106
  • Add allow_unicode=True to yaml.safe_dump() in repl.py:594
  • Add allow_unicode=True to yaml.dump() in actor_context.py:294
  • Add allow_unicode=True to yaml.dump() in yaml_template_engine.py:226
  • Add allow_unicode=True to yaml.safe_dump() in actor/schema.py:901
  • Add allow_unicode=True to yaml.safe_dump() in tui/persona/registry.py:70
  • Add BDD test scenarios verifying non-ASCII content round-trips correctly through YAML export/import
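The round-trip check in the last subtask could be sketched roughly as follows. This is plain Python rather than a BDD scenario, and the helper name export_and_reimport and the file name persona.yaml are illustrative assumptions, not part of the codebase. It checks both that the data reloads unchanged and that the exported text contains the original characters rather than escapes.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple

import yaml


def export_and_reimport(payload: Dict[str, Any], path: Path) -> Tuple[str, Any]:
    """Write payload as YAML with allow_unicode=True, then read it back."""
    # Explicit UTF-8 so the unescaped characters can be written on any platform.
    with path.open("w", encoding="utf-8") as handle:
        yaml.safe_dump(
            payload,
            handle,
            allow_unicode=True,
            sort_keys=False,
            default_flow_style=False,
        )
    raw = path.read_text(encoding="utf-8")
    return raw, yaml.safe_load(raw)


payload = {"name": "Café API", "description": "カフェAPIプロジェクト"}
with tempfile.TemporaryDirectory() as tmp:
    raw_text, reloaded = export_and_reimport(payload, Path(tmp) / "persona.yaml")

print(raw_text)
```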

Definition of Done

  • All yaml.dump/yaml.safe_dump calls in the codebase include allow_unicode=True
  • Non-ASCII content in actor configs, persona names, and project data is preserved as-is in YAML output
  • Existing tests pass
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
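The first item above could be checked mechanically. The sketch below is one possible approach using only the standard library; the function name find_missing_allow_unicode is hypothetical. It only recognizes calls spelled literally as yaml.<name>(...), so aliased imports or flags passed through **kwargs would need manual review.

```python
import ast
from typing import List, Tuple

DUMP_NAMES = {"dump", "safe_dump", "dump_all", "safe_dump_all"}


def find_missing_allow_unicode(
    source: str, filename: str = "<string>"
) -> List[Tuple[int, str]]:
    """Return (line, function name) for yaml dump calls lacking allow_unicode=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal `yaml.<dump function>(...)` spelling.
        if not (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
            and func.attr in DUMP_NAMES
        ):
            continue
        has_flag = any(
            kw.arg == "allow_unicode"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_flag:
            hits.append((node.lineno, func.attr))
    return sorted(hits)


sample = '''
import yaml
yaml.safe_dump(payload, sort_keys=False, default_flow_style=False)
yaml.dump(export_data, fh, allow_unicode=True)
'''
print(find_missing_allow_unicode(sample))
```

Running the function over each file under src/ and failing when the list is non-empty would turn the check into a regression guard.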

Backlog note: This issue was discovered during autonomous operation on milestones M3–M6 (the active milestones). It does not block milestone completion and has been placed in the backlog for human review and future milestone assignment.


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

HAL9000 added this to the v3.5.0 milestone 2026-04-09 03:11:16 +00:00
Blocks
#399 Epic: Post-MVP Server & Clients
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#4084