Release v0.7.2 produces a corrupted CLAUDE.md file #1
Imported from GitHub issue ggrace519/claude-code-dev-studio#9.
Source: https://github.com/ggrace519/claude-code-dev-studio/issues/9
Original author: @ggrace519
Original state: closed
The latest release seems to add hundreds of KB of corrupted character encoding.
Imported from GitHub issue comment ggrace519/claude-code-dev-studio#9:4362208526.
Source: https://github.com/ggrace519/claude-code-dev-studio/issues/9#issuecomment-4362208526
Original author: @ggrace519
@claude please review this issue
Imported from GitHub issue comment ggrace519/claude-code-dev-studio#9:4362209083.
Source: https://github.com/ggrace519/claude-code-dev-studio/issues/9#issuecomment-4362209083
Original author: @claude[bot]
Claude finished @ggrace519's task in 7m 12s.
Branch: claude/issue-9-20260501-2358
Issue Review: Corrupted CLAUDE.md in Release v0.7.2
Tasks:
Get-Content encoding issues
Summary
Root Cause
Install-Playbook.ps1: Get-Content without explicit encoding (4 locations)
Set-ClaudePlaybookBlock, Remove-ClaudePlaybookBlock, and Install-Completions all read files using PowerShell's Get-Content -Raw without specifying -Encoding.
In PowerShell 5.1 (Windows PowerShell, still shipping as the default on Windows), Get-Content without -Encoding reads BOM-less files using the system ANSI codepage (typically Windows-1252). UTF-8 encoded multi-byte sequences, such as accented characters, smart quotes (""), em-dashes (—), arrows (→), or any other non-ASCII character, are decoded as the wrong characters. These corrupted characters are then written back by WriteAllText(..., UTF8Encoding::new($false)) as garbled UTF-8 byte sequences.
Why this produces "hundreds of KB of corrupted character encoding":
The CLAUDE.md file written by the installer contains content injected from jit-claude.md. On ANSI codepage read, each three-byte Unicode arrow or smart quote in the JIT block (→, —, ", etc.) is decoded as a three-character Windows-1252 sequence (e.g., ’, â†'). When written back as UTF-8, these expanded sequences permanently inflate the file. Each reinstall compounds the corruption: even with the marker-stripping idempotency guard, the user content outside the markers accumulates the ANSI misread bytes on every run.
Secondary Finding
build-release.ps1: null-byte check excludes .md and .json
The preflight null-byte check only covers .ps1 and .sh files.
Agent files (.md) and catalog.json were not checked. If a binary-corrupted agent or config file reached the stage directory, it would ship in the release ZIP undetected.
Changes Made
Install-Playbook.ps1— all fourGet-Content -Rawcalls replaced with[System.IO.File]::ReadAllText($path, [System.Text.Encoding]::UTF8):Set-ClaudePlaybookBlocklines 371–372 (readsjit-claude.mdandCLAUDE.md)Remove-ClaudePlaybookBlockline 495 (readsCLAUDE.mdfor block removal)Install-Completionsline 444 /Remove-CompletionBlockline 469 (reads PS profile)[System.Text.Encoding]::UTF8is BOM-aware — it correctly handles both UTF-8 with BOM (written by editors like Notepad/VS Code on Windows) and UTF-8 without BOM (written by the installer). This exactly matches the write path (UTF8Encoding::new($false)) already in use throughout the file.build-release.ps1— null-byte preflight extended to include.mdand.json:Testing Recommendations
To verify the fix on Windows PowerShell 5.1:
1. Start with a ~/.claude/CLAUDE.md containing non-ASCII content (e.g., My instructions — use "smart quotes" → always).
2. Run the installer and check that CLAUDE.md retains the original non-ASCII content unchanged.
The bash installer (ccds-user-setup.sh) is not affected: it uses awk + cat and relies on the shell's default locale (UTF-8 on macOS/Linux).
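
The misread described under Root Cause can be reproduced outside PowerShell. The following is a minimal Python sketch, not the installer's code. The function names are hypothetical, and it assumes that a Windows-1252 reader maps the five bytes that codepage leaves undefined to the matching C1 control characters, which Python's cp1252 codec does not do by default. It shows how one UTF-8 file grows on every read-as-ANSI, write-as-UTF-8 cycle, and that reading as UTF-8 leaves the text unchanged.

```python
def decode_as_windows_1252(data: bytes) -> str:
    """Decode bytes the way an ANSI (Windows-1252) reader would.

    Assumption: bytes undefined in Windows-1252 (0x81, 0x8D, 0x8F, 0x90, 0x9D)
    become the matching C1 control characters, so they fall back to Latin-1.
    """
    chars = []
    for byte in data:
        raw = bytes([byte])
        try:
            chars.append(raw.decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(raw.decode("latin-1"))
    return "".join(chars)


def misread_round_trip(text: str) -> str:
    """One run of the old behaviour: UTF-8 on disk, read back as ANSI."""
    return decode_as_windows_1252(text.encode("utf-8"))


def correct_round_trip(text: str) -> str:
    """The fixed behaviour: UTF-8 on disk, read back as UTF-8 (BOM tolerated)."""
    return text.encode("utf-8").decode("utf-8-sig")


if __name__ == "__main__":
    sample = "use \u2014 and \u2192 always"  # contains an em dash and a right arrow
    content = sample
    for run in range(4):
        # The byte count grows on every simulated reinstall.
        print(f"run {run}: {len(content.encode('utf-8'))} bytes")
        content = misread_round_trip(content)
    print("UTF-8 read preserves text:", correct_round_trip(sample) == sample)
```

ASCII-only content passes through the misread unchanged, which is consistent with the report that only files containing non-ASCII characters are affected.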
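
The extended null-byte preflight can be pictured with a similar sketch. This is a hypothetical Python illustration of the logic only. The actual check lives in build-release.ps1 and is not reproduced here, and the function name and the `stage_dir` parameter are assumptions made for the example.

```python
from pathlib import Path

# Extensions covered after the fix, as described under Changes Made.
CHECKED_EXTENSIONS = {".ps1", ".sh", ".md", ".json"}


def find_null_byte_files(stage_dir: Path, extensions=CHECKED_EXTENSIONS) -> list[Path]:
    """Return staged files with a checked extension that contain a NUL byte."""
    offenders = []
    for path in sorted(stage_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            if b"\x00" in path.read_bytes():
                offenders.append(path)
    return offenders
```

Passing only `{".ps1", ".sh"}` as `extensions` mimics the earlier behaviour, under which a corrupted .md or .json file would not be reported.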