CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Automated, show-agnostic podcast/audio production pipeline using ElevenLabs TTS API. Turns a markdown production script into a podcast-ready MP3.
Package Structure
The project is packaged as xil-pipeline (import name xil_pipeline) using hatchling. All pipeline and utility scripts live under src/xil_pipeline/:
src/xil_pipeline/              # Python package (37 modules)
  __init__.py                  # version + key re-exports
  xil.py                       # Unified `xil` command dispatcher
  models.py                    # Pydantic data models, slug/path resolution
  mix_common.py                # Shared mixing utilities
  sfx_common.py                # SFX library management, ID3 tagging
  timeline_viz.py              # Timeline visualization
  xil_init.py                  # Project scaffolding (xil-init command, --type aware)
  XILP000_*.py … XILP012_*.py  # Pipeline stages
  XILU001_*.py … XILU014_*.py  # Utility scripts
tests/                         # Pytest test suite
docs/                          # MkDocs documentation
pyproject.toml                 # Packaging config (hatchling)
project.json                   # Show name config (runtime, read from CWD)
speakers.json                  # Speaker definitions (optional, overrides built-in defaults)
cast_*.json, sfx_*.json        # Episode configs (workspace data, stays at root)
Install for development: pip install -e ".[all,dev]"
All internal imports use the package namespace: from xil_pipeline.models import ...
Environment
- Python 3.12+, virtualenv at venv/
- WSL2 (Linux on Windows)
- Activate: source venv/bin/activate
- Install: pip install -e ".[all,dev]" (editable install with all optional deps)
- Core packages: elevenlabs, pydub, pydantic, mutagen, httpx
- Optional: google-genai, gTTS, pyttsx3, ollama
- ElevenLabs API key via ELEVENLABS_API_KEY env var
- Audio playback via mpg123 in WSL
Workspace Root (XIL_PROJECTROOT)
All pipeline commands resolve content paths relative to the workspace root. By default this is the current working directory (existing behaviour). Set XIL_PROJECTROOT to point at a content directory from anywhere:
export XIL_PROJECTROOT=/path/to/xil-content
xil produce --episode S04E02 # works from any directory
xil-gui # GUI shows correct workspace
Resolution order for get_workspace_root() (in models.py):
1. XIL_PROJECTROOT environment variable (absolute path, tilde-expanded)
2. Path.cwd(): current working directory (no env var set)
xil-init with no directory argument scaffolds into XIL_PROJECTROOT when set, otherwise into the current directory.
logs/, configs/, parsed/, stems/, SFX/, daw/, masters/, cues/, scripts/, posts/ all resolve under the workspace root. This enables a clean separation of the installed software from user content — install xil-pipeline once via pip, point XIL_PROJECTROOT at a content-only directory.
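The resolution order above can be sketched in a few lines. This is a minimal illustration, assuming an empty XIL_PROJECTROOT is treated the same as an unset one; it is not the actual get_workspace_root() in models.py, and resolve_content_dir() is a hypothetical helper added only for the example.

```python
import os
from pathlib import Path


def get_workspace_root() -> Path:
    """Sketch: XIL_PROJECTROOT (tilde-expanded) wins, otherwise the CWD."""
    env_value = os.environ.get("XIL_PROJECTROOT")
    if env_value:  # assumption: an empty string counts as "not set"
        return Path(env_value).expanduser()
    return Path.cwd()


def resolve_content_dir(name: str) -> Path:
    # Hypothetical helper: "stems", "parsed", "masters", ... under the root.
    return get_workspace_root() / name
```

With XIL_PROJECTROOT=/path/to/xil-content, resolve_content_dir("stems") would point at /path/to/xil-content/stems regardless of the directory the command is run from.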
Project Configuration
project.json at the workspace root declares the show name and optional season title used across the pipeline:
All scripts accept a --show CLI flag to override the show name. Resolution order: --show arg > project.json > hardcoded fallback "sample".
The season_title key in project.json is the workspace-level default for the season/arc title. When a script header contains Arc: "…", that value takes precedence; when absent, project.json season_title fills in. Resolution order: script header Arc: > project.json season_title > None. The {season_title} placeholder in PREAMBLE/POSTAMBLE script text resolves from this value.
The season key in project.json is the workspace-level default season number. When a script header contains Season N:, that value takes precedence; when absent, project.json season fills in. Resolution order: script header Season N: > project.json season > None.
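The three resolution chains above all follow the same "first defined value wins" pattern. The sketch below illustrates it under stated assumptions: the project.json key holding the show name is assumed to be "show" (the key name is not given in this file), and the function names are hypothetical rather than the pipeline's own.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_show(cli_show: Optional[str], project_file: Path) -> str:
    """--show arg > project.json > hardcoded fallback "sample"."""
    if cli_show:
        return cli_show
    if project_file.is_file():
        config = json.loads(project_file.read_text(encoding="utf-8"))
        if config.get("show"):  # assumed key name
            return config["show"]
    return "sample"


def resolve_season_title(header_arc: Optional[str], project: dict) -> Optional[str]:
    """Script header Arc: > project.json season_title > None."""
    return header_arc or project.get("season_title")


def resolve_season(header_season: Optional[int], project: dict) -> Optional[int]:
    """Script header Season N: > project.json season > None."""
    return header_season if header_season is not None else project.get("season")
```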
Content Type
project.json supports an optional "type" field (default: "podcast"):
Four types are supported: podcast, audiobook, drama, special. The type drives:
- Section map used by xil-parse (e.g. CHAPTER ONE for audiobook, PROLOGUE/ACT ONE for drama)
- Default gap_ms for xil-assemble / xil-daw (400 ms audiobook, 800 ms drama, 600 ms others)
- Sample script, speakers.json, and subdirectory layout generated by xil-init --type
- Audiobook type adds "tag_format": "V{volume:02d}C{chapter:02d}" to project.json for V01C01-style tags
project.json without a type field defaults to "podcast" with no change in behavior.
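As an illustration of how the type field could drive defaults, the sketch below maps a loaded project.json dict to the gap values listed above and expands the audiobook tag_format. The function and constant names are hypothetical; only the numbers and the format string come from this section.

```python
# Gap defaults quoted above: 400 ms audiobook, 800 ms drama, 600 ms others.
TYPE_GAP_MS = {"audiobook": 400, "drama": 800}
FALLBACK_GAP_MS = 600


def default_gap_ms(project: dict) -> int:
    """Return the default gap for the project's content type."""
    content_type = project.get("type", "podcast")  # a missing type means podcast
    return TYPE_GAP_MS.get(content_type, FALLBACK_GAP_MS)


def audiobook_tag(project: dict, volume: int, chapter: int) -> str:
    """Expand tag_format, e.g. V{volume:02d}C{chapter:02d} -> V01C01."""
    return project["tag_format"].format(volume=volume, chapter=chapter)
```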
Workspace Layout (0.1.8+)
New workspaces created with xil-init use a normalized directory layout:
configs/{slug}/speakers.json ← was speakers.json at root
configs/{slug}/cast_{tag}.json ← was cast_{slug}_{tag}.json at root
configs/{slug}/sfx_{tag}.json ← was sfx_{slug}_{tag}.json at root
parsed/{slug}/parsed_{tag}.json ← was parsed/parsed_{slug}_{tag}.json
daw/{slug}/{tag}/ ← was daw/{tag}/
masters/{slug}/{tag}_master.mp3 ← was masters/{slug}_{tag}_master.mp3
cues/{slug}/cues_{tag}.md ← was cues/cues_{slug}_{tag}.md
stems/{slug}/{tag}/ ← unchanged
Existing pre-0.1.8 workspaces continue to work automatically — derive_paths() detects the legacy layout (cast config at root) and returns legacy paths. Run xil migrate-workspace to move files to the new layout.
File paths are derived dynamically via derive_paths(slug, tag). The slug is the show name lowercased with all non-alphanumeric characters removed (e.g., "THE 413" → "the413", "Night Owls" → "nightowls").
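The slug rule and the normalized layout can be sketched as follows. This is an illustration only: the slug function assumes ASCII show names, and normalized_paths() is a hypothetical stand-in that covers the 0.1.8+ layout without the legacy detection that derive_paths() performs; the dictionary keys are assumptions.

```python
import re


def show_slug(show: str) -> str:
    """Lowercase the show name and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", show.lower())


def normalized_paths(slug: str, tag: str) -> dict:
    """Hypothetical helper: paths in the normalized (0.1.8+) layout."""
    return {
        "speakers": f"configs/{slug}/speakers.json",
        "cast": f"configs/{slug}/cast_{tag}.json",
        "sfx": f"configs/{slug}/sfx_{tag}.json",
        "parsed": f"parsed/{slug}/parsed_{tag}.json",
        "daw": f"daw/{slug}/{tag}/",
        "master": f"masters/{slug}/{tag}_master.mp3",
        "cues": f"cues/{slug}/cues_{tag}.md",
        "stems": f"stems/{slug}/{tag}/",
    }
```

For example, show_slug("THE 413") gives "the413", and normalized_paths("the413", "S01E01")["cast"] gives "configs/the413/cast_S01E01.json".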
Project Scaffolding
xil-init scaffolds a new show workspace with sample content:
xil-init my-show --show "Night Owls"
xil-init my-show --show "Night Owls" --type drama
xil-init my-show --show "Night Owls" --type audiobook
The --type flag (choices: podcast [default], audiobook, drama, special) selects:
- A type-specific sample script with appropriate sections and cast
- A type-specific speakers.json (single narrator for audiobook, full cast for drama, etc.)
- project.json "type" field + "tag_format" for audiobook (V01C01)
Creates: project.json, speakers.json, scripts/sample_{tag}.md, and empty subdirectories in the normalized layout. The sample script exercises all parser features so the user can immediately run xil-scan and xil-parse --dry-run.
Speaker Configuration
configs/{slug}/speakers.json provides optional enrichment (voice_id, pan, filter, role, etc.) for the parser. Since 0.2.0 it is not required for speaker recognition — characters declared in the script's CAST: block are always recognized.
[
{"display": "ADAM", "key": "adam"},
{"display": "MR. PATTERSON", "key": "mr_patterson"},
{"display": "FILM AUDIO (MARGARET'S VOICE)", "key": "film_audio"}
]
Recognition order (merged, all sources combined): CAST: block entries from the script itself → configs/{slug}/speakers.json (JSON key always wins over auto-derived key) → built-in defaults (only when neither CAST entries nor JSON file exist). The list is auto-sorted longest-first for compound-name matching. Both xil-scan (XILP000) and xil-parse (XILP001) accept the --speakers flag.
CAST: block format
Every script should declare its cast in the header using dialogue-label names (the ALL-CAPS prefix used in dialogue lines), with optional — role descriptions:
CAST:
* ADAM — Adam Santos, Host
* MR. PATTERSON — Recurring Caller
* FILM AUDIO (MARGARET'S VOICE) — Archive audio
Characters listed here are automatically recognized during parsing even if absent from speakers.json. For new series or one-off characters added mid-series, adding them to the CAST: block is sufficient to parse correctly; run xil scan --harvest-cast afterwards to propagate them to speakers.json for enrichment.
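The merge described in this section (CAST: entries, then speakers.json overrides, then built-in defaults only as a last resort, sorted longest-first) might look roughly like the sketch below. The auto-derived key rule is an assumption inferred from the examples above (lowercase, parentheticals dropped, other punctuation collapsed to underscores); the real parser may derive keys differently, and both function names are hypothetical.

```python
import re


def auto_key(display: str) -> str:
    """Assumed key derivation: 'MR. PATTERSON' -> 'mr_patterson'."""
    base = re.sub(r"\(.*?\)", "", display)  # drop parentheticals
    return re.sub(r"[^a-z0-9]+", "_", base.lower()).strip("_")


def merge_speakers(cast_names, json_entries, defaults):
    """Combine all sources; a speakers.json key beats the auto-derived key."""
    merged = {name: auto_key(name) for name in cast_names}
    for entry in json_entries:
        merged[entry["display"]] = entry["key"]
    if not merged:  # defaults apply only when both other sources are empty
        merged = {entry["display"]: entry["key"] for entry in defaults}
    # Longest display name first, so compound names match before their prefixes.
    return sorted(merged.items(), key=lambda item: len(item[0]), reverse=True)
```

Sorting longest-first matters for labels such as FILM AUDIO (MARGARET'S VOICE), which would otherwise risk being matched by a shorter name that happens to be a prefix.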
Pre-Flight Script Scanner
XILP000_script_scanner.py — Scans a raw markdown script and reports recognized/unrecognized speakers and sections before running XILP001. Use this whenever onboarding a new script to catch missing speakers or SECTION_MAP entries early.
xil scan "scripts/<script>.md"
xil scan "scripts/<script>.md" --json
xil scan --harvest-cast # union CAST: blocks across all scripts → speakers.json diff
xil scan --harvest-cast --yes # auto-add new entries to speakers.json
xil scan --backfill-cast --dry-run # preview CAST: block insertion for old scripts
xil scan --backfill-cast --yes # write CAST: blocks to all scripts that lack one
- No --episode flag required for single-script scan: reads only the script file, no side effects
- Exit code 0 = all recognized (safe to run XILP001); exit code 1 = action needed
- Imports XILP001's pure functions directly, so no logic is duplicated
- --json outputs machine-readable scan results
- --speakers PATH overrides the speaker list (see Speaker Configuration)
- --harvest-cast scans all scripts/*.md, collects CAST: entries, and reports characters missing from speakers.json; --yes adds them automatically; --scripts-dir DIR overrides the default scripts directory
- --backfill-cast adds CAST: blocks to scripts that don't have one, inferring speakers from existing parsed JSON (most reliable) or body scan against speakers.json; --yes writes files, default is dry-run; --scripts-dir DIR overrides the default scripts directory
Architecture: Ten-Stage Pipeline (+ Cues Ingester Pre-Processing)
Stage 1: Script Parsing
XILP001_script_parser.py — Parses markdown production scripts into structured JSON.
- Input: Markdown scripts in scripts/; supports both plain text (S01E01) and markdown-formatted (S01E02+) scripts transparently
- Two-pass normalization: strip_markdown_escapes() removes \[, \], etc.; strip_markdown_formatting() removes **, ##/### headings, trailing double-space line breaks
- Handles both single-line dialogue (SPEAKER (dir) Text) and multi-line dialogue (speaker, direction, text on separate lines) via pending-speaker state machine
- Standalone parenthetical acting notes like (beat) or (pause) within dialogue continuations are filtered from spoken text
- Square-bracket stage directions with unrecognized direction_type (acting notes like [drawn out], [quietly]) are silently skipped rather than emitted as type: direction, direction_type: None noise entries
- Dividers: accepts both === (plain text) and --- (markdown horizontal rules)
- End markers: stops at END OF EPISODE or END OF PRODUCTION SCRIPT
- Output: parsed/parsed_<slug>_S01E01.json, with entries carrying seq, type, section, scene, speaker, direction, text, direction_type
- Output path derived from script header metadata (season/episode); override with --output
- --episode S01E01 (optional) validates that the script header matches the intended episode tag
- --show overrides the show name used for slug derivation (see Project Configuration)
- When --episode is provided and cast_<slug>_S01E01.json / sfx_<slug>_S01E01.json don't exist, auto-generates skeleton configs with voice_id=TBD and default SFX prompts; the cast skeleton includes season_title populated from the script header's Arc: "…" declaration (or null when absent)
- season_title is extracted from the Arc: "…" token in the script header (e.g. THE 413 Season 1: Episode 1: "The Empty Booth" Arc: "The Holiday Shift") and stored in the parsed JSON; it is available as {season_title} in PREAMBLE/POSTAMBLE script text
- Supports --quiet (JSON only, skip summary) and --debug (write diagnostic CSV alongside JSON)
- Auto-generates BEAT variants (BEAT — 3 SECONDS etc.) as type: "silence" with duration parsed from the text (e.g. 3.0s)
- Auto-generates AMBIENCE: STOP and AMBIENCE: * FADES OUT directives as type: "silence", duration_seconds: 0.0 stop markers; no audio asset needed
- Known speakers resolved in priority order: CAST: block in the script header (auto-discovered, no setup required) → speakers.json enrichment (JSON key overrides auto-derived key) → built-in defaults (only when both sources are absent); new characters declared in CAST: are recognized immediately without editing speakers.json
- --speakers PATH overrides the speakers.json path (see Speaker Configuration)
- Sections: COLD OPEN, OPENING CREDITS, ACT ONE, ACT TWO, MID-EPISODE BREAK, CLOSING
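The BEAT and AMBIENCE stop-marker rules can be illustrated with a small classifier. This is a sketch under assumptions: the regular expressions are guesses at the accepted spellings (em dash, en dash, or hyphen before the duration), classify_silence() is a hypothetical name, and the real XILP001 logic may recognize more variants.

```python
import re
from typing import Optional

# "BEAT — 3 SECONDS": silence whose duration is read from the text.
_BEAT_RE = re.compile(r"BEAT\s*[—–-]\s*(\d+(?:\.\d+)?)\s*SECONDS?\b", re.IGNORECASE)
# "AMBIENCE: STOP" or "AMBIENCE: <anything> FADES OUT": zero-length stop marker.
_AMBIENCE_STOP_RE = re.compile(r"AMBIENCE:\s*(STOP\b|.*FADES OUT\b)", re.IGNORECASE)


def classify_silence(direction_text: str) -> Optional[dict]:
    """Return a silence entry for BEAT / ambience-stop directions, else None."""
    text = direction_text.strip()
    beat = _BEAT_RE.match(text)
    if beat:
        return {"type": "silence", "duration_seconds": float(beat.group(1))}
    if _AMBIENCE_STOP_RE.match(text):
        return {"type": "silence", "duration_seconds": 0.0}
    return None
```

A zero-duration entry needs no audio asset; it only marks where a looping ambience bed should end.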
Stage 1.5: Cues Sheet Ingestion (Pre-processing)
XILP006_cues_ingester.py — Parses a sound cues & music prompts markdown file into a structured asset manifest, audits the shared SFX library, and optionally enriches the episode sfx config or generates new assets.
xil cues --episode S02E03 --cues "cues/<file>.md"
xil cues --episode S02E03 --cues "cues/<file>.md" --enrich-sfx-config
xil cues --episode S02E03 --cues "cues/<file>.md" --generate
xil cues --episode S02E03 --cues "cues/<file>.md" --generate --enrich-sfx-config
- --episode or --tag (one required) derives the sfx config path (sfx_<slug>_S02E03.json)
- --show overrides the show name used for slug derivation (see Project Configuration)
- --cues PATH: explicit path to the cues markdown file; auto-detected from cues/ if omitted and exactly one .md exists there (canonical name: cues/cues_<slug>_S02E03.md)
- Always writes cues/cues_manifest_<TAG>.json: structured JSON catalog of all parsed assets
- Always prints an audit report: EXISTS / REUSE / NEW status per asset, credit estimate for NEW generation
- --enrich-sfx-config: updates sfx_<slug>_<TAG>.json entries that reference a cues-sheet asset ID: replaces stub prompts with the full cues-sheet prompt and corrects duration (capped at 30s API limit)
- --generate: calls ElevenLabs Sound Effects API to generate NEW assets into SFX/<asset-id>.mp3 (e.g. SFX/sfx-boots-stamp-01.mp3); skips assets already on disk; REUSE assets are never generated here
- --dry-run: suppresses API calls and sfx config writes; shows enrichment diff and generation credit estimate
- Parses three cue sheet sections: MUSIC CUES (heading blocks), AMBIENCE (heading blocks), SOUND EFFECTS (markdown tables per scene)
- Duration cap: assets longer than 30s are generated at 30s and flagged [CAPPED] in the audit
Stage 2: Voice Generation
XILP002_producer.py — Calls ElevenLabs API to generate voice stems.
- --episode or --tag (one required) derives cast_<slug>_S01E01.json and sfx_<slug>_S01E01.json
- --show overrides the show name used for slug derivation (see Project Configuration)
- Reads: parsed JSON + cast config; always loads SFX config (for INTRO/OUTRO MUSIC source lookup)
- Outputs: stems/<slug>/<TAG>/{seq:03d}_{section}[-{scene}]_{speaker}.mp3 (e.g. stems/the413/S01E01/003_cold-open_adam.mp3)
- Preamble/Postamble: PREAMBLE and POSTAMBLE are first-class script sections parsed by XILP001 and written into the parsed JSON with contiguous seq numbers. XILP002 generates their voice stems through the standard generate_voices() loop, applying a per-section speed override from the cast config preamble.speed / postamble.speed field. There is no separate injection step: preamble/postamble entries exist in the parsed JSON from the start
- INTRO/OUTRO MUSIC source stems are copied from sfx_config.effects["INTRO MUSIC"/"OUTRO MUSIC"].source at produce time
- Supports --start-from N for resuming interrupted runs; --stop-at N to halt after a specific seq (useful for previewing a section without regenerating the full episode)
- Supports --dry-run to preview lines and TTS character cost without API calls; includes a per-speaker breakdown table (lines + chars to generate vs. already on disk) sorted by chars descending; per-entry marker: [ ] = will generate, [=] = stem exists/skip, [x] = out of range
- Supports --terse to truncate each line to 3 words (minimizes TTS character cost)
- Supports --gen-sfx, --gen-music, --gen-ambience to generate only the specified categories of stems (replaces deprecated --sfx-music, which is kept as a shorthand for all three)
- Supports --local-only (used with --gen-sfx/--gen-music/--gen-ambience) to skip any effect that would require an API call: only assets already in SFX/ (CACHED) or silence entries are placed; no credits spent
- Supports --backend elevenlabs|gtts|chatterbox (default: elevenlabs): gtts routes all dialogue voice stems through Google Translate TTS at no cost (flat single voice, useful for duration checks); chatterbox uses local Chatterbox TTS with per-character zero-shot voice cloning from voice_refs/<key>.wav reference clips (near-production quality, GPU-accelerated, free after setup); SFX/music/ambience generation is unaffected by --backend; eleven_v3 inline tags are stripped before gTTS/Chatterbox calls
- --backend chatterbox options: --chatterbox-python PATH (default: auto-detect ./venv-chatterbox/bin/python3); --voice-refs DIR (default: voice_refs/) for per-speaker .wav reference clips; --exaggeration FLOAT emotion level 0.0–1.0 (default: 0.5); missing voice refs fall back to Chatterbox default voice
- Intro music (INTRO MUSIC source entry): trimmed at copy time using play_duration percentage from sfx config, so the stem file reflects the actual playback length
- Skips stems that already exist on disk
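The stem naming pattern {seq:03d}_{section}[-{scene}]_{speaker}.mp3 can be sketched as below. The label slugification (lowercase, runs of non-alphanumerics collapsed to hyphens) is inferred from the 003_cold-open_adam.mp3 example and is an assumption; the pipeline's canonical generator is make_stem_name() in XILP007, which this hypothetical stem_name() does not reproduce.

```python
import re
from typing import Optional


def slugify_label(label: str) -> str:
    """Assumed rule: 'COLD OPEN' -> 'cold-open'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def stem_name(seq: int, section: str, speaker: str, scene: Optional[str] = None) -> str:
    """Build '{seq:03d}_{section}[-{scene}]_{speaker}.mp3'."""
    middle = slugify_label(section)
    if scene:
        middle += f"-{slugify_label(scene)}"
    return f"{seq:03d}_{middle}_{speaker}.mp3"
```

Because the seq number leads the filename, a plain lexical sort of the stems directory reproduces script order.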
Stage 3: Audio Assembly
XILP003_audio_assembly.py — Two-pass multi-track mix into a final master MP3.
xil assemble --episode S01E01
xil assemble --episode S01E01 --parsed parsed/parsed_<slug>_S01E01.json
- When a parsed script JSON is available (auto-derived or via --parsed), runs a two-pass multi-track mix:
  - Foreground pass: dialogue + one-shot SFX/BEAT stems concatenated sequentially
  - Background pass: AMBIENCE stems looped across scene boundaries (ducked -10 dB); MUSIC stings overlaid at cue points (-6 dB)
  - Foreground and background combined via AudioSegment.overlay()
- Falls back to single-pass sequential concatenation when no parsed JSON is found
- Stem classification uses direction_type from the parsed JSON, keyed by seq number in the filename
- Shared mixing logic lives in mix_common.py, which is also used by XILP005
- Applies per-speaker effects (pan, audio filters) from cast config; filter field accepts false/null (none), true/"phone" (phone filter), "vintage" (vintage filter), or a comma-separated combination such as "vintage,phone"
- Applies scene-scoped vintage filter to all dialogue in scenes listed in sfx_config.vintage_scenes; applied after the per-speaker filter chain
- --show overrides the show name used for slug derivation (see Project Configuration)
- Supports --output to set the master MP3 path (default: <slug>_S01E01_master.mp3)
- --gap-ms N sets the silence gap between foreground stems in milliseconds (default: 600); reducing to 200–300 can shorten episode runtime by 1.5–2 minutes
- No ElevenLabs API key required; safe to re-run freely
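The accepted values of the cast config filter field can be normalized into an ordered list of filter names, as in the sketch below. resolve_filters() is a hypothetical name (the pipeline's own resolver is _apply_speaker_filters() in mix_common.py), and applying a combined value in its written order is an assumption.

```python
KNOWN_FILTERS = {"phone", "vintage"}


def resolve_filters(filter_val) -> list:
    """Map false/null, true, or a comma-separated string to filter names."""
    if filter_val is None or filter_val is False:
        return []
    if filter_val is True:
        return ["phone"]  # true is shorthand for the phone filter
    names = [part.strip().lower() for part in str(filter_val).split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_FILTERS]
    if unknown:
        raise ValueError(f"unknown filter(s): {', '.join(unknown)}")
    return names  # assumption: filters are applied in the order written
```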
Stage 4: Studio Project Onboarding
XILP004_studio_onboard.py — Creates an ElevenLabs Studio project from parsed episode data.
xil studio-onboard --episode S01E02 --dry-run
xil studio-onboard --episode S01E02
xil studio-onboard --episode S01E02 --quality high
- --episode or --tag (one required) derives parsed_<slug>_S01E02.json and cast_<slug>_S01E02.json
- --show overrides the show name used for slug derivation (see Project Configuration)
- Builds from_content_json payload for the Studio Projects API with per-node voice_id assignments
- Solves the speaker-name problem: voice assignments are embedded directly, so no speaker names appear in TTS text
- Content mapping: sections → chapters, dialogue → tts_node blocks, scene headers → h2 blocks, directions → skipped
- --dry-run displays chapter/block summary with voice assignments without calling the API
- --quality sets quality preset (standard/high/ultra/ultra_lossless, default: standard)
- --model sets TTS model (default: eleven_v3)
- Validates no TBD voice_ids in cast config before proceeding
- Requires ELEVENLABS_API_KEY env var for non-dry-run mode
Stage 5: DAW Layer Export
XILP005_daw_export.py — Exports four isolated, full-length WAV layers for human mixing in Audacity.
xil daw --episode S01E01 --dry-run
xil daw --episode S01E01
xil daw --episode S01E01 --macro
xil daw --episode S01E01 --output-dir exports/S01E01/
xil daw --episode S01E01 --dry-run --timeline
xil daw --episode S01E01 --timeline --timeline-html
- --episode or --tag (one required) derives cast_<slug>_S01E01.json and parsed/parsed_<slug>_S01E01.json
- --show overrides the show name used for slug derivation (see Project Configuration)
- Outputs four WAV files to daw/{TAG}/, all identical duration, all aligned at t=0:
  - {TAG}_layer_dialogue.wav: spoken dialogue (audio filter chain + pan applied per speaker)
  - {TAG}_layer_ambience.wav: environmental background looped to fill scene durations
  - {TAG}_layer_music.wav: music stings/themes at cue positions
  - {TAG}_layer_sfx.wav: one-shot SFX and BEAT silences
- Each WAV is tagged with ID3 metadata (Album, Genre, Year, Title, Artist) via tag_wav() from sfx_common.py
- Generates four Audacity label track files ({TAG}_labels_dialogue.txt, etc.), tab-separated start/end/text
- Generates {TAG}_open_in_audacity.py, which prints WAV import instructions (labels listed separately as optional)
- --macro writes an Audacity macro (THE413_{TAG}.txt) to %APPDATA%\audacity\Macros\ for one-click WAV import via Tools > Macros
- --dry-run shows stem counts and output paths without writing files
- --gap-ms N sets the silence gap between foreground stems in milliseconds (default: 600); reducing to 200–300 can shorten episode runtime by 1.5–2 minutes
- --save-aup3 includes a SaveProject2 command in the generated {TAG}_open_in_audacity.py helper script (requires mod-script-pipe in Audacity)
- --timeline prints an ASCII multitrack timeline to stdout (works with --dry-run via fast mutagen header reads)
- --timeline-html writes a self-contained interactive HTML timeline to daw/{TAG}/{TAG}_timeline.html (hover tooltips, Ctrl+scroll zoom)
- Preamble/postamble stems are picked up automatically via collect_stem_plans(); they are regular seq-numbered entries in the parsed JSON (no special negative-seq handling)
- No ElevenLabs API key required; no API calls made
- Shared mixing logic imported from mix_common.py; visualization via timeline_viz.py
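Audacity label tracks are plain text with one label per line: start time, end time, and label text, separated by tabs, with times in seconds. A minimal writer could look like the sketch below; format_labels() is a hypothetical name, and the label text shown is only an illustration of what the pipeline might emit.

```python
def format_labels(spans) -> str:
    """Render (start_s, end_s, text) tuples as an Audacity label track."""
    lines = [f"{start:.6f}\t{end:.6f}\t{text}" for start, end, text in spans]
    return "\n".join(lines) + "\n"


# Usage sketch: two dialogue stems separated by a 600 ms gap.
labels = format_labels([
    (0.0, 2.5, "003 adam"),
    (3.1, 5.0, "004 mr_patterson"),
])
```

Writing the result to {TAG}_labels_dialogue.txt lets the track be loaded alongside the dialogue WAV layer using Audacity's label import.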
Stage 6: Stem Migration (Punch-In Workflow)
XILP007_stem_migrator.py — Migrates episode stems when a parsed script is revised. Compares an old and new parsed JSON, copies unchanged stems to their new seq-numbered filenames, and reports which entries need fresh TTS/SFX generation. Run XILP002 afterwards to fill only the gaps.
xil migrate --episode S02E03 --dry-run
xil migrate --episode S02E03
xil migrate \
--old parsed/orig_parsed_<slug>_S02E03.json \
--new parsed/parsed_<slug>_S02E03.json \
--stems stems/S02E03 [--dry-run] [--strict]
- --episode TAG derives --old (parsed/orig_parsed_<slug>_{TAG}.json), --new (parsed/parsed_<slug>_{TAG}.json), and --stems (stems/{TAG}) automatically
- --show overrides the show name used for slug derivation (see Project Configuration)
- --orig-prefix (default: orig_) sets the filename prefix for the old parsed JSON
- --dry-run: shows the full plan without copying any files
- --strict: exact text match only; default is fuzzy (normalises em-dash, ellipsis, curly quotes so punctuation-only edits don't force unnecessary regen)
- --quiet: prints only the summary, not per-stem details
- Status codes printed per stem: COPY (unchanged, will be/was copied), SPEAKER (text matches but speaker reassigned → regen), NEW (no old entry matches → generate), MISSING (match found but old file absent → generate); each status line is followed by a truncated text snippet (first 55 chars) for visual content verification
- Two-phase matching: phase 1 matches on (text, speaker); phase 2 (dialogue only) falls back to text-only to detect speaker reassignments
- After running (without --dry-run), run XILP002_producer.py --episode TAG; it skips stems already on disk, so only SPEAKER/NEW/MISSING slots get API calls
- No ElevenLabs API key required; no API calls made
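The two-phase matching and the four status codes can be sketched as follows. This is a simplified illustration: the normalization table is an assumption about what "fuzzy" covers, duplicate lines simply reuse the first old match, and plan_migration() is a hypothetical name rather than the XILP007 implementation.

```python
_PUNCT = {"—": "-", "–": "-", "…": "...", "“": '"', "”": '"', "‘": "'", "’": "'"}


def normalize(text: str) -> str:
    """Fold typographic punctuation and whitespace so cosmetic edits still match."""
    for src, dst in _PUNCT.items():
        text = text.replace(src, dst)
    return " ".join(text.split())


def plan_migration(old_entries, new_entries, existing_old_seqs, strict=False):
    """Return [(new_seq, status)] with status COPY, SPEAKER, NEW, or MISSING."""
    key = (lambda t: t) if strict else normalize
    by_text_speaker, by_text = {}, {}
    for entry in old_entries:
        by_text_speaker.setdefault((key(entry["text"]), entry.get("speaker")), entry)
        by_text.setdefault(key(entry["text"]), entry)

    plan = []
    for entry in new_entries:
        text_key = key(entry["text"])
        # Phase 1: exact (text, speaker) match.
        old = by_text_speaker.get((text_key, entry.get("speaker")))
        if old is not None:
            status = "COPY" if old["seq"] in existing_old_seqs else "MISSING"
        # Phase 2 (dialogue only): same text, different speaker.
        elif entry.get("type") == "dialogue" and text_key in by_text:
            status = "SPEAKER"
        else:
            status = "NEW"
        plan.append((entry["seq"], status))
    return plan
```

Only COPY entries reuse audio; every other status leaves a gap that the next producer run fills.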
Stage 7: Stale Stem Cleanup
XILP008_stale_stem_cleanup.py — Removes stale stems left behind after a parsed script revision and stem migration. After XILP007 copies unchanged stems to new seq-numbered filenames, old stems whose seq numbers now map to a different entry type remain on disk. This script finds and deletes them.
xil cleanup --episode S02E03 --dry-run
xil cleanup --episode S02E03
xil cleanup \
--parsed parsed/parsed_<slug>_S02E03.json \
--stems stems/S02E03 [--dry-run]
- --episode TAG derives --parsed (parsed/parsed_<slug>_{TAG}.json) and --stems (stems/{TAG}) automatically
- --show overrides the show name used for slug derivation (see Project Configuration)
- --parsed and --stems override individual paths (both required if --episode is omitted)
- --dry-run: lists stale stems without deleting them
- A stem is stale when its filename disagrees with the current parsed entry: entry type is a header (section_header/scene_header), _sfx suffix but entry is now dialogue, speaker suffix but entry is now direction, dialogue stem whose speaker suffix doesn't match the parsed speaker, or seq not present in parsed JSON at all
- Duplicate detection: when multiple stems share the same seq, keeps only the one whose basename matches the expected {seq}_{section}[-{scene}]_{speaker|sfx} pattern
- Uses extract_seq() and load_entries_index() from mix_common.py
- No ElevenLabs API key required; no API calls made
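The staleness rules above reduce to comparing a stem's filename suffix with the parsed entry at the same seq. The sketch below is an approximation with hypothetical names (seq_of(), is_stale()); it assumes direction stems end in _sfx and dialogue stems end in the speaker key, and it leaves other entry types unjudged.

```python
import re
from typing import Optional


def seq_of(filename: str) -> Optional[int]:
    """Leading digits before the first underscore, e.g. '003_...' -> 3."""
    match = re.match(r"(\d+)_", filename)
    return int(match.group(1)) if match else None


def is_stale(filename: str, entries_by_seq: dict) -> bool:
    """True when the stem no longer agrees with the parsed entry at its seq."""
    entry = entries_by_seq.get(seq_of(filename))
    if entry is None:
        return True  # seq not present in the parsed JSON at all
    stem = filename.rsplit(".", 1)[0]
    entry_type = entry["type"]
    if entry_type in ("section_header", "scene_header"):
        return True  # headers never own a stem
    if entry_type == "dialogue":
        return not stem.endswith("_" + entry["speaker"])
    if entry_type == "direction":
        return not stem.endswith("_sfx")
    return False  # other entry types are not judged in this sketch
```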
Stage 8: Studio Export Import
XILP010_studio_import.py — Extracts dialogue and direction stems from an ElevenLabs Studio export ZIP and renames them to the pipeline's stem naming convention.
xil import --episode S02E02 --zip "ElevenLabs_exports/export.zip" --dry-run
xil import --episode S02E02 --zip "ElevenLabs_exports/export.zip"
xil import --episode S02E02 --zip "ElevenLabs_exports/export.zip" --gen-sfx --gen-music --gen-beats
xil import --episode S02E02 --zip "ElevenLabs_exports/export.zip" --all --force
- --episode TAG (required) derives parsed JSON path and stems output directory
- --show overrides the show name used for slug derivation (see Project Configuration)
- --zip PATH (required): path to the ElevenLabs Studio export ZIP
- --parsed PATH overrides parsed JSON path (default: parsed/parsed_<slug>_{TAG}.json)
- --stems-dir PATH overrides stems output directory (default: stems/{TAG})
- --dry-run: shows extraction plan without writing files
- --force: overwrites existing stems on disk (default: skip if exists)
- --gen-sfx: include SFX direction entries (extracted as _sfx stems)
- --gen-music: include MUSIC direction entries (extracted as _sfx stems)
- --gen-beats: include BEAT direction entries (extracted as _sfx stems)
- --all: include all direction types (SFX, MUSIC, BEAT, AMBIENCE); headers are always skipped
- Dialogue entries are always extracted; direction entries require one of the --gen-* or --all flags
- ElevenLabs Studio exports one MP3 per parsed entry (NNN_Chapter N.mp3)
- Reuses make_stem_name() from XILP007 for canonical stem filename generation
- No ElevenLabs API key required; no API calls made
Stage 9: Final Master MP3 Export
XILP011_master_export.py — Overlays the four DAW layer WAVs from XILP005 into a single podcast-ready MP3.
xil master --episode S02E03 --dry-run
xil master --episode S02E03
xil master --episode S02E03 --show "Night Owls"
- --episode or --tag (one required) derives DAW layer paths and cast config
- --show overrides the show name (default: from project.json)
- --daw-dir overrides the DAW layer directory (default: daw/<TAG>/)
- --output overrides the output MP3 path (default: masters/<TAG>_<slug>_<YYYY-MM-DD>.mp3)
- --dry-run shows layer summary without writing files
- Output format: stereo, 48 kHz, VBR MP3 (~145–185 kbps, LAME quality 2)
- Output filename: S02E03_the413_2026-03-24.mp3 (episode tag, show slug, run date)
- Overlays all four layers at unity gain (XILP005 handles mix balance)
- Reads cast config for ID3 metadata (album, title, artist)
- No ElevenLabs API key required; no API calls made
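The default output filename combines the episode tag, show slug, and run date. A small sketch of that rule, with default_master_path() as a hypothetical name and the run date assumed to be the local calendar date:

```python
from datetime import date
from pathlib import Path
from typing import Optional


def default_master_path(tag: str, slug: str, run_date: Optional[date] = None) -> Path:
    """masters/<TAG>_<slug>_<YYYY-MM-DD>.mp3, dated at run time by default."""
    run_date = run_date or date.today()
    return Path("masters") / f"{tag}_{slug}_{run_date.isoformat()}.mp3"
```

Because the date is part of the name, re-running the export on a later day produces a new file instead of overwriting the earlier master.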
Stage 10: Social Media Post Draft Generator
XILP012_publish.py — Reads a parsed episode JSON, builds a structured episode summary, and calls the Claude API (Haiku) to produce three ready-to-edit Facebook/Instagram post variants. Output is an editable markdown file.
xil publish --episode S04E01 --dry-run
xil publish --episode S04E01
xil publish --episode S04E01 --platform instagram
xil publish --all
- --episode or --tag (one required unless --all is used) derives parsed JSON and cast config
- --show overrides the show name (default: from project.json)
- --platform facebook|instagram: affects post length/style guidance (default: facebook)
- --dry-run: prints the Claude prompt and estimated token count; no API call, no file written
- --all: generate posts for every parsed episode under the current show slug (retroactive batch)
- --model: override the Claude model ID (default: claude-haiku-4-5-20251001)
- Output: posts/{slug}/{tag}_posts.md, an editable markdown file with three variants
- Three post variants: Hype (new episode teaser, no spoilers past cold open), Quote (pull quote from cold open + tune-in CTA), Spotlight (one cast member feature, cycles by episode number mod cast count)
- Requires ANTHROPIC_API_KEY environment variable (only for non-dry-run mode)
- Requires [publish] optional extra: pip install 'xil-pipeline[publish]' (installs anthropic>=0.40)
- System prompt is tagged with cache_control: ephemeral to minimize cost on --all batch runs
- No ElevenLabs API key required; no ElevenLabs calls made
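The Spotlight rotation ("episode number mod cast count") can be illustrated as below. The cast ordering and the direct use of the remainder as a zero-based index are assumptions, and spotlight_member() is a hypothetical name.

```python
import re


def spotlight_member(tag: str, cast: list) -> str:
    """Pick the featured cast member for an episode tag such as 'S04E01'."""
    match = re.search(r"E(\d+)$", tag)
    if not match or not cast:
        raise ValueError(f"cannot pick a spotlight for tag {tag!r}")
    episode_number = int(match.group(1))
    return cast[episode_number % len(cast)]  # assumed zero-based indexing
```

With a three-member cast, episodes 1, 2, and 3 feature indexes 1, 2, and 0, so consecutive episodes never repeat a member unless the cast has a single entry.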
ElevenLabs API Cost Controls
Every script that calls the API includes three guard functions (duplicated per file, not shared):
- check_elevenlabs_quota(): displays current character usage vs limit
- has_enough_characters(text): per-line quota check before each API call
- get_best_model_for_budget(): always returns eleven_v3; logs a warning when balance is low (no longer falls back to eleven_flash_v2_5, which does not support [pause] and other native audio tags)
Always use --dry-run before running voice generation on a new script to verify TTS character budget.
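A dry-run budget check amounts to summing the characters that still need generating and comparing them with the remaining quota. The sketch below works on plain numbers and parsed entries, makes no ElevenLabs SDK calls, and uses hypothetical names (fits_quota(), chars_by_speaker()) rather than the guard functions listed above.

```python
from collections import Counter


def fits_quota(text: str, used: int, limit: int) -> bool:
    """Per-line check: does this line fit in the remaining character quota?"""
    return len(text) <= limit - used


def chars_by_speaker(entries, existing_seqs) -> list:
    """Characters still to generate per speaker, sorted by chars descending."""
    totals = Counter()
    for entry in entries:
        if entry["type"] == "dialogue" and entry["seq"] not in existing_seqs:
            totals[entry["speaker"]] += len(entry["text"])
    return totals.most_common()
```

Stems already on disk (existing_seqs) cost nothing, which is why resuming an interrupted run is cheaper than starting over.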
File Naming Convention
All scripts live under src/xil_pipeline/ and are installed as xil-* console entry points plus a unified xil command via pyproject.toml (example: xil parse ... routes to xil-parse). Scripts use prefix XIL (ElevenLabs, avoiding numeric prefixes). The suffix pattern is:
- XILP000_*: pre-flight script scanner (no API, no side effects)
- XILU001_*: voice discovery (browse ElevenLabs voices; --update-cast back-fills role/language_code into a cast JSON)
- XILU002_*: standalone SFX stem generation
- XILU003_*: CSV + SFX/cast annotation utility (joins parsed episode CSV with SFX JSON and cast JSON for review)
- XILU004_*: voice sample generator (audition cast voices)
- XILU005_*: SFX library discovery (--local scans SFX/ directory, default; --api queries ElevenLabs history)
- XILU006_*: parsed JSON splice utility (insert/delete entries with automatic seq renumbering)
- XILU007_*: MP3 hash utility (compute SHA256 checksums for stem files)
- XILU008_*: stem log report (parse daily logs → chronological stem generation CSV with backend/model/hash)
- XILU009_*: workspace migration tool (move pre-0.1.8 legacy layout to normalized configs/, parsed/, daw/ structure)
- XILU010_*: MP3 loudness profiler (peak, average, and minimum dBFS per stem or directory)
- XILU011_*: SFX config CSV flattener (sfx_*.json → one-row-per-effect CSV for audit/debug)
- XILU012_*: parsed JSON CSV exporter (parsed_*.json → one-row-per-entry CSV for audit/debug)
- XILU013_*: SFX config hydrator (writes pipe-hint source fields from parsed JSON into the SFX config)
- XILU014_*: episode summary CSV generator (scans all parsed JSONs → one-row-per-episode CSV with dialogue_lines, words, tts_chars)
- xil_gui.py: Gradio web dashboard (xil-gui entry point); requires [gui] extra; five tabs: Episodes, Audio Preview, Run Stage, Setup (with content type selector), Timeline
- XILP001_*: script parser
- XILP002_*: voice generation (ElevenLabs TTS)
- XILP003_*: audio assembly (stems → master MP3, two-pass multi-track mix)
- XILP004_*: Studio project onboarding (ElevenLabs Studio Projects API)
- XILP005_*: DAW layer export (stems → per-layer WAVs for Audacity)
- XILP006_*: cues sheet ingester (cues markdown → SFX library + sfx config enrichment)
- XILP007_*: stem migrator (diff old vs new parsed JSON, copy unchanged stems, report what needs regen); --dry-run report shows truncated text snippets alongside COPY/NEW/SPEAKER/MISSING entries for visual content verification without cross-referencing JSON files
- XILP008_*: stale stem cleanup (delete stems whose seq no longer matches the current parsed JSON)
- XILP009_*: reverse script generator (parsed JSON → production script markdown)
- XILP010_*: Studio export importer (ElevenLabs Studio ZIP → pipeline stems)
- XILP011_*: final master MP3 export (overlay 4 DAW layer WAVs → single stereo 48 kHz VBR MP3)
- XILP012_*: social media post draft generator (parsed JSON + Claude Haiku → posts/{slug}/{tag}_posts.md; 3 variants: Hype, Quote, Spotlight; requires [publish] extra + ANTHROPIC_API_KEY)
- mix_common.py: shared mixing utilities (timeline, layer builders, fast label helpers) used by XILP003 and XILP005
  - StemPlan.scene (str|None): scene label from parsed JSON, used for scene-scoped vintage filter
  - StemPlan.loop field: True (default) tiles audio, False plays once up to scene boundary
  - StemPlan.pre_trimmed flag: skips play_duration trim for source-based stems already trimmed at copy time
  - StemPlan.volume_percentage (float|None): volume as a percentage (100 = unity, None = no change)
  - StemPlan.ramp_in_seconds / StemPlan.ramp_out_seconds: fade durations in seconds (None = no fade)
  - _resolve_audio_params() resolves volume/ramp from per-effect config or category defaults for MUSIC, AMBIENCE, SFX, and BEAT direction types; volume_percentage, ramp_in_seconds, and ramp_out_seconds each fall back to the global key when no category-specific key exists (e.g. SFX/MUSIC when sfx_volume_percentage/music_ramp_in_seconds are absent from the config defaults)
  - collect_stem_plans() skips stale stems (header entries, type mismatch, speaker mismatch), deduplicates by seq number, and injects synthetic stop-marker StemPlan entries (filepath="") for AMBIENCE: STOP and AMBIENCE: * FADES OUT directives found in the entries index
  - build_sfx_layer() and build_foreground() apply volume_percentage to SFX/BEAT stems
  - build_ambience_layer() skips corrupt or unreadable stem files with a warning rather than crashing
  - apply_vintage_filter() applies a 1960s-era HF roll-off + 1 dB reduction
  - _apply_speaker_filters(segment, filter_val) resolves the cast config filter string and applies the named filter chain (false/None = none, true/"phone" = phone, "vintage" = vintage, "vintage,phone" = both)
  - _vf_engaged_seqs(stem_plans) returns the set of dialogue seq numbers that fall within a VINTAGE FILTER: ENGAGES … VINTAGE FILTER: DISENGAGES span; used by build_foreground() and build_dialogue_layer() to apply the vintage EQ to dialogue within those spans (script-direction spans take precedence over vintage_scenes list fallback)
- sfx_common.py: shared SFX library management, ID3 tagging (tag_mp3, tag_wav), effect generation
  - ensure_shared_asset() retries on 429 rate-limit errors (up to 5 times, linear backoff)
  - load_sfx_entries() accepts direction_types filter set, returns direction_type field in each entry dict, skips entries with duration_seconds=0.0
  - dry_run_sfx() shows per-category credit subtotals in the SUMMARY block
- timeline_viz.py: multitrack timeline visualization; render_terminal_timeline() (ASCII) and render_html_timeline() (interactive HTML); no pydub dependency; HTML bar badges: ri (↑ ramp in, left), ro (↓ ramp out, right-top), pd (% play duration, center), vb (🔊 volume%, right-bottom, shown when volume_pct != 100); applies to music, ambience, and SFX spans
- models.py: Pydantic data models plus get_workspace_root() (respects XIL_PROJECTROOT env var), show_slug(), derive_paths(), resolve_slug() for dynamic show-based path
derivation;DEFAULT_SLUG = "sample"fallback;ProjectConfigmodel withtype/tag_formatfields;TYPE_DEFAULTSdict with gap_ms and stability per content type;derive_paths_legacy()returns pre-0.1.8 paths anchored to workspace root (used by migration tool);derive_paths()auto-detects layout (legacy if root cast config exists, normalized otherwise);load_project_config()/resolve_project_type()helpersxil.py— unified dispatcher that maps subcommands (scan,parse,produce, etc.) to existing modulemain()entry points; prints command list onxil --help;xil-*commands remain supportedxil_init.py— project scaffolding;--type podcast|audiobook|drama|specialselects sample script, speakers.json, and project.jsontypefield; creates new normalized workspace layout (configs/{slug}/)
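The span logic described for _vf_engaged_seqs() can be illustrated with a small sketch. It walks entries in seq order and collects the dialogue seq numbers that sit between an ENGAGES and a DISENGAGES direction. The Entry dataclass, its field names, and the "dialogue"/"direction" type values are hypothetical stand-ins, not the real StemPlan; only the two direction strings come from the description above.

```python
from dataclasses import dataclass

ENGAGE = "VINTAGE FILTER: ENGAGES"
DISENGAGE = "VINTAGE FILTER: DISENGAGES"


@dataclass
class Entry:
    """Hypothetical stand-in for a stem plan entry (not the real StemPlan)."""
    seq: int
    entry_type: str  # assumed values: "dialogue" or "direction"
    text: str = ""


def vf_engaged_seqs(entries: list[Entry]) -> set[int]:
    """Return dialogue seq numbers inside an ENGAGES..DISENGAGES span."""
    engaged = False
    seqs: set[int] = set()
    for entry in sorted(entries, key=lambda e: e.seq):
        if entry.entry_type == "direction":
            if entry.text.startswith(ENGAGE):
                engaged = True
            elif entry.text.startswith(DISENGAGE):
                engaged = False
        elif entry.entry_type == "dialogue" and engaged:
            seqs.add(entry.seq)
    return seqs
```

A span that is never closed stays engaged until the end of the list in this sketch; whether the real helper behaves that way is not stated in this file.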
Cast Configuration
cast_<slug>_S01E01.json (e.g. cast_the413_S01E01.json) contains show-level metadata (show, season, episode, title) and a cast dict mapping character keys to settings:
{
"show": "THE 413", "season": 1, "episode": 1, "title": "The Holiday Shift",
"cast": {
"adam": { "full_name": "Adam Santos", "voice_id": "...", "pan": 0.0, "filter": false, "role": "Host/Narrator" },
"mr_patterson": { "full_name": "Mr. Patterson", "voice_id": "...", "pan": -0.3, "filter": "phone", "role": "Caller" },
"young_adam": { "full_name": "Young Adam", "voice_id": "...", "pan": 0.0, "filter": "vintage", "role": "Flashback" }
}
}
Voice IDs are discovered via XILU001_discover_voices_T2S.py (filters to premade category).
Optional preamble and postamble blocks define the speaker and speed for those sections. The actual dialogue text lives in the script (PREAMBLE/POSTAMBLE sections); the cast config only needs speaker and speed:
{
"preamble": {
"speaker": "tina",
"speed": 0.85
},
"postamble": {
"speaker": "tina",
"speed": 0.85
}
}
Intro/outro music lives in the SFX config under "INTRO MUSIC" / "OUTRO MUSIC" keys — not in the cast config. All TTS generation uses eleven_v3 unconditionally.
SFX Configuration
sfx_<slug>_S01E01.json (e.g. sfx_the413_S01E01.json) maps parsed direction entry text to ElevenLabs Sound Effects API parameters:
{
"show": "THE 413", "season": 1, "episode": 1,
"defaults": { "prompt_influence": 0.3 },
"effects": {
"INTRO MUSIC": { "source": "SFX/The Porch Light.mp3" },
"SFX: PHONE BUZZING": { "prompt": "Phone vibrating buzz", "duration_seconds": 2.0 },
"BEAT": { "type": "silence", "duration_seconds": 1.0 }
}
}
- Keys match the text field of parsed direction entries exactly
- "INTRO MUSIC" / "OUTRO MUSIC" are reserved keys for preamble/postamble music; XILP002 reads their source field to copy the audio file into the appropriate seq-numbered stem — no API generation
- type: "sfx" (default) entries call client.text_to_sound_effects.convert() with the prompt
- type: "silence" entries (BEAT/LONG BEAT) generate local silent audio — no API call
- loop: false entries play the audio file once up to the scene boundary (no tiling); loop: true (default) tiles the file to fill the full scene duration
- volume_percentage — per-effect volume as a percentage (100 = unity, 50 = half volume); applies to SFX, BEAT, MUSIC, and AMBIENCE entries; overrides the category default (sfx_volume_percentage, music_volume_percentage, ambience_volume_percentage) in the defaults block
- play_duration — percentage of file to play (e.g. 45 = play 45% of file duration); for INTRO MUSIC, the trim is applied when copying to the stem file so all downstream tools see the correct duration
- Stop markers: AMBIENCE: STOP and AMBIENCE: * FADES OUT entries use type: "silence", duration_seconds: 0.0; they inject a boundary marker into the mixing timeline without generating audio
- vintage_scenes — optional top-level list of scene labels (e.g. ["scene-3", "scene-4", "scene-5"]); all dialogue in those scenes receives the vintage audio filter (HF roll-off + 1 dB reduction) during assembly; applied after per-speaker filters; omit or leave empty for no vintage treatment; the same scene labels used in the parsed JSON scene field
- SFX stems use _sfx suffix: 002_cold-open_sfx.mp3
Shared SFX Library
Each unique sound effect is generated once into the SFX/ directory as a shared asset (e.g. SFX/beat.mp3, SFX/sfx_phone-buzzing.mp3). Episode stems in stems/<slug>/<TAG>/ are copies of these shared assets with sequence-numbered filenames. This avoids regenerating the same effect for repeated uses (e.g. BEAT appears 26 times in S01E01). See docs/sfx-reuse-guide.md for a workflow guide on maximizing SFX reuse and minimizing API credit spend.
- Shared asset naming: slugify_effect_key() in sfx_common.py converts direction text to filesystem-safe slugs
- --dry-run shows three statuses: EXISTS (episode stem on disk), CACHED (shared asset exists, will be copied), NEW (needs API generation)
- Common SFX functions live in sfx_common.py — both XILU002 and XILP002 delegate to it
- tag_mp3() writes ID3 metadata (Album, Genre, Year, Title, Artist, Lyrics) to MP3 stems
- tag_wav() writes ID3 metadata (Album, Genre, Year, Title, Artist) to WAV layer exports
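To make the naming and status rules above concrete, here is a minimal sketch. slugify_effect_key_sketch is an illustrative guess that reproduces the two documented examples (beat and sfx_phone-buzzing); it is not the real slugify_effect_key() and may differ from it on other inputs. stem_status is a hypothetical helper that applies the EXISTS / CACHED / NEW order described for --dry-run.

```python
import re
from pathlib import Path


def slugify_effect_key_sketch(text: str) -> str:
    """Turn direction text into a filesystem-safe slug (illustrative only)."""
    prefix, sep, rest = text.partition(":")
    if not sep:  # no category prefix, e.g. "BEAT"
        prefix, rest = "", text
    head = "-".join(re.findall(r"[a-z0-9]+", prefix.lower()))
    body = "-".join(re.findall(r"[a-z0-9]+", rest.lower()))
    return f"{head}_{body}" if head else body


def stem_status(stem_path: Path, shared_asset: Path) -> str:
    """Classify a stem the way the dry-run report describes."""
    if stem_path.exists():
        return "EXISTS"  # episode stem already on disk
    if shared_asset.exists():
        return "CACHED"  # shared asset can be copied, no API call
    return "NEW"  # would need API generation
```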
Standalone SFX Utility
XILU002_generate_SFX.py — Generates SFX stems independently of XILP002 voice generation.
xil sfx --episode S01E01 --dry-run
xil sfx --episode S01E01 --gen-sfx
xil sfx --episode S01E01 --gen-music
xil sfx --episode S01E01 --gen-ambience
xil sfx --episode S01E01 --max-duration 5.0
xil sfx --episode S01E01 --local-only
xil sfx --episode S01E01
- --episode or --tag (one required) derives cast_<slug>_S01E01.json and sfx_<slug>_S01E01.json
- --show overrides the show name used for slug derivation (see Project Configuration)
- Reads: parsed script JSON + SFX config + cast config (for episode tag)
- Outputs: shared assets to SFX/, episode stems to stems/<slug>/<TAG>/
- --dry-run shows EXISTS/CACHED/NEW status per stem with credit estimates
- --gen-sfx, --gen-music, --gen-ambience filter generation to the specified categories; omitting all three processes all categories
- --dry-run SUMMARY now shows per-category credit subtotals (MUSIC / AMBIENCE / SFX / silence)
- --max-duration N filters to effects ≤ N seconds (controls API credit spend)
- --local-only skips any effect not already present in SFX/; only CACHED assets and silence entries are placed, no API calls made
- 429 rate-limit errors are retried automatically up to 5 times with linear backoff (10s, 20s, 30s, 40s, 50s)
- Skips stems that already exist on disk
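The retry behaviour listed above (up to 5 retries, waiting 10s, 20s, 30s, 40s, 50s) is a linear backoff. A minimal sketch of the pattern follows. RateLimitError and call_with_linear_backoff are hypothetical names, and the real ensure_shared_asset() may detect 429 responses differently. The sleep function is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by an API client."""


def call_with_linear_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    step_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with waits of 10s, 20s, 30s, ..."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            sleep(step_seconds * (attempt + 1))
    raise AssertionError("unreachable")
```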
CSV Annotation Utility
XILU003_csv_sfx_join.py — Joins a parsed episode CSV with the SFX JSON and cast JSON, producing an annotated review CSV with SFX prompt, duration, and cast metadata columns appended alongside each dialogue and direction entry.
- --episode or --tag (one required) derives parsed/parsed_<slug>_{TAG}.csv, sfx_<slug>_{TAG}.json, and cast_<slug>_{TAG}.json
- --show overrides the show name used for slug derivation (see Project Configuration)
- --csv, --sfx, --cast override individual input paths
- --output overrides the output CSV path (default: parsed/annotated_<slug>_{TAG}.csv)
- No API key required — read-only join utility
Voice Sample Utility
XILU004_sample_voices_T2S.py — Generates a short TTS sample for each cast member to audition voice assignments.
xil sample --episode S02E03 --dry-run
xil sample --episode S02E03
xil sample --episode S02E03 --backend chatterbox
xil sample --episode S02E03 --backend gtts
xil sample --episode S02E03 --force
- --episode or --tag (one required) or --cast PATH to specify the cast config
- --show overrides the show name used for slug derivation (see Project Configuration)
- --backend elevenlabs|gtts|chatterbox (default: elevenlabs): selects TTS backend for sample generation
- Default sample text: "I am {name} not yo momma"; override with --sample-text (use {name} placeholder)
- Output: voice_samples/{TAG}/{backend}/{actor}.mp3 — backend subdirectory enables side-by-side comparison
- Skips members with voice_id=TBD (ElevenLabs only); --force regenerates existing samples
- --chatterbox-python PATH, --voice-refs DIR, --exaggeration FLOAT — Chatterbox-specific options (same as xil-produce)
- Requires ELEVENLABS_API_KEY for --backend elevenlabs
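The sample-text placeholder and the output layout described above can be sketched as two small helpers. Both function names are hypothetical. The default text and the voice_samples/{TAG}/{backend}/{actor}.mp3 pattern are taken from the list above.

```python
from pathlib import Path

DEFAULT_SAMPLE_TEXT = "I am {name} not yo momma"


def sample_text_for(name: str, template: str = DEFAULT_SAMPLE_TEXT) -> str:
    """Fill the {name} placeholder; other braces in the template are left alone."""
    return template.replace("{name}", name)


def sample_output_path(
    tag: str, backend: str, actor: str, root: Path = Path("voice_samples")
) -> Path:
    """Build voice_samples/{TAG}/{backend}/{actor}.mp3."""
    return root / tag / backend / f"{actor}.mp3"
```

Keeping the backend as its own directory level means the same actor key can hold one sample per backend without filename collisions.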
SFX Library Discovery
XILU005_discover_SFX.py — Lists and searches the local shared SFX asset library.
xil sfx-lib # local scan (default)
xil sfx-lib --local # explicit local scan
xil sfx-lib --sfx-dir SFX/ # override local scan directory
xil sfx-lib --search "diner" # filter by keyword
xil sfx-lib --json # machine-readable output
xil sfx-lib --api # attempt API history (not publicly accessible)
xil sfx-lib --api --all # paginate full API history (default: most recent 100)
- Default mode: scans SFX/ directory (equivalent to --local) and reports all assets with duration and file size
- --local / --api are mutually exclusive mode flags; --local is the default
- --sfx-dir DIR overrides the local scan directory (default: SFX/)
- --search TEXT filters results by case-insensitive substring match on filename/prompt
- --json outputs results as a JSON array
- --verbose / -v prints all metadata fields per asset
- --api attempts to query ElevenLabs sound generation history (endpoint is not publicly accessible as of March 2026 regardless of API key permissions)
- --all (API mode only) paginates through the full account history; default retrieves only the most recent 100 results
- --export-kit [DIR] generates an SFX inventory JSON (sfx_inventory.json) and copies the scriptwriter reference doc (claude-scriptwriter-reference.md) into DIR (default: current directory); attach both files to a Claude project as knowledge files to enable SFX-aware script writing
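As an illustration of the --search behaviour (case-insensitive substring match on filename or prompt), a filter over asset records might look like the following. The record shape (dicts with filename and prompt keys) and the function name are assumptions made for this sketch.

```python
def filter_assets(assets: list[dict], search: str) -> list[dict]:
    """Keep assets whose filename or prompt contains the search text, ignoring case."""
    needle = search.casefold()
    return [
        asset
        for asset in assets
        if needle in (asset.get("filename") or "").casefold()
        or needle in (asset.get("prompt") or "").casefold()
    ]
```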
Parsed JSON Splice Utility
XILU006_splice_parsed.py — Inserts entries into or deletes entries from a parsed episode JSON with automatic seq renumbering.
xil splice --episode S02E03 --insert-after 322 \
--from-parsed parsed/parsed_the413_S02E02.json --from-seq-range 232-233 \
--section post-interview --dry-run
xil splice --episode S02E03 --delete-seq-range 100-105 --dry-run
xil splice --episode S02E03 --insert-after 322 \
--from-json new_entries.json
- --episode or --tag (one required) derives target parsed JSON path
- --show overrides the show name used for slug derivation (see Project Configuration)
- --parsed PATH overrides target parsed JSON path
- --insert-after N — seq number to insert after
- --from-parsed PATH + --from-seq-range N-M — extract entries from another parsed JSON by seq range
- --from-json PATH — read entries from a standalone JSON array file
- --section / --scene — override section/scene on inserted entries (default: inherit from insertion point)
- --delete-seq-range N-M — remove entries in range and renumber (can combine with insertion: delete first, then insert)
- --dry-run — show plan without writing files
- --no-backup — skip writing backup file
- --quiet — summary only, no per-entry detail
- Before modifying, writes parsed/pre_splice_parsed_<slug>_<TAG>.json as a backup (compatible with XILP007 --orig-prefix pre_splice_)
- Preamble entries (seq <= 0) are never renumbered or deleted
- Recomputes the stats block after modification
- No ElevenLabs API key required — no API calls made
Stem Log Report
XILU008_stem_log_report.py — Parses daily pipeline log files into a chronological stem generation CSV. Useful for auditing what was generated, when, with which backend, and confirming SHA256 checksums.
xil-stem-log --episode S03E03
xil-stem-log --episode S03E03 --since 2026-04-01
xil-stem-log --slug the413
xil-stem-log --logs-dir logs/ --output stem_log_report.csv
xil-stem-log --episode S03E03 --audit
xil-stem-log --episode S03E03 --audit --audit-threshold 20
- --episode TAG (optional) filters records to a specific episode tag (e.g. S03E03); matched against stem_path
- --slug SLUG (optional) filters records to a specific show slug (e.g. the413); matched against stem_path
- --logs-dir DIR path to log directory (default: logs/)
- --output PATH output CSV path (default: stem_log_report.csv); use - for stdout
- --since DATE filter to logs on or after the given date (YYYY-MM-DD)
- --show print CSV to stdout (equivalent to --output -)
- --audit cross-references logged char_count values against the current parsed JSON and flags stems whose logged character count differs from the current text length by more than --audit-threshold percent (default: 10); useful for detecting stale stems after script edits
- --audit-threshold N percentage threshold for the --audit flag (default: 10)
- Parses logs/xil_YYYY-MM-DD.log files; three regex patterns match ElevenLabs, gTTS, and Chatterbox generation lines
- State machine: generation line → saved line → SHA256 line → emits one record
- run_index counter increments per Phase 1: Generating marker, grouping stems by production run
- Output columns: log_date, log_file, run_index, log_line, seq, speaker, backend, char_count, stem_path, stem_filename, sha256
- No ElevenLabs API key required — reads local log files only
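The three-step state machine above can be sketched like this. The log line formats matched by GEN_RE, SAVED_RE, and SHA_RE are invented for illustration, since the real log format is not shown in this file, and a single generation pattern with a backend group stands in for the three backend-specific patterns. Only the generation → saved → SHA256 sequence and the Phase 1: Generating run marker come from the description.

```python
import re

# Hypothetical line formats, assumed for this sketch only.
GEN_RE = re.compile(
    r"\[(?P<backend>\w+)\] seq=(?P<seq>-?\d+) speaker=(?P<speaker>\S+) chars=(?P<chars>\d+)"
)
SAVED_RE = re.compile(r"Saved: (?P<path>\S+\.mp3)")
SHA_RE = re.compile(r"SHA256: (?P<sha>[0-9a-f]{64})")
RUN_MARKER = "Phase 1: Generating"


def parse_log_lines(lines: list[str]) -> list[dict]:
    """Emit one record per generation -> saved -> SHA256 sequence."""
    records: list[dict] = []
    run_index = 0
    pending: dict | None = None
    for line_no, line in enumerate(lines, start=1):
        if RUN_MARKER in line:
            run_index += 1  # a new production run starts
            pending = None
            continue
        gen = GEN_RE.search(line)
        if gen:
            pending = {
                "run_index": run_index,
                "log_line": line_no,
                "seq": int(gen["seq"]),
                "speaker": gen["speaker"],
                "backend": gen["backend"],
                "char_count": int(gen["chars"]),
            }
            continue
        if pending is None:
            continue
        saved = SAVED_RE.search(line)
        if saved and "stem_path" not in pending:
            pending["stem_path"] = saved["path"]
            continue
        sha = SHA_RE.search(line)
        if sha and "stem_path" in pending:
            pending["sha256"] = sha["sha"]
            records.append(pending)  # sequence complete
            pending = None
    return records
```

A generation line that is never followed by a saved line produces no record in this sketch, so incomplete runs do not appear in the output.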
Web Dashboard
xil_gui.py — Gradio browser dashboard that supplements the CLI. Five tabs: Episodes (workspace overview), Audio Preview (browse and play stems), Run Stage (launch pipeline stages with live log streaming), Setup (initialize a new workspace with content-type selector), Timeline (interactive HTML timeline). Requires the [gui] optional extra.
pip install 'xil-pipeline[gui]'
xil-gui # opens http://localhost:7860
xil-gui --share # generates a public ngrok URL for partner access (72h, no auth)
xil-gui --port 8080 # custom port
xil gui # via unified dispatcher
Episode detection checks both legacy root cast_{slug}_{tag}.json and normalized configs/{slug}/cast_{tag}.json locations, so both old and new workspaces show up.
- Episodes tab: workspace overview — all detected episodes with parse/stems/DAW/master status; episode dropdowns show [Arc] — Title next to the tag (read from cast config); Episodes table has a Title [Arc] column
- Audio Preview tab: episode + filter selector (all/dialogue/sfx/music/ambience), stem dropdown, in-browser MP3 playback via gr.Audio; stem labels enriched from parsed JSON when available
- Run Stage tab: select episode + stage (assemble/daw/master/produce/parse), dry-run checkbox (default on), extra flags field; live stdout streaming via generator + demo.queue()
- Timeline tab: embeds the daw/{TAG}/{TAG}_timeline.html iframe if generated; prompts xil daw --timeline-html when absent
- --share uses Gradio's ngrok tunnel — open access, suitable for trusted collaborators during a session only
- allowed_paths=[os.getcwd()] enables Gradio to serve local MP3/WAV/HTML files
- Subprocess isolation: each stage run is a fresh subprocess.Popen so the GUI stays alive if a stage errors
- No ElevenLabs API key required for the GUI itself (stages may require it depending on flags)
Workspace Migration
XILU009_migrate_workspace.py — moves pre-0.1.8 workspace files to the normalized layout in one idempotent pass.
Developer/Maintainer Rules
Automated testing via Python and Bash is the fundamental mechanism for the Verification Loop. Before beginning any task, Claude must state how it will verify its work.
Claude must write tests for everything it implements:
- Determine which tests are appropriate, then write a test for every feature built
- Test-Driven Development (TDD): write the tests for a new feature first, then the implementation
Documentation Currency Rule
After executing any plan that changes pipeline behaviour, CLI flags, file formats, or module interfaces, both CLAUDE.md (root) and docs/pipeline.md must be updated to reflect those changes before committing. This applies equally to Claude and human contributors. Specifically:
- New CLI flags or flag removals → update the relevant stage description in CLAUDE.md and the corresponding section/sequence diagram in pipeline.md
- New module fields or dataclass additions → update mix_common.py / sfx_common.py bullet points in CLAUDE.md
- New SFX config keys or behaviours → update the SFX Configuration section
- New pipeline stages or utilities → add a XILP/XILU entry under File Naming Convention and a stage section in pipeline.md
- Any behavioural change visible to operators → update the relevant stage bullets in CLAUDE.md
If a plan is large enough to have its own plan file, tick this as the final step before closing the plan.
Script Entry Point Style
Always use the if __name__ == "__main__": idiom. All application logic that would otherwise follow it must live inside a main() function — the dunder-main block must contain only the call to main():
import argparse

def main():
    parser = argparse.ArgumentParser(...)
    args = parser.parse_args()
    # all application logic here

if __name__ == "__main__":
    main()
This keeps the __main__ block to a single line, makes the entry point testable by calling main() directly, and prevents module-level side effects when the file is imported.
Running Tests
Man Pages
Unix man pages for all 23 CLI commands are pre-generated and committed to man/man1/. They are installed automatically when the package is built into a wheel and installed via pip.
Regenerating after CLI changes (run whenever flags or descriptions change):
pip install -e ".[dev]" # includes argparse-manpage
python docs/build_man.py # regenerate all 20 argparse-based pages
# xil.1 is hand-crafted — edit man/man1/xil.1 directly when the dispatcher changes
Regenerate a single page: python docs/build_man.py xil-parse
Always commit the regenerated .1 files alongside any CLI flag change. The get_parser() function in each module (extracted from main()) is what build_man.py calls to obtain the parser — keep it in sync with any add_argument changes.
Post-install access on Debian (for pip install --user):
# Pages land in ~/.local/share/man/man1/ — add to ~/.bashrc:
export MANPATH="$HOME/.local/share/man:$(manpath 2>/dev/null)"
# Then:
man xil-parse
# For apropos/whatis support:
mandb --user-db ~/.local/share/man
System-wide installs (sudo pip install) land in /usr/local/share/man/man1/ and are indexed by default (sudo mandb to refresh).
Key Directories
- src/xil_pipeline/ — Python package (all pipeline and utility scripts, shared modules)
- tests/ — Automated test suite (pytest)
- scripts/ — Source markdown production scripts (authored manually)
- parsed/ — Parser JSON output (generated, cacheable)
- cues/ — Sound cues & music prompts markdown files (authored manually); cues_manifest_<TAG>.json generated by XILP006
- stems/<slug>/<TAG>/ — Individual voice/SFX audio files per episode (generated, expensive to recreate)
- SFX/ — Shared SFX asset library (generated once, reused across episodes); cues-sheet assets named by asset ID (e.g. sfx-boots-stamp-01.mp3)
- daw/<TAG>/ — Per-layer WAV exports for DAW mixing (generated by XILP005)
- venv/ — Python virtualenv (do not commit)