XILP007 Stem Migrator
src.xil_pipeline.XILP007_stem_migrator
Migrate episode stems when a parsed script is revised.
Compares an old and new parsed JSON, copies unchanged stems to their new seq-numbered filenames, and reports which entries need fresh TTS/SFX generation. Run XILP002 afterwards — it skips stems that already exist on disk, so only the gaps get generated.
Usage:
python XILP007_stem_migrator.py --episode S02E03 [--dry-run] [--strict]
python XILP007_stem_migrator.py \
    --old parsed/orig_parsed_<slug>_S02E03.json \
    --new parsed/parsed_<slug>_S02E03.json \
    --stems stems/S02E03 [--dry-run] [--strict]
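The two invocation forms above could be accepted by a parser along the following lines. This is a minimal sketch: only the flag names come from the usage text, while the function name `build_parser_sketch`, the help strings, and the lack of any cross-flag validation are assumptions. It is not the module's actual `get_parser`.

```python
import argparse


def build_parser_sketch() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage text."""
    parser = argparse.ArgumentParser(description="Migrate episode stems (sketch).")
    # Short form: derive all paths from the episode tag.
    parser.add_argument("--episode", help="Episode tag, e.g. S02E03")
    # Long form: name every path explicitly.
    parser.add_argument("--old", help="Path to the old parsed JSON")
    parser.add_argument("--new", help="Path to the new parsed JSON")
    parser.add_argument("--stems", help="Directory holding the episode stems")
    # Flags shared by both forms.
    parser.add_argument("--dry-run", action="store_true", help="Report only")
    parser.add_argument("--strict", action="store_true", help="Strict matching")
    return parser


args = build_parser_sketch().parse_args(["--episode", "S02E03", "--dry-run"])
print(args.episode, args.dry_run, args.strict)  # prints: S02E03 True False
```

argparse turns `--dry-run` into the attribute `dry_run`, so downstream code can read it directly.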
MigrationAction
dataclass
Describes what should happen to one new parsed entry.
Source code in src/xil_pipeline/XILP007_stem_migrator.py
normalize_text
Normalize entry text for comparison.
Always collapses whitespace. In fuzzy mode (the default), it also normalizes em-dashes, ellipses, and curly quotes so that punctuation-only edits do not force unnecessary regeneration.
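To make the fuzzy behaviour concrete, here is a minimal sketch of this kind of normalization. The function name `normalize_text_sketch`, the `fuzzy` parameter, and the exact replacement table are assumptions. The real function may map these characters differently.

```python
import re

# Assumed replacement table: typographic characters and their ASCII stand-ins.
_PUNCT_MAP = {
    "\u2014": "-",    # em-dash
    "\u2026": "...",  # single-character ellipsis
    "\u2018": "'",    # left curly single quote
    "\u2019": "'",    # right curly single quote
    "\u201c": '"',    # left curly double quote
    "\u201d": '"',    # right curly double quote
}


def normalize_text_sketch(text: str, fuzzy: bool = True) -> str:
    """Collapse whitespace; in fuzzy mode also flatten typographic punctuation."""
    if fuzzy:
        for src, dst in _PUNCT_MAP.items():
            text = text.replace(src, dst)
    # Whitespace is always collapsed, regardless of mode.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_text_sketch("Wait\u2026   she said \u201cgo\u201d"))
# prints: Wait... she said "go"
```

With `fuzzy=False`, the same input keeps its curly quotes and ellipsis character, so a punctuation-only edit would register as a text change.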
make_stem_name
Return the expected stem filename (no directory) for a parsed entry.
Dialogue: {seq:03d}{section}[-{scene}]{speaker}.mp3
Direction: {seq:03d}{section}[-{scene}]_sfx.mp3
Preamble: n{abs(seq):03d}{section}_{speaker_or_sfx}.mp3
build_old_index
build_old_index(old_entries: list[dict], stems_dir: str, strict: bool = False) -> tuple[dict[tuple[str, str], dict], dict[str, dict]]
Build two lookups from old entries.
Returns:
- exact_index (dict[tuple[str, str], dict]): (normalized_text, role) → record; the primary match
- text_index (dict[str, dict]): normalized_text → record; the fallback for speaker-change detection
When multiple old entries share the same key (e.g. repeated BEATs) the first occurrence is kept so that many-to-one cases favour reuse.
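The first-occurrence rule can be illustrated with `dict.setdefault`, which only writes a key that is not already present. This is a simplified sketch. The entry field names `text` and `speaker` are hypothetical, and the `stems_dir` and `strict` parameters of the real function are left out.

```python
def build_indexes_sketch(old_entries: list[dict]) -> tuple[dict, dict]:
    """Build (text, role) and text-only lookups; the first occurrence wins."""
    exact_index: dict[tuple[str, str], dict] = {}
    text_index: dict[str, dict] = {}
    for entry in old_entries:
        # Whitespace collapse stands in for the full normalization step.
        text = " ".join(entry["text"].split())
        role = entry["speaker"]
        # setdefault leaves an existing key untouched, so later duplicates
        # never overwrite the first record.
        exact_index.setdefault((text, role), entry)
        text_index.setdefault(text, entry)
    return exact_index, text_index


old = [
    {"seq": 4, "text": "BEAT", "speaker": "sfx"},
    {"seq": 9, "text": "BEAT", "speaker": "sfx"},
]
exact, by_text = build_indexes_sketch(old)
print(exact[("BEAT", "sfx")]["seq"])  # prints: 4
```

Both repeated BEAT entries in a revised script would then resolve to the stem recorded for seq 4, which is the reuse-friendly behaviour described above.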
plan_migration
plan_migration(old_entries: list[dict], new_entries: list[dict], stems_dir: str, strict: bool = False) -> list[MigrationAction]
Compare old and new parsed entries and produce a migration plan.
Each new entry that produces a stem gets one MigrationAction whose status is COPY, SPEAKER, NEW, MISSING, or SKIP.
Matching is two-phase:
- Exact: (normalized_text, speaker) — safe to copy or detect MISSING.
- Text-only fallback (dialogue only): same text, different speaker → SPEAKER status so the user knows why regeneration is needed.
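The two phases can be sketched as a single classification function. Several things here are assumptions: the field names `text`, `speaker`, and `type`, the `stem_exists` callback, and the reading of MISSING as "matched, but the old stem file is not on disk". The SKIP status is omitted because its trigger is not described here.

```python
from typing import Callable


def classify_sketch(
    new_entry: dict,
    exact_index: dict[tuple[str, str], dict],
    text_index: dict[str, dict],
    stem_exists: Callable[[dict], bool],
) -> str:
    """Return COPY, MISSING, SPEAKER, or NEW for one new entry (sketch)."""
    text = " ".join(new_entry["text"].split())
    # Phase 1: exact match on (normalized_text, speaker).
    old = exact_index.get((text, new_entry["speaker"]))
    if old is not None:
        # Assumed meaning of MISSING: the match has no stem file to copy.
        return "COPY" if stem_exists(old) else "MISSING"
    # Phase 2: text-only fallback, applied to dialogue entries only.
    if new_entry.get("type") == "dialogue" and text in text_index:
        return "SPEAKER"
    return "NEW"


record = {"stem": "001.mp3"}
exact = {("Hello there.", "ADA"): record}
by_text = {"Hello there.": record}
entry = {"text": "Hello there.", "speaker": "BOB", "type": "dialogue"}
print(classify_sketch(entry, exact, by_text, lambda r: True))  # prints: SPEAKER
```

Checking the exact key first means a reassigned line is only reported as SPEAKER when no same-speaker match exists.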
execute_migration
execute_migration(actions: list[MigrationAction], stems_dir: str, dry_run: bool = True) -> dict[str, int]
Copy files according to the plan; return status counts.
Only COPY actions with differing src/dst paths result in file I/O. All other statuses are counted but produce no side effects.
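The copy rule can be sketched as follows. Because the fields of `MigrationAction` are not listed here, the sketch represents each action as a hypothetical `(status, src_name, dst_name)` tuple. The use of `shutil.copy2` is likewise an assumption about how the copy might be done.

```python
import shutil
from collections import Counter
from pathlib import Path


def execute_sketch(
    actions: list[tuple[str, str, str]],
    stems_dir: str,
    dry_run: bool = True,
) -> dict[str, int]:
    """Count every status; copy only COPY actions whose names differ."""
    counts: Counter = Counter()
    root = Path(stems_dir)
    for status, src_name, dst_name in actions:
        counts[status] += 1
        # A dry run still counts the action but never touches the disk.
        if status == "COPY" and src_name != dst_name and not dry_run:
            shutil.copy2(root / src_name, root / dst_name)
    return dict(counts)
```

Defaulting `dry_run` to `True`, as the documented signature does, means a caller has to opt in before any file is written.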
print_report
Print per-stem migration details.
print_summary
Print a one-page summary.
get_parser
main
CLI entry point for stem migration.