Timeline Viz

src.xil_pipeline.timeline_viz

Multitrack timeline visualization for the audio pipeline.

Renders a visual representation of asset placement across the audio layers (dialogue, ambience, music, SFX, and an optional vintage-filter layer). Two output formats are supported:

Terminal ASCII timeline — printed to stdout, auto-scaled to terminal width.
HTML interactive timeline — self-contained file with hover tooltips and zoom.

No pydub dependency — consumes label tuples only.

Usage (from XILP005):
    python XILP005_daw_export.py --episode S02E03 --timeline
    python XILP005_daw_export.py --episode S02E03 --timeline-html

LayerSpan `dataclass`

A single asset placement on the timeline.

Attributes:

start_s (float) –

Start time in seconds.
end_s (float) –

End time in seconds.
label (str) –

Human-readable label (speaker name, SFX text, etc.).
ramp_in_s (float | None) –

Fade-in duration in seconds, or None if not set.
ramp_out_s (float | None) –

Fade-out duration in seconds, or None if not set.
play_duration (float | None) –

Percentage of file to play, or None if not set.
snippet (str | None) –

First 5 words of dialogue text for HTML tooltip, or None.
volume_pct (float | None) –

Volume percentage (100 = unity), or None if not set.
seq (int | None) –

Sequence number from the parsed script, or None.

Source code in src/xil_pipeline/timeline_viz.py

@dataclass
class LayerSpan:
    """A single asset placement on the timeline.

    Attributes:
        start_s: Start time in seconds.
        end_s: End time in seconds.
        label: Human-readable label (speaker name, SFX text, etc.).
        ramp_in_s: Fade-in duration in seconds, or ``None`` if not set.
        ramp_out_s: Fade-out duration in seconds, or ``None`` if not set.
        play_duration: Percentage of file to play, or ``None`` if not set.
        snippet: First 5 words of dialogue text for HTML tooltip, or ``None``.
        volume_pct: Volume percentage (100 = unity), or ``None`` if not set.
        seq: Sequence number from the parsed script, or ``None``.
    """

    start_s: float
    end_s: float
    label: str
    ramp_in_s: float | None = None
    ramp_out_s: float | None = None
    play_duration: float | None = None
    snippet: str | None = None
    volume_pct: float | None = None
    seq: int | None = None
    tts_model: str | None = None

start_s `instance-attribute`

start_s: float

end_s `instance-attribute`

end_s: float

label `instance-attribute`

label: str

ramp_in_s `class-attribute` `instance-attribute`

ramp_in_s: float | None = None

ramp_out_s `class-attribute` `instance-attribute`

ramp_out_s: float | None = None

play_duration `class-attribute` `instance-attribute`

play_duration: float | None = None

snippet `class-attribute` `instance-attribute`

snippet: str | None = None

volume_pct `class-attribute` `instance-attribute`

volume_pct: float | None = None

seq `class-attribute` `instance-attribute`

seq: int | None = None

tts_model `class-attribute` `instance-attribute`

tts_model: str | None = None

__init__

__init__(start_s: float, end_s: float, label: str, ramp_in_s: float | None = None, ramp_out_s: float | None = None, play_duration: float | None = None, snippet: str | None = None, volume_pct: float | None = None, seq: int | None = None, tts_model: str | None = None) -> None
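
As an illustration of how the fields above combine, the following minimal sketch builds two spans. It declares a local stand-in dataclass with the same fields rather than importing the real module, and the speaker name, snippet, and numeric values are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayerSpan:
    """Local stand-in mirroring the documented fields (not the real import)."""

    start_s: float
    end_s: float
    label: str
    ramp_in_s: Optional[float] = None
    ramp_out_s: Optional[float] = None
    play_duration: Optional[float] = None
    snippet: Optional[str] = None
    volume_pct: Optional[float] = None
    seq: Optional[int] = None
    tts_model: Optional[str] = None


# A dialogue line: the three required fields plus a tooltip snippet and seq.
line = LayerSpan(12.0, 15.5, "NARRATOR", snippet="It was a dark and", seq=7)

# A music bed with fades at 80% volume (illustrative values).
bed = LayerSpan(0.0, 42.0, "theme", ramp_in_s=2.0, ramp_out_s=4.0, volume_pct=80.0)

duration_s = line.end_s - line.start_s
print(duration_s)   # 3.5
print(bed.snippet)  # None (optional fields default to None)
```

Only start_s, end_s, and label are required; every other field can be left unset and passed by keyword when needed.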

TimelineData `dataclass`

Complete timeline data for all four layers.

Attributes:

tag (str) –

Episode tag (e.g. "S02E03").
total_duration_s (float) –

Total episode duration in seconds.
layers (dict[str, list[LayerSpan]]) –

Mapping of layer name to list of `LayerSpan` instances.
sections (list[LayerSpan]) –

Script-section bands (cold-open, act1, …) for the structure ruler. Kept out of layers so they are not counted as assets.
scenes (list[LayerSpan]) –

Script-scene bands (scene-1, scene-2, …) for the structure ruler.

Source code in src/xil_pipeline/timeline_viz.py

@dataclass
class TimelineData:
    """Complete timeline data for all four layers.

    Attributes:
        tag: Episode tag (e.g. ``"S02E03"``).
        total_duration_s: Total episode duration in seconds.
        layers: Mapping of layer name to list of :class:`LayerSpan` instances.
        sections: Script-section bands (cold-open, act1, …) for the structure
            ruler.  Kept out of ``layers`` so they are not counted as assets.
        scenes: Script-scene bands (scene-1, scene-2, …) for the structure ruler.
    """

    tag: str
    total_duration_s: float
    layers: dict[str, list[LayerSpan]] = field(default_factory=dict)
    sections: list[LayerSpan] = field(default_factory=list)
    scenes: list[LayerSpan] = field(default_factory=list)

tag `instance-attribute`

tag: str

total_duration_s `instance-attribute`

total_duration_s: float

layers `class-attribute` `instance-attribute`

layers: dict[str, list[LayerSpan]] = field(default_factory=dict)

sections `class-attribute` `instance-attribute`

sections: list[LayerSpan] = field(default_factory=list)

scenes `class-attribute` `instance-attribute`

scenes: list[LayerSpan] = field(default_factory=list)

__init__

__init__(tag: str, total_duration_s: float, layers: dict[str, list[LayerSpan]] = dict(), sections: list[LayerSpan] = list(), scenes: list[LayerSpan] = list()) -> None
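
The rendered signature shows `dict()` and `list()` as defaults, but the source declares them with `field(default_factory=...)`, so each instance receives its own fresh containers. The sketch below demonstrates that behaviour with a local stand-in class; the simplified `dict` and `list` annotations and the sample values are assumptions for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineData:
    """Stand-in using the same default_factory declarations as the source."""

    tag: str
    total_duration_s: float
    layers: dict = field(default_factory=dict)
    sections: list = field(default_factory=list)
    scenes: list = field(default_factory=list)


a = TimelineData("S02E03", 600.0)
b = TimelineData("S02E04", 540.0)

# Mutating one instance's containers leaves the other untouched.
a.layers["dialogue"] = []
a.sections.append("cold-open")

print(b.layers)    # {}
print(b.sections)  # []
```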

build_timeline_data

build_timeline_data(tag: str, total_s: float, dlg_labels: list, amb_labels: list, mus_labels: list, sfx_labels: list, vf_labels: list | None = None, *, section_bands: list | None = None, scene_bands: list | None = None) -> TimelineData

Wrap the layer label lists into a `TimelineData` object.

Label tuples may be 3-element (start_s, end_s, text), 5-element (start_s, end_s, text, ramp_in_s, ramp_out_s), 6-element (start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration), or 7-element (start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration, snippet). Longer tuples are also accepted: positions 8 to 10, when present, are read as volume_pct, seq, and tts_model. Any missing trailing position defaults to None.

Parameters:

tag (str) –

Episode tag.
total_s (float) –

Total episode duration in seconds.
dlg_labels (list) –

Dialogue label 7-tuples (start_s, end_s, speaker, None, None, None, snippet).
amb_labels (list) –

Ambience label tuples (may carry ramp data).
mus_labels (list) –

Music label tuples (may carry ramp data).
sfx_labels (list) –

SFX label tuples.
vf_labels (list | None, default: None ) –

Vintage filter label tuples (may carry ramp data).
section_bands (list | None, default: None ) –

Script-section (start_s, end_s, label) tuples from `mix_common.derive_structure_bands`, for the structure ruler.
scene_bands (list | None, default: None ) –

Script-scene (start_s, end_s, label) tuples.

Returns:

TimelineData –

A populated `TimelineData` instance.
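
The variable-length tuple handling can be sketched independently of the module. The helpers below (`pad_label` and `label_to_dict`) are hypothetical names introduced for this example; they reproduce the "missing trailing positions become None" rule that the source implements with per-index length checks.

```python
FIELDS = (
    "start_s", "end_s", "label", "ramp_in_s", "ramp_out_s",
    "play_duration", "snippet", "volume_pct", "seq", "tts_model",
)


def pad_label(tup, width=len(FIELDS)):
    """Right-pad a label tuple with None up to `width` positions."""
    return tuple(tup) + (None,) * (width - len(tup))


def label_to_dict(tup):
    """Map a 3- to 10-element label tuple onto the span field names."""
    return dict(zip(FIELDS, pad_label(tup)))


sfx = label_to_dict((3.0, 3.4, "door slam"))        # 3-element tuple
amb = label_to_dict((0.0, 60.0, "rain", 2.0, 5.0))  # 5-element tuple with ramps

print(sfx["ramp_in_s"])   # None
print(amb["ramp_out_s"])  # 5.0
print(amb["snippet"])     # None
```

Because the padding is positional, a caller who wants to set seq without ramps still has to supply None placeholders for the positions in between.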

Source code in src/xil_pipeline/timeline_viz.py

def build_timeline_data(
    tag: str,
    total_s: float,
    dlg_labels: list,
    amb_labels: list,
    mus_labels: list,
    sfx_labels: list,
    vf_labels: list | None = None,
    *,
    section_bands: list | None = None,
    scene_bands: list | None = None,
) -> TimelineData:
    """Wrap the layer label lists into a :class:`TimelineData` object.

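    Label tuples may be 3-element ``(start_s, end_s, text)``,
    5-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s)``,
    6-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration)``, or
    7-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration, snippet)``.
    Longer tuples may append ``volume_pct``, ``seq``, and ``tts_model`` (positions
    8 to 10); any missing trailing position defaults to ``None``.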

    Args:
        tag: Episode tag.
        total_s: Total episode duration in seconds.
        dlg_labels: Dialogue label 7-tuples ``(start_s, end_s, speaker, None, None, None, snippet)``.
        amb_labels: Ambience label tuples (may carry ramp data).
        mus_labels: Music label tuples (may carry ramp data).
        sfx_labels: SFX label tuples.
        vf_labels: Vintage filter label tuples (may carry ramp data).
        section_bands: Script-section ``(start_s, end_s, label)`` tuples from
            :func:`mix_common.derive_structure_bands`, for the structure ruler.
        scene_bands: Script-scene ``(start_s, end_s, label)`` tuples.

    Returns:
        A populated :class:`TimelineData` instance.
    """
    def to_spans(labels):
        spans = []
        for tup in labels:
            s, e, t = tup[0], tup[1], tup[2]
            ri = tup[3] if len(tup) > 3 else None
            ro = tup[4] if len(tup) > 4 else None
            pd = tup[5] if len(tup) > 5 else None
            sn = tup[6] if len(tup) > 6 else None
            vp = tup[7] if len(tup) > 7 else None
            sq = tup[8] if len(tup) > 8 else None
            tm = tup[9] if len(tup) > 9 else None
            spans.append(LayerSpan(s, e, t, ri, ro, pd, sn, vp, sq, tm))
        return spans

    layers = {
        "dialogue":       to_spans(dlg_labels),
        "ambience":       to_spans(amb_labels),
        "music":          to_spans(mus_labels),
        "sfx":            to_spans(sfx_labels),
        "vintage_filter": to_spans(vf_labels or []),
    }
    return TimelineData(
        tag=tag,
        total_duration_s=total_s,
        layers=layers,
        sections=to_spans(section_bands or []),
        scenes=to_spans(scene_bands or []),
    )

render_terminal_timeline

render_terminal_timeline(data: TimelineData, width: int | None = None) -> str

Render a multi-line Unicode timeline string for terminal display.

Parameters:

data (TimelineData) –

Timeline data from `build_timeline_data`.
width (int | None, default: None ) –

Terminal width in characters. If None, it is auto-detected via `shutil.get_terminal_size`, with a fallback of 120 columns.

Returns:

str –

Multi-line string suitable for printing to stdout.
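
The two calculations that drive the terminal layout are the ruler interval and the mapping from seconds to character columns. The sketch below isolates both; the function names `ruler_interval` and `span_columns` are hypothetical and exist only for this example, while the thresholds and clamping follow the source listing.

```python
def ruler_interval(total_s: float) -> int:
    """Tick spacing in seconds: 30 up to 3 min, 60 up to 10 min, else 120."""
    if total_s <= 180:
        return 30
    if total_s <= 600:
        return 60
    return 120


def span_columns(start_s: float, end_s: float, total_s: float, track_width: int):
    """Scale a span to [col_start, col_end) and keep it at least one column wide."""
    col_start = int(start_s / total_s * track_width)
    col_end = int(end_s / total_s * track_width)
    col_start = max(0, min(col_start, track_width - 1))
    col_end = max(col_start + 1, min(col_end, track_width))
    return col_start, col_end


print(ruler_interval(400.0))                      # 60
print(span_columns(100.0, 200.0, 400.0, 100))     # (25, 50)
print(span_columns(100.0, 100.5, 400.0, 100))     # (25, 26)
print(span_columns(400.0, 400.0, 400.0, 100))     # (99, 100)
```

The clamping means that very short cues still occupy one visible column, and a span that starts exactly at the episode end is drawn in the last column instead of falling off the track.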

Source code in src/xil_pipeline/timeline_viz.py

def render_terminal_timeline(data: TimelineData, width: int | None = None) -> str:
    """Render a multi-line Unicode timeline string for terminal display.

    Args:
        data: Timeline data from :func:`build_timeline_data`.
        width: Terminal width in characters.  If ``None``, auto-detected
            via :func:`shutil.get_terminal_size`.

    Returns:
        Multi-line string suitable for printing to stdout.
    """
    if width is None:
        width = shutil.get_terminal_size((120, 24)).columns

    total_s = data.total_duration_s
    if total_s <= 0:
        return f"--- Timeline: {data.tag} (0:00) ---\n  (no audio)\n"

    # Layout constants
    label_col = 12  # width of "  DIALOGUE  " left column
    track_width = max(width - label_col - 2, 20)

    # Choose ruler interval: 30s up to 3 min, 60s up to 10 min, 120s beyond
    if total_s <= 180:
        interval = 30
    elif total_s <= 600:
        interval = 60
    else:
        interval = 120

    lines = []
    lines.append(f"--- Timeline: {data.tag} ({_format_time(total_s)}) ---")
    lines.append("")

    # ── Time ruler ──
    ruler_line = " " * label_col
    num_ticks = int(total_s // interval) + 1
    for i in range(num_ticks):
        t = i * interval
        col = int(t / total_s * track_width) if total_s > 0 else 0
        if col >= track_width:
            break
        time_str = _format_time(t)
        # Place time label at col position
        pad = col - (len(ruler_line) - label_col)
        if pad > 0:
            ruler_line += " " * pad
        ruler_line += time_str

    # Tick marks line
    tick_chars = [" "] * track_width
    for i in range(num_ticks):
        t = i * interval
        col = int(t / total_s * track_width) if total_s > 0 else 0
        if col >= track_width:
            break
        if i == 0:
            tick_chars[col] = "├"
        elif col == track_width - 1:
            tick_chars[col] = "┤"
        else:
            tick_chars[col] = "┼"
    # Fill between ticks with ─
    for idx in range(track_width):
        if tick_chars[idx] == " ":
            tick_chars[idx] = "─"

    lines.append(ruler_line)
    lines.append(" " * label_col + "".join(tick_chars))
    lines.append("")

    # ── Layer rendering ──
    layer_config = [
        ("dialogue",       "DIALOGUE",       "█"),
        ("ambience",       "AMBIENCE",       "▓"),
        ("music",          "MUSIC",          "█"),
        ("sfx",            "SFX",            "█"),
        ("vintage_filter", "VTG FILTER",     "▒"),
    ]

    for layer_key, layer_name, fill_char in layer_config:
        spans = data.layers.get(layer_key, [])
        if not spans:
            continue

        # Build the bar row
        bar = [" "] * track_width
        label_positions: list[tuple[int, str]] = []

        for span in spans:
            col_start = int(span.start_s / total_s * track_width)
            col_end = int(span.end_s / total_s * track_width)
            col_start = max(0, min(col_start, track_width - 1))
            col_end = max(col_start + 1, min(col_end, track_width))

            # SFX items at most one column wide and shorter than 1.5 s get a dot
            if col_end - col_start <= 1 and layer_key == "sfx":
                char = "·" if span.end_s - span.start_s < 1.5 else fill_char
            else:
                char = fill_char

            for c in range(col_start, col_end):
                bar[c] = char

            # Truncate label to fit
            label = span.label
            if len(label) > 12:
                label = label[:11] + "…"
            label_positions.append((col_start, label))

        # Build label row
        label_row = [" "] * track_width
        for col, lbl in label_positions:
            end = min(col + len(lbl), track_width)
            # Don't overwrite existing labels
            if all(label_row[i] == " " for i in range(col, end)):
                for i, ch in enumerate(lbl):
                    if col + i < track_width:
                        label_row[col + i] = ch

        # Format output
        name_padded = f"  {layer_name:<{label_col - 2}}"
        lines.append(name_padded + "".join(bar))
        lines.append(" " * label_col + "".join(label_row))
        lines.append("")

    return "\n".join(lines)

render_text_timeline_map

render_text_timeline_map(data: TimelineData, output_path: str, *, slug: str = '') -> str

Write a human-readable cue sheet of the episode's foreground timing.

Dialogue and SFX spans are interleaved chronologically; music and ambience are omitted because they are background layers that do not move the foreground timeline.

Parameters:

data (TimelineData) –

Timeline data from `build_timeline_data`.
output_path (str) –

Path to write the text map.
slug (str, default: '' ) –

Show slug for the header (omitted when empty).

Returns:

str –

The path written (same as output_path).
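
The interleaving order can be illustrated with a small self-contained sketch. It sorts by start time and then by sequence number, treating a missing seq as 0, which matches the sort key in the source. The `Cue` tuple and the sample cues are assumptions for the example; the real function works on `LayerSpan` objects and identifies the layer by object identity instead of a stored field.

```python
from typing import NamedTuple, Optional


class Cue(NamedTuple):
    """Hypothetical, simplified stand-in for a foreground span."""

    start_s: float
    seq: Optional[int]
    layer: str
    label: str


cues = [
    Cue(5.0, 3, "DLG", "ALICE"),
    Cue(5.0, None, "SFX", "door"),
    Cue(1.0, 1, "DLG", "BOB"),
    Cue(5.0, 2, "SFX", "bell"),
]

# Primary key: start time. Tie-breaker: seq, with None treated as 0.
ordered = sorted(cues, key=lambda c: (c.start_s, c.seq if c.seq is not None else 0))
labels = [c.label for c in ordered]

# Sequence column as in the cue sheet: zero-padded, or blank when unset.
seq_cols = [f"#{c.seq:03d}" if c.seq is not None else "    " for c in ordered]

print(labels)    # ['BOB', 'door', 'bell', 'ALICE']
print(seq_cols)  # ['#001', '    ', '#002', '#003']
```

A cue without a sequence number therefore sorts ahead of numbered cues that share its start time.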

Source code in src/xil_pipeline/timeline_viz.py

def render_text_timeline_map(
    data: TimelineData,
    output_path: str,
    *,
    slug: str = "",
) -> str:
    """Write a human-readable cue sheet of the episode's foreground timing.

    Dialogue and SFX spans are interleaved chronologically; music and ambience
    are omitted — they are background layers that do not move the
    foreground timeline.

    Args:
        data: Timeline data from :func:`build_timeline_data`.
        output_path: Path to write the text map.
        slug: Show slug for the header (omitted when empty).

    Returns:
        The path written (same as *output_path*).
    """
    spans = list(data.layers.get("dialogue", [])) + list(data.layers.get("sfx", []))
    spans.sort(key=lambda sp: (sp.start_s, sp.seq if sp.seq is not None else 0))
    dialogue_ids = {id(sp) for sp in data.layers.get("dialogue", [])}

    show_part = f" — {slug}" if slug else ""
    lines = [
        f"# Timeline map: {data.tag}{show_part} ({_format_time(data.total_duration_s)})",
        "# dialogue + SFX foreground timing; music/ambience omitted",
        "#",
        "#  START      END       LAYER  SEQ   WHO/WHAT",
    ]
    for sp in spans:
        layer = "DLG" if id(sp) in dialogue_ids else "SFX"
        seq = f"#{sp.seq:03d}" if sp.seq is not None else "    "
        who = sp.label
        if layer == "DLG" and sp.snippet:
            who = f"{sp.label}  “{sp.snippet}…”"
        lines.append(
            f" {_fmt_mmss_tenths(sp.start_s)} – {_fmt_mmss_tenths(sp.end_s)}   "
            f"{layer}   {seq}  {who}"
        )

    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return output_path

render_html_timeline

render_html_timeline(data: TimelineData, output_path: str, stems_dir: str | None = None, *, slug: str = '', tag: str = '', layers_dir: str | None = None) -> str

Write a self-contained HTML timeline file.

Parameters:

data (TimelineData) –

Timeline data from `build_timeline_data`.
output_path (str) –

Path to write the HTML file.
stems_dir (str | None, default: None ) –

Directory of episode stem MP3 files. When provided, clicking a timeline block plays the corresponding stem via an embedded audio player (served by Gradio's /gradio_api/file= endpoint).
slug (str, default: '' ) –

Show slug — embedded as XIL_SLUG JS constant for the right-click sound profile editor.
tag (str, default: '' ) –

Episode tag — embedded as XIL_TAG JS constant.
layers_dir (str | None, default: None ) –

Directory of the DAW layer WAVs ({tag}_layer_{key}.wav). Existing layer files are embedded as LAYER_AUDIO (with an mtime cache-buster) to power the full-mix transport; when absent the transport UI is hidden.

Returns:

str –

The path written (same as output_path).
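
Click-to-play depends on matching each stem file to a span's seq value. The sketch below isolates that filename rule using the same regular expression as the source: a numeric prefix followed by an underscore gives the sequence number, and a leading "n" marks it as negative. The helper name `seq_from_stem` and the sample filenames are hypothetical.

```python
import re

# Same pattern as the source: optional "n", digits, then an underscore.
SEQ_RE = re.compile(r"^(n?)(\d+)_")


def seq_from_stem(fname: str):
    """Return the sequence number encoded in a stem filename, or None."""
    if not fname.endswith(".mp3"):
        return None
    m = SEQ_RE.match(fname)
    if not m:
        return None
    number = int(m.group(2))
    return -number if m.group(1) == "n" else number


print(seq_from_stem("012_alice.mp3"))   # 12
print(seq_from_stem("n003_intro.mp3"))  # -3
print(seq_from_stem("theme.mp3"))       # None (no numeric prefix)
print(seq_from_stem("012_alice.wav"))   # None (not an MP3)
```

Files that do not follow the prefix convention are skipped, so spans without a matching stem have no audio clip to play.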

Source code in src/xil_pipeline/timeline_viz.py

def render_html_timeline(
    data: TimelineData,
    output_path: str,
    stems_dir: str | None = None,
    *,
    slug: str = "",
    tag: str = "",
    layers_dir: str | None = None,
) -> str:
    """Write a self-contained HTML timeline file.

    Args:
        data: Timeline data from :func:`build_timeline_data`.
        output_path: Path to write the HTML file.
        stems_dir: Directory of episode stem MP3 files. When provided, clicking
            a timeline block plays the corresponding stem via an embedded audio
            player (served by Gradio's ``/gradio_api/file=`` endpoint).
        slug: Show slug — embedded as ``XIL_SLUG`` JS constant for the
            right-click sound profile editor.
        tag: Episode tag — embedded as ``XIL_TAG`` JS constant.
        layers_dir: Directory of the DAW layer WAVs
            (``{tag}_layer_{key}.wav``). Existing layer files are embedded as
            ``LAYER_AUDIO`` (with an mtime cache-buster) to power the
            full-mix transport; when absent the transport UI is hidden.

    Returns:
        The path written (same as *output_path*).
    """
    # Audio URLs are stored RELATIVE to the timeline file's own directory so
    # the artifact is root-agnostic: the same file works whether the GUI
    # serves it from a local root or a NAS mount (the iframe loads it via
    # /gradio_api/file=<abs timeline path>, and the browser resolves relative
    # refs against that document URL). A ?v={mtime} cache-buster still keys the
    # browser blob cache so a regenerated asset self-invalidates.
    out_dir = os.path.dirname(os.path.abspath(output_path))

    def _rel_audio(full_abs: str) -> str:
        try:
            rel = os.path.relpath(full_abs, out_dir)
        except ValueError:
            rel = full_abs  # different drive (Windows) — fall back to absolute
        rel = rel.replace(os.sep, "/")
        return f"{rel}?v={int(os.path.getmtime(full_abs))}"

    # Build seq → relative path mapping for click-to-play
    import re as _re
    _seq_re = _re.compile(r"^(n?)(\d+)_")
    clips: dict[str, str] = {}
    if stems_dir and os.path.isdir(stems_dir):
        for fname in sorted(os.listdir(stems_dir)):
            if not fname.endswith(".mp3"):
                continue
            m = _seq_re.match(fname)
            if m:
                seq = -int(m.group(2)) if m.group(1) == "n" else int(m.group(2))
                full = os.path.abspath(os.path.join(stems_dir, fname))
                clips[str(seq)] = _rel_audio(full)
    clips_json = json.dumps(clips)

    # Build JSON-serializable structure
    json_data = {
        "tag": data.tag,
        "total_duration_s": data.total_duration_s,
        "layers": {
            key: [
                {
                    "start_s": sp.start_s,
                    "end_s": sp.end_s,
                    "label": sp.label,
                    "ramp_in_s": sp.ramp_in_s,
                    "ramp_out_s": sp.ramp_out_s,
                    "play_duration": sp.play_duration,
                    "snippet": sp.snippet,
                    "volume_pct": sp.volume_pct,
                    "seq": sp.seq,
                    "tts_model": sp.tts_model,
                }
                for sp in spans
            ]
            for key, spans in data.layers.items()
        },
        # Structure bands are deliberately separate from "layers" so they are
        # not counted as assets and don't need COLORS/LABELS entries.
        "sections": [
            {"start_s": sp.start_s, "end_s": sp.end_s, "label": sp.label}
            for sp in data.sections
        ],
        "scenes": [
            {"start_s": sp.start_s, "end_s": sp.end_s, "label": sp.label}
            for sp in data.scenes
        ],
    }

    span_count = sum(len(spans) for spans in data.layers.values())
    slug_js = f"const XIL_SLUG = {json.dumps(slug or data.tag)};\nconst XIL_TAG  = {json.dumps(tag or data.tag)};"

    # Full-mix transport: embed existing DAW layer WAVs with an mtime
    # cache-buster so a regenerated mix is never served from browser cache.
    layer_audio: dict[str, str] = {}
    if layers_dir and os.path.isdir(layers_dir):
        for key in ("dialogue", "sfx", "music", "ambience", "vintage_filter"):
            wav = os.path.join(layers_dir, f"{data.tag}_layer_{key}.wav")
            if os.path.exists(wav):
                layer_audio[key] = _rel_audio(os.path.abspath(wav))

    content = _HTML_TEMPLATE.format(
        tag=html.escape(data.tag),
        duration_fmt=_format_time(data.total_duration_s),
        span_count=span_count,
        data_json=json.dumps(json_data),
        clips_json=clips_json,
        generated_at=datetime.now().strftime("%Y-%m-%d %H:%M"),
        slug_js=slug_js,
        layer_audio_json=json.dumps(layer_audio),
        modal_css=_MODAL_CSS,
        modal_html=_MODAL_HTML,
        modal_js=_MODAL_JS,
        transport_css=_TRANSPORT_CSS,
        transport_html=_TRANSPORT_HTML,
        transport_js=_TRANSPORT_JS,
        loader_css=_LOADER_CSS,
        loader_html=_LOADER_HTML,
        loader_js=_LOADER_JS,
    )

    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(content)

    return output_path
