Timeline Viz

src.xil_pipeline.timeline_viz

Multitrack timeline visualization for the audio pipeline.

Renders a visual representation of asset placement across the four audio layers (dialogue, ambience, music, SFX), plus an optional vintage-filter layer. Two output formats are supported:

  • Terminal timeline: a Unicode block and box-drawing rendering for stdout, auto-scaled to terminal width.
  • HTML interactive timeline: a self-contained file with hover tooltips and zoom.

No pydub dependency — consumes label tuples only.

Usage (from XILP005):

    python XILP005_daw_export.py --episode S02E03 --timeline
    python XILP005_daw_export.py --episode S02E03 --timeline-html

LayerSpan dataclass

A single asset placement on the timeline.

Attributes:

  • start_s (float) –

    Start time in seconds.

  • end_s (float) –

    End time in seconds.

  • label (str) –

    Human-readable label (speaker name, SFX text, etc.).

  • ramp_in_s (float | None) –

    Fade-in duration in seconds, or None if not set.

  • ramp_out_s (float | None) –

    Fade-out duration in seconds, or None if not set.

  • play_duration (float | None) –

    Percentage of file to play, or None if not set.

  • snippet (str | None) –

    First 5 words of dialogue text for HTML tooltip, or None.

  • volume_pct (float | None) –

    Volume percentage (100 = unity), or None if not set.

  • seq (int | None) –

    Sequence number from the parsed script, or None.

  • tts_model (str | None) –

    TTS model identifier for the span, or None if not set.

Source code in src/xil_pipeline/timeline_viz.py
@dataclass
class LayerSpan:
    """A single asset placement on the timeline.

    Attributes:
        start_s: Start time in seconds.
        end_s: End time in seconds.
        label: Human-readable label (speaker name, SFX text, etc.).
        ramp_in_s: Fade-in duration in seconds, or ``None`` if not set.
        ramp_out_s: Fade-out duration in seconds, or ``None`` if not set.
        play_duration: Percentage of file to play, or ``None`` if not set.
        snippet: First 5 words of dialogue text for HTML tooltip, or ``None``.
        volume_pct: Volume percentage (100 = unity), or ``None`` if not set.
        seq: Sequence number from the parsed script, or ``None``.
    """

    start_s: float
    end_s: float
    label: str
    ramp_in_s: float | None = None
    ramp_out_s: float | None = None
    play_duration: float | None = None
    snippet: str | None = None
    volume_pct: float | None = None
    seq: int | None = None
    tts_model: str | None = None

start_s instance-attribute

start_s: float

end_s instance-attribute

end_s: float

label instance-attribute

label: str

ramp_in_s class-attribute instance-attribute

ramp_in_s: float | None = None

ramp_out_s class-attribute instance-attribute

ramp_out_s: float | None = None

play_duration class-attribute instance-attribute

play_duration: float | None = None

snippet class-attribute instance-attribute

snippet: str | None = None

volume_pct class-attribute instance-attribute

volume_pct: float | None = None

seq class-attribute instance-attribute

seq: int | None = None

tts_model class-attribute instance-attribute

tts_model: str | None = None

__init__

__init__(start_s: float, end_s: float, label: str, ramp_in_s: float | None = None, ramp_out_s: float | None = None, play_duration: float | None = None, snippet: str | None = None, volume_pct: float | None = None, seq: int | None = None, tts_model: str | None = None) -> None
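
As a rough illustration of the constructor above, the sketch below builds two spans. To keep it runnable on its own it declares a stand-in copy of the dataclass rather than importing the real module, and the span values (labels, times, snippet text) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


# Stand-in copy of the documented dataclass so the snippet runs by itself;
# in the real project the class comes from the module documented here.
@dataclass
class LayerSpan:
    start_s: float
    end_s: float
    label: str
    ramp_in_s: float | None = None
    ramp_out_s: float | None = None
    play_duration: float | None = None
    snippet: str | None = None
    volume_pct: float | None = None
    seq: int | None = None
    tts_model: str | None = None


# Hypothetical values, chosen only for illustration.
music = LayerSpan(start_s=0.0, end_s=12.5, label="Theme", ramp_in_s=1.0, ramp_out_s=2.0)
line = LayerSpan(12.5, 16.0, "NARRATOR", snippet="It was a dark night", seq=4)

duration = music.end_s - music.start_s
print(duration)        # 12.5
print(line.ramp_in_s)  # None (optional fields default to None)
```

Only the first three fields are required, so keyword arguments are the safer way to set the optional ones.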

TimelineData dataclass

Complete timeline data for all four layers.

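Attributes:

  • tag (str) –

    Episode tag (e.g. "S02E03").

  • total_duration_s (float) –

    Total episode duration in seconds.

  • layers (dict[str, list[LayerSpan]]) –

    Mapping of layer name to list of LayerSpan instances.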

Source code in src/xil_pipeline/timeline_viz.py
@dataclass
class TimelineData:
    """Complete timeline data for all four layers.

    Attributes:
        tag: Episode tag (e.g. ``"S02E03"``).
        total_duration_s: Total episode duration in seconds.
        layers: Mapping of layer name to list of :class:`LayerSpan` instances.
    """

    tag: str
    total_duration_s: float
    layers: dict[str, list[LayerSpan]] = field(default_factory=dict)

tag instance-attribute

tag: str

total_duration_s instance-attribute

total_duration_s: float

layers class-attribute instance-attribute

layers: dict[str, list[LayerSpan]] = field(default_factory=dict)

__init__

__init__(tag: str, total_duration_s: float, layers: dict[str, list[LayerSpan]] = dict()) -> None
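
The rendered signature shows `layers = dict()`, but the source declares the field with `field(default_factory=dict)`, so each instance should get its own fresh dictionary. The sketch below checks that behaviour using a stand-in copy of the dataclass (the element type of the lists is simplified, and the episode tags are hypothetical).

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Stand-in copy of the documented dataclass; list elements would be
# LayerSpan instances in the real module.
@dataclass
class TimelineData:
    tag: str
    total_duration_s: float
    layers: dict[str, list] = field(default_factory=dict)


a = TimelineData("S02E03", 90.0)
b = TimelineData("S02E04", 120.0)

# Mutating one instance's layers must not leak into the other.
a.layers["music"] = []
print(b.layers)              # {}
print(a.layers is b.layers)  # False
```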

build_timeline_data

build_timeline_data(tag: str, total_s: float, dlg_labels: list, amb_labels: list, mus_labels: list, sfx_labels: list, vf_labels: list | None = None) -> TimelineData

Wrap the layer label lists into a TimelineData object.

Label tuples may be 3-element (start_s, end_s, text), 5-element (start_s, end_s, text, ramp_in_s, ramp_out_s), 6-element (start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration), or 7-element (start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration, snippet). Longer tuples are also accepted: elements 8, 9, and 10, when present, are read as volume_pct, seq, and tts_model. Any missing trailing element defaults to None.

Parameters:

  • tag (str) –

    Episode tag.

  • total_s (float) –

    Total episode duration in seconds.

  • dlg_labels (list) –

    Dialogue label 7-tuples (start_s, end_s, speaker, None, None, None, snippet).

  • amb_labels (list) –

    Ambience label tuples (may carry ramp data).

  • mus_labels (list) –

    Music label tuples (may carry ramp data).

  • sfx_labels (list) –

    SFX label tuples.

  • vf_labels (list | None, default: None ) –

    Vintage filter label tuples (may carry ramp data).

Returns:

  • TimelineData

    A populated TimelineData instance.
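
The variable-length tuple handling can be pictured as padding each label tuple with None up to the full field count. The sketch below is an equivalent formulation, not the module's own code: `FIELDS` and `label_to_dict` are hypothetical names, and the explicit `ValueError` for tuples shorter than three elements is an assumption added here for clarity.

```python
# Field order follows the positional order that the label tuples use.
FIELDS = (
    "start_s", "end_s", "label", "ramp_in_s", "ramp_out_s",
    "play_duration", "snippet", "volume_pct", "seq", "tts_model",
)


def label_to_dict(tup):
    """Pad a label tuple with None and map it onto the field names."""
    if len(tup) < 3:
        # Assumption: reject tuples lacking the three required elements.
        raise ValueError("label tuples need at least (start_s, end_s, text)")
    # Append enough None values, then cut back to the field count so that
    # over-long tuples are truncated instead of raising.
    padded = (tuple(tup) + (None,) * len(FIELDS))[: len(FIELDS)]
    return dict(zip(FIELDS, padded))


# Hypothetical labels for illustration.
sfx = label_to_dict((3.0, 3.4, "door slam"))
amb = label_to_dict((0.0, 30.0, "rain", 2.0, 4.0))

print(sfx["ramp_in_s"])      # None
print(amb["ramp_out_s"])     # 4.0
print(amb["play_duration"])  # None
```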

Source code in src/xil_pipeline/timeline_viz.py
def build_timeline_data(
    tag: str,
    total_s: float,
    dlg_labels: list,
    amb_labels: list,
    mus_labels: list,
    sfx_labels: list,
    vf_labels: list | None = None,
) -> TimelineData:
    """Wrap the layer label lists into a :class:`TimelineData` object.

    Label tuples may be 3-element ``(start_s, end_s, text)``,
    5-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s)``,
    6-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration)``, or
    7-element ``(start_s, end_s, text, ramp_in_s, ramp_out_s, play_duration, snippet)``.

    Args:
        tag: Episode tag.
        total_s: Total episode duration in seconds.
        dlg_labels: Dialogue label 7-tuples ``(start_s, end_s, speaker, None, None, None, snippet)``.
        amb_labels: Ambience label tuples (may carry ramp data).
        mus_labels: Music label tuples (may carry ramp data).
        sfx_labels: SFX label tuples.
        vf_labels: Vintage filter label tuples (may carry ramp data).

    Returns:
        A populated :class:`TimelineData` instance.
    """
    def to_spans(labels):
        spans = []
        for tup in labels:
            s, e, t = tup[0], tup[1], tup[2]
            ri = tup[3] if len(tup) > 3 else None
            ro = tup[4] if len(tup) > 4 else None
            pd = tup[5] if len(tup) > 5 else None
            sn = tup[6] if len(tup) > 6 else None
            vp = tup[7] if len(tup) > 7 else None
            sq = tup[8] if len(tup) > 8 else None
            tm = tup[9] if len(tup) > 9 else None
            spans.append(LayerSpan(s, e, t, ri, ro, pd, sn, vp, sq, tm))
        return spans

    layers = {
        "dialogue":       to_spans(dlg_labels),
        "ambience":       to_spans(amb_labels),
        "music":          to_spans(mus_labels),
        "sfx":            to_spans(sfx_labels),
        "vintage_filter": to_spans(vf_labels or []),
    }
    return TimelineData(tag=tag, total_duration_s=total_s, layers=layers)

render_terminal_timeline

render_terminal_timeline(data: TimelineData, width: int | None = None) -> str

Render a multi-line Unicode timeline string for terminal display.

Parameters:

  • data (TimelineData) –

    Timeline data from build_timeline_data.

  • width (int | None, default: None ) –

    Terminal width in characters. If None, auto-detected via shutil.get_terminal_size, falling back to 120 columns.

Returns:

  • str

    Multi-line string suitable for printing to stdout.
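
Two pieces of arithmetic drive the terminal layout: choosing the ruler interval from the episode length, and mapping a span's start and end times onto character columns. The sketch below isolates both, following the logic in the source listing; the helper names `ruler_interval` and `span_columns` are hypothetical and do not exist in the module.

```python
def ruler_interval(total_s: float) -> int:
    """Tick spacing in seconds: 30 up to 3 min, 60 up to 10 min, else 120."""
    if total_s <= 180:
        return 30
    if total_s <= 600:
        return 60
    return 120


def span_columns(start_s: float, end_s: float, total_s: float, track_width: int):
    """Map a time span to a half-open column range [col_start, col_end)."""
    col_start = int(start_s / total_s * track_width)
    col_end = int(end_s / total_s * track_width)
    # Clamp to the track, and make every span at least one column wide.
    col_start = max(0, min(col_start, track_width - 1))
    col_end = max(col_start + 1, min(col_end, track_width))
    return col_start, col_end


print(ruler_interval(200))                # 60
print(span_columns(50, 100, 200, 100))    # (25, 50)
print(span_columns(50, 50.5, 200, 100))   # (25, 26)
```

The last call shows why very short SFX still appear: a span narrower than one column is widened to exactly one column.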

Source code in src/xil_pipeline/timeline_viz.py
def render_terminal_timeline(data: TimelineData, width: int | None = None) -> str:
    """Render a multi-line Unicode timeline string for terminal display.

    Args:
        data: Timeline data from :func:`build_timeline_data`.
        width: Terminal width in characters.  If ``None``, auto-detected
            via :func:`shutil.get_terminal_size`.

    Returns:
        Multi-line string suitable for printing to stdout.
    """
    if width is None:
        width = shutil.get_terminal_size((120, 24)).columns

    total_s = data.total_duration_s
    if total_s <= 0:
        return f"--- Timeline: {data.tag} (0:00) ---\n  (no audio)\n"

    # Layout constants
    label_col = 12  # width of "  DIALOGUE  " left column
    track_width = max(width - label_col - 2, 20)

    # Choose ruler interval: 30s up to 3 min, 60s up to 10 min, 120s beyond
    if total_s <= 180:
        interval = 30
    elif total_s <= 600:
        interval = 60
    else:
        interval = 120

    lines = []
    lines.append(f"--- Timeline: {data.tag} ({_format_time(total_s)}) ---")
    lines.append("")

    # ── Time ruler ──
    ruler_line = " " * label_col
    num_ticks = int(total_s // interval) + 1
    for i in range(num_ticks):
        t = i * interval
        col = int(t / total_s * track_width) if total_s > 0 else 0
        if col >= track_width:
            break
        time_str = _format_time(t)
        # Place time label at col position
        pad = col - (len(ruler_line) - label_col)
        if pad > 0:
            ruler_line += " " * pad
        ruler_line += time_str

    # Tick marks line
    tick_chars = [" "] * track_width
    for i in range(num_ticks):
        t = i * interval
        col = int(t / total_s * track_width) if total_s > 0 else 0
        if col >= track_width:
            break
        if i == 0:
            tick_chars[col] = "├"
        elif col == track_width - 1:
            tick_chars[col] = "┤"
        else:
            tick_chars[col] = "┼"
    # Fill between ticks with ─
    for idx in range(track_width):
        if tick_chars[idx] == " ":
            tick_chars[idx] = "─"

    lines.append(ruler_line)
    lines.append(" " * label_col + "".join(tick_chars))
    lines.append("")

    # ── Layer rendering ──
    layer_config = [
        ("dialogue",       "DIALOGUE",       "█"),
        ("ambience",       "AMBIENCE",       "▓"),
        ("music",          "MUSIC",          "█"),
        ("sfx",            "SFX",            "█"),
        ("vintage_filter", "VTG FILTER",     "▒"),
    ]

    for layer_key, layer_name, fill_char in layer_config:
        spans = data.layers.get(layer_key, [])
        if not spans:
            continue

        # Build the bar row
        bar = [" "] * track_width
        label_positions: list[tuple[int, str]] = []

        for span in spans:
            col_start = int(span.start_s / total_s * track_width)
            col_end = int(span.end_s / total_s * track_width)
            col_start = max(0, min(col_start, track_width - 1))
            col_end = max(col_start + 1, min(col_end, track_width))

            # Short items (< 1 col) get a dot for SFX/BEAT
            if col_end - col_start <= 1 and layer_key == "sfx":
                char = "·" if span.end_s - span.start_s < 1.5 else fill_char
            else:
                char = fill_char

            for c in range(col_start, col_end):
                bar[c] = char

            # Truncate label to fit
            label = span.label
            if len(label) > 12:
                label = label[:11] + "…"
            label_positions.append((col_start, label))

        # Build label row
        label_row = [" "] * track_width
        for col, lbl in label_positions:
            end = min(col + len(lbl), track_width)
            # Don't overwrite existing labels
            if all(label_row[i] == " " for i in range(col, end)):
                for i, ch in enumerate(lbl):
                    if col + i < track_width:
                        label_row[col + i] = ch

        # Format output
        name_padded = f"  {layer_name:<{label_col - 2}}"
        lines.append(name_padded + "".join(bar))
        lines.append(" " * label_col + "".join(label_row))
        lines.append("")

    return "\n".join(lines)

render_html_timeline

render_html_timeline(data: TimelineData, output_path: str, stems_dir: str | None = None) -> str

Write a self-contained HTML timeline file.

Parameters:

  • data (TimelineData) –

    Timeline data from build_timeline_data.

  • output_path (str) –

    Path to write the HTML file.

  • stems_dir (str | None, default: None ) –

    Directory of episode stem MP3 files. When provided, clicking a timeline block plays the corresponding stem via an embedded audio player (served by Gradio's /gradio_api/file= endpoint).

Returns:

  • str

    The path written (same as output_path).
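
Click-to-play relies on recovering a sequence number from each stem filename: a leading run of digits followed by an underscore, with an optional `n` prefix that marks a negative sequence. The sketch below reproduces that matching rule in isolation; `seq_from_stem` is a hypothetical helper and the filenames are made up for illustration.

```python
from __future__ import annotations

import re

# Same pattern as the source listing: optional "n", digits, underscore.
_SEQ_RE = re.compile(r"^(n?)(\d+)_")


def seq_from_stem(fname: str) -> int | None:
    """Return the sequence number encoded in an MP3 stem name, else None."""
    if not fname.endswith(".mp3"):
        return None
    m = _SEQ_RE.match(fname)
    if not m:
        return None
    n = int(m.group(2))
    return -n if m.group(1) == "n" else n


print(seq_from_stem("012_narrator.mp3"))  # 12
print(seq_from_stem("n003_intro.mp3"))    # -3
print(seq_from_stem("theme.mp3"))         # None
print(seq_from_stem("007_door.wav"))      # None
```

Files that do not match are simply skipped, so a stems directory can hold unrelated audio without breaking the timeline.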

Source code in src/xil_pipeline/timeline_viz.py
def render_html_timeline(
    data: TimelineData,
    output_path: str,
    stems_dir: str | None = None,
) -> str:
    """Write a self-contained HTML timeline file.

    Args:
        data: Timeline data from :func:`build_timeline_data`.
        output_path: Path to write the HTML file.
        stems_dir: Directory of episode stem MP3 files. When provided, clicking
            a timeline block plays the corresponding stem via an embedded audio
            player (served by Gradio's ``/gradio_api/file=`` endpoint).

    Returns:
        The path written (same as *output_path*).
    """
    # Build seq → absolute path mapping for click-to-play
    import re as _re
    _seq_re = _re.compile(r"^(n?)(\d+)_")
    clips: dict[str, str] = {}
    if stems_dir and os.path.isdir(stems_dir):
        for fname in sorted(os.listdir(stems_dir)):
            if not fname.endswith(".mp3"):
                continue
            m = _seq_re.match(fname)
            if m:
                seq = -int(m.group(2)) if m.group(1) == "n" else int(m.group(2))
                clips[str(seq)] = os.path.abspath(os.path.join(stems_dir, fname))
    clips_json = json.dumps(clips)

    # Build JSON-serializable structure
    json_data = {
        "tag": data.tag,
        "total_duration_s": data.total_duration_s,
        "layers": {
            key: [
                {
                    "start_s": sp.start_s,
                    "end_s": sp.end_s,
                    "label": sp.label,
                    "ramp_in_s": sp.ramp_in_s,
                    "ramp_out_s": sp.ramp_out_s,
                    "play_duration": sp.play_duration,
                    "snippet": sp.snippet,
                    "volume_pct": sp.volume_pct,
                    "seq": sp.seq,
                    "tts_model": sp.tts_model,
                }
                for sp in spans
            ]
            for key, spans in data.layers.items()
        },
    }

    span_count = sum(len(spans) for spans in data.layers.values())

    content = _HTML_TEMPLATE.format(
        tag=html.escape(data.tag),
        duration_fmt=_format_time(data.total_duration_s),
        span_count=span_count,
        data_json=json.dumps(json_data),
        clips_json=clips_json,
        generated_at=datetime.now().strftime("%Y-%m-%d %H:%M"),
    )

    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(content)

    return output_path