Tutorial

A guided tour of the open-source SANA-Streaming reference app, which demonstrates every important pattern for building on SANA-Streaming with the typed @reactor-models/sana-streaming SDK. By the end you’ll know how to edit your webcam feed in real time, edit an uploaded clip of your choice, steer the edit mid-stream, snap clips, and surface model errors.

Installation and setup

Get the example running before reading further. Every section below points back at code in the example repo. You will need:

Node.js 18+.
pnpm (the example’s lockfile is pnpm’s; npm or yarn work too).
A Reactor API key (starts with rk_).
Familiarity with the Next.js App Router.

Clone the example

The example lives alongside our other reference apps in reactor-team/js-sdk under examples/.

git clone https://github.com/reactor-team/js-sdk
cd js-sdk/examples/sana-streaming

Add your API key

Your rk_… key must never reach the browser. The example reads it server-side and mints a short-lived JWT for the client (the standard broker pattern). For now, drop the key into .env.local:

cp .env.example .env.local
# then edit .env.local and set REACTOR_API_KEY to your API key

See a “Setup Required” screen? Your REACTOR_API_KEY isn’t loaded. The check lives in app/page.tsx → app/SetupRequired.tsx.

Install dependencies and start the dev server

pnpm install
pnpm dev

Open http://localhost:3000, click Connect, allow camera access, pick a preset prompt (or type your own edit), and press Start live.

How SANA-Streaming works

Building with SANA-Streaming is different from Reactor’s other models. Helios, LingBot, and LongLive-2.0 generate video from a prompt; SANA-Streaming edits the video you bring. You open a long-lived connection, give the model a source (your webcam or an uploaded clip) and an edit instruction, and it streams back transformed frames in 24-frame chunks, one every ~1-1.5s. Re-prompt at any time and the new edit lands at the next chunk boundary, with no re-render and no break in the stream.

Opening the connection isn’t instant. Reactor provisions a GPU for your session, so the client moves through four states before media starts flowing:

Connection lifecycle: disconnected → connecting → waiting → ready
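As a rough illustration of the lifecycle above, a client can map each state to a label and hold commands back until the session is ready. This is a sketch under assumptions: the ConnectionState type, the label wording, and both helpers are hypothetical and are not taken from StatusBadge.tsx. Only the four state names come from the lifecycle line.

```typescript
// The four connection states named in the lifecycle above.
type ConnectionState = "disconnected" | "connecting" | "waiting" | "ready";

// Hypothetical badge labels; the wording is illustrative only.
const CONNECTION_LABELS: Record<ConnectionState, string> = {
  disconnected: "Disconnected",
  connecting: "Connecting",
  waiting: "Waiting for GPU",
  ready: "Ready",
};

// Commands are only worth sending once the session is ready.
function canSendCommands(state: ConnectionState): boolean {
  return state === "ready";
}

// One toggle serves both directions: offer Connect only when fully
// disconnected, and Disconnect in every other state.
function toggleLabel(state: ConnectionState): "Connect" | "Disconnect" {
  return state === "disconnected" ? "Connect" : "Disconnect";
}
```

Typing the label table as a Record over the union means the compiler flags a missing entry if a state is ever added.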

The waiting state is when the GPU is being assigned, which takes a few seconds. Once the status reaches ready, commands take effect and the session lifecycle begins in its idle state. StatusBadge.tsx surfaces every connection state with a label and a Connect / Disconnect toggle. See Sessions for the full breakdown.

Two properties of the API are worth internalizing before you read on:

Commands are asynchronous; messages are the source of truth. Calling setVideo doesn’t mean the model has a source yet; the model confirms with a video_accepted message and a state snapshot whose has_video flips to true.
Errors arrive out-of-band. A broken precondition like start with no source surfaces later as a command_error message (useSanaStreamingCommandError), not a thrown exception.
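Both properties can be seen in a small, SDK-free sketch that folds incoming messages into the client’s view of the source. The message shapes below are assumptions made for illustration (in particular the type discriminator); the real wire shapes are in the schema. SourceStatus and nextSourceStatus are hypothetical names, not part of the example app.

```typescript
// Hypothetical message shapes; the `type` discriminator is an assumption.
type SketchMessage =
  | { type: "video_accepted" }
  | { type: "state"; has_video: boolean }
  | { type: "command_error"; command: string; reason: string };

// What the client may claim about the model's source at any moment.
type SourceStatus = "none" | "requested" | "confirmed" | "rejected";

// Folds one incoming message into the client's view of the source.
function nextSourceStatus(current: SourceStatus, msg: SketchMessage): SourceStatus {
  switch (msg.type) {
    case "state":
      // The snapshot is canonical: has_video true confirms the source, and
      // has_video false withdraws an earlier confirmation.
      if (msg.has_video) return "confirmed";
      return current === "confirmed" ? "none" : current;
    case "command_error":
      // Failures arrive as messages, not as thrown exceptions.
      return msg.command === "set_video" && current === "requested" ? "rejected" : current;
    default:
      // Informational messages such as video_accepted change nothing here.
      return current;
  }
}

// Sending the command only records intent; it confirms nothing.
let sourceStatus: SourceStatus = "requested";
sourceStatus = nextSourceStatus(sourceStatus, { type: "video_accepted" }); // still "requested"
sourceStatus = nextSourceStatus(sourceStatus, { type: "state", has_video: true }); // "confirmed"
```

The client never moves to confirmed on its own; only a state snapshot can do that, which is the discipline the rest of the app follows.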

The app drives the model through the typed @reactor-models/sana-streaming SDK, which wraps the base SDK with the model name and tracks baked in. SanaStreamingApp.tsx mounts <SanaStreamingProvider getJwt={fetchToken}>; components read status and call one typed method per command off useSanaStreaming() (setPrompt({ prompt }) rather than sendCommand("set_prompt", { prompt })), and subscribe to messages with per-message hooks like useSanaStreamingState. The wire shapes behind every method and message are in the schema.

Clip recording is the one exception. It is model-agnostic and not re-exported by the typed package, so SnapClip.tsx uses the base @reactor-team/js-sdk directly (useReactor, <ClipPlayer>). Drop it into any model example unchanged.

The model is the source of truth

The browser sends commands and renders the state the model reports back; it never tracks generation state on its own. That discipline lives in one reducer in app/lib/state.ts, which projects the model’s typed state messages into a small SanaState. Because it is fed only by the useSanaStreamingState hook, the reducer never has to filter by message type, and it reads the snapshot’s flat, typed fields directly:

app/lib/state.ts

import type { SanaStreamingStateMessage } from "@reactor-models/sana-streaming";

// Projects model `state` snapshots into SanaState. Returns the previous
// object when nothing changed so React can bail out of re-rendering the
// whole tree on the model's frequent identical echoes.
export function reduce(state: SanaState, msg: SanaStreamingStateMessage): SanaState {
  const next: SanaState = {
    running: msg.running,
    started: msg.started,
    paused: msg.paused,
    currentChunk: msg.current_chunk,
    // current_prompt is typed `unknown` on the wire; the model only ever
    // sends a string or null.
    currentPrompt: (msg.current_prompt as string | null) ?? null,
    hasVideo: msg.has_video,
    seed: msg.seed,
  };
  const changed = (Object.keys(next) as (keyof SanaState)[]).some((k) => next[k] !== state[k]);
  return changed ? next : state;
}
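The bail-out in the last two lines of the reducer can be tried in isolation. The sketch below pulls the comparison into a generic helper and runs it on a two-field stand-in for SanaState; keepIfUnchanged and DemoState are hypothetical names used only for this illustration.

```typescript
// Returns prev when every key of next is strictly equal to prev's value, so
// callers keep the same object reference and React can skip the re-render.
function keepIfUnchanged<T extends object>(prev: T, next: T): T {
  const keys = Object.keys(next) as (keyof T)[];
  const changed = keys.some((k) => next[k] !== prev[k]);
  return changed ? next : prev;
}

// Hypothetical two-field state, standing in for SanaState.
interface DemoState {
  running: boolean;
  currentChunk: number;
}

const before: DemoState = { running: true, currentChunk: 3 };
const echo = keepIfUnchanged(before, { running: true, currentChunk: 3 });
const advanced = keepIfUnchanged(before, { running: true, currentChunk: 4 });

console.log(echo === before); // true: identical echo, same reference
console.log(advanced === before); // false: a field changed, new object
```

The comparison is shallow and uses strict equality, so it suits a state made of flat primitive fields like this one; nested objects or arrays would need a deeper comparison.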

The Workspace shell in SanaStreamingApp.tsx subscribes with three typed message hooks: useSanaStreamingState feeds the reducer, and useSanaStreamingCommandError and useSanaStreamingGenerationReset handle the two messages that need side effects, imperatively:

app/SanaStreamingApp.tsx

const [state, setState] = useState(DEFAULT_STATE);

// The only reducer input: the typed `state` snapshot.
useSanaStreamingState((msg) => {
  setState((s) => reduce(s, msg));
});

// Transient set_video "decode failed" errors are auto-retried by FileInput;
// don't flash the banner for them.
useSanaStreamingCommandError((msg) => {
  if (!isTransientDecodeFailure(msg)) showCommandError(msg.reason);
});

useSanaStreamingGenerationReset(() => {
  // Model reset clears its source video + prompt; mirror that locally.
  setSourceUrl(null);
  setResetNonce((n) => n + 1);
  setStageCleared(true);
});

Every command can fail a precondition (start with no source, resume while not paused). The shell turns each command_error, apart from the transient decode failures filtered out above, into a banner that dismisses itself after six seconds.

Every control in the app is gated on the reduced SanaState: the file-mode Start button on state.hasVideo, the mode toggle and clip picker on state.started (the source is fixed once a run begins), and the pause/resume and reset controls on state.started and state.paused.

Informational messages (video_accepted, prompt_accepted, chunk_complete) are not state inputs; whatever they report also arrives in the next state snapshot, which is the canonical payload.
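The gating rules described above can be gathered into one pure function, which keeps them testable without rendering anything. This is a sketch, not the example’s code: the flag names are hypothetical, and the exact combinations (for instance, blocking the file-mode Start while a run is already started) are assumptions layered on the rules in the text.

```typescript
// The subset of the reduced state this sketch reads.
interface GateState {
  started: boolean;
  paused: boolean;
  hasVideo: boolean;
}

// Hypothetical flag names; each follows one gating rule from the text.
interface ControlGates {
  canStartFile: boolean;
  canChangeSource: boolean;
  canPause: boolean;
  canResume: boolean;
  canReset: boolean;
}

function deriveGates(state: GateState, ready: boolean): ControlGates {
  return {
    // File-mode Start waits for the model to report that it holds a clip.
    canStartFile: ready && state.hasVideo && !state.started,
    // The source is fixed once a run begins.
    canChangeSource: ready && !state.started,
    // Pause and resume are mutually exclusive during a run.
    canPause: ready && state.started && !state.paused,
    canResume: ready && state.started && state.paused,
    canReset: ready && state.started,
  };
}
```

Because every flag derives from model-reported fields, a control cannot be enabled on the strength of something the browser merely requested.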

command_error is one of several messages SANA-Streaming emits, each with its own typed hook. See the Messages table for the full list, including generation_started, generation_complete, and the per-chunk chunk_complete.

The same discipline shapes the commands going out. A command only takes effect once the model echoes it back in a state snapshot, so the start path doesn’t confirm anything itself. Every start flow, live or file, fires the same two methods and lets the reducer report when generation is running:

app/lib/state.ts

// Typed slice of useSanaStreaming() the start flow needs.
interface StartControls {
  setMode: (params: { mode: SanaMode }) => Promise<void>;
  start: () => Promise<void>;
}

// The start flow is always set_mode -> start. Re-sending set_mode keeps the
// flow self-contained regardless of which mode the model is in; the model
// treats a repeated set_mode as idempotent.
export async function startGeneration(model: StartControls, mode: SanaMode) {
  await model.setMode({ mode });
  await model.start();
}

Live mode: editing your webcam

Live mode is the headline feature and the app’s default. Send your webcam to SANA-Streaming by publishing your camera to the model’s camera input track, then setMode({ mode: "live" }) and start(). Edited frames come back on the main_video track about a second later. LiveInput.tsx owns the camera acquisition rather than reaching for a declarative webcam component:

app/components/LiveInput.tsx

const stream = await navigator.mediaDevices.getUserMedia({
  video: { width: { ideal: 640 }, height: { ideal: 360 }, facingMode: "user" },
});
const videoTrack = stream.getVideoTracks()[0];
videoTrack.contentHint = "detail"; // hold resolution; adapt framerate

Browsers shrink and grow a camera track’s resolution mid-stream to cope with bandwidth, and a resolution change mid-chunk crashes the live session. Setting contentHint = "detail" before publish makes the browser hold resolution steady and adapt the frame rate instead, which the model handles fine. Owning the MediaStreamTrack is what makes that line possible; set the hint in any client you build. See Tracks.
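Since the hint matters for any track you publish, one option is to route every track through a small helper before publishing. The sketch below is an assumption about how you might structure that, not code from the example; holdResolution and HintableTrack are hypothetical names. It relies only on the standard kind and contentHint properties of MediaStreamTrack.

```typescript
// Structural stand-in for the parts of MediaStreamTrack this sketch touches,
// so the helper also runs outside a browser (for example in a unit test).
interface HintableTrack {
  kind: string;
  contentHint: string;
}

// Hypothetical helper: call it on a track before publishing.
// Returns the same track so it can be used inline.
function holdResolution<T extends HintableTrack>(track: T): T {
  if (track.kind === "video") {
    // "detail" asks the browser to hold resolution and give up frame rate
    // under pressure, instead of rescaling mid-stream.
    track.contentHint = "detail";
  }
  return track;
}
```

In the browser you would pass the real track, for example holdResolution(stream.getVideoTracks()[0]), and publish the returned value. Audio tracks pass through untouched.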

Publishing uses publish and unpublish from useSanaStreaming() in an effect keyed on the track and the connection status: it publishes once the session is ready, re-publishes after a reconnect, and unpublishes on unmount. The typed SDK also ships a declarative <SanaStreamingCameraView> that acquires and publishes the webcam for you, but it gives no hook to set contentHint, so LiveInput owns the track and publishes it by hand.

The Start live button calls startGeneration({ setMode, start }, "live") and is disabled until status === "ready", the publish has resolved, and no generation is running. Switching the mode toggle to File unmounts LiveInput, which unpublishes the track and stops the webcam, so a mode switch can’t leave the camera running.

File mode: editing an uploaded clip

File mode trades the camera for an uploaded clip of at least 33 frames. The flow in FileInput.tsx is uploadFile → setVideo → start, with the model’s state as the gate in the middle:

app/components/FileInput.tsx

const { setVideo, uploadFile, setMode, start, status } = useSanaStreaming();

// Shared upload path for manual picks and preset clips.
async function uploadVideo(file: File) {
  const ref = await uploadFile(file);
  lastRefRef.current = ref;
  retriesRef.current = DECODE_RETRIES;
  await setVideo({ video: ref });
  onSource(URL.createObjectURL(file)); // the stage plays this next to the output
}

The model accepts the upload without decoding it (frames decode during generation) and replies with video_accepted plus a state snapshot whose has_video is true. The Start edit button is disabled on !state.hasVideo, which flips when the model accepts the clip, not when the upload promise resolves. See set_video for the command contract and File Uploads for what the SDK does with the bytes.

One quirk is worth handling in any client you build: the model sometimes rejects a perfectly valid clip with a decode failed error. It’s a timing glitch on the model side, not a problem with your file, and re-sending the same setVideo almost always clears it. So FileInput watches for that one error with useSanaStreamingCommandError and retries up to twice with the already-uploaded clip before treating it as real:

app/components/FileInput.tsx

useSanaStreamingCommandError((msg) => {
  if (msg.command !== "set_video" || !msg.reason.startsWith("decode failed")) return;
  if (retriesRef.current > 0 && lastRefRef.current) {
    retriesRef.current -= 1;
    setVideo({ video: lastRefRef.current }).catch(() => {
      setError("Upload failed: " + msg.reason);
    });
  } else {
    setError("Upload failed: " + msg.reason);
  }
});
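The decision inside that handler can also be written as a pure function, which makes the retry budget easy to test. The sketch below is one possible factoring, not the example’s implementation: CommandErrorLike, isTransientDecode, and decideRetry are hypothetical names, and the message type lists only the two fields the handler above reads.

```typescript
// Fields read from a command_error message in the handler above.
interface CommandErrorLike {
  command: string;
  reason: string;
}

// True for the one error worth retrying: set_video failing with a reason
// that starts with "decode failed".
function isTransientDecode(msg: CommandErrorLike): boolean {
  return msg.command === "set_video" && msg.reason.startsWith("decode failed");
}

type RetryDecision =
  | { action: "ignore" }
  | { action: "retry"; retriesLeft: number }
  | { action: "fail"; message: string };

// Hypothetical pure version of the retry decision: no refs, no side effects.
function decideRetry(msg: CommandErrorLike, retriesLeft: number, hasRef: boolean): RetryDecision {
  if (!isTransientDecode(msg)) return { action: "ignore" };
  if (retriesLeft > 0 && hasRef) return { action: "retry", retriesLeft: retriesLeft - 1 };
  return { action: "fail", message: "Upload failed: " + msg.reason };
}
```

A handler would then switch on the returned action: re-send setVideo on retry, surface the message on fail, and do nothing on ignore.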

There’s no need to re-upload; the clip is still on the server. While FileInput retries, the error banner stays silent, so the user never sees a flash of failure for something the app is about to fix on its own. (The banner skips these by checking an isTransientDecodeFailure helper in app/lib/state.ts, the same condition FileInput matches above.)

Two behaviors follow from the model latching its source at start:

Clip picks are disabled while a run is in progress. A mid-run setVideo would not take effect until the next start, and the UI would show a clip the model isn’t using. Reset first, then pick a new clip.
A file-mode run ends on its own. Once every source frame is transformed, the model emits generation_complete and returns to idle with the clip, prompt, and seed still staged. start replays the clip from the top; reset is only needed to swap clips. On completion the main_video track freezes on the last transformed frame rather than going dark; the next section covers what the stage does with frozen frames.

The stage

Stage.tsx renders the model output with the typed <SanaStreamingMainVideoView>:

app/components/Stage.tsx

import { SanaStreamingMainVideoView } from "@reactor-models/sana-streaming";

<SanaStreamingMainVideoView
  videoObjectFit="contain"
  className="absolute inset-0 h-full w-full"
/>;

It is <ReactorView> with track="main_video" pre-bound, and manages the <video> element, srcObject binding, and browser autoplay quirks for you. Apply your styling to the container around it, not to the video element it renders.

In file mode with a source loaded, the stage splits into two panes: the original clip on the left, the transformed stream on the right. The local clip is driven off the reducer state (play when running, pause when paused, rewind when the source clears) as an approximate sync by design, with no seeking or drift correction. A status row along the bottom reads running / paused, currentChunk, and currentPrompt straight off the reduced state.

After a reset the model emits nothing new, so the view would freeze on the last transformed frame; the shell blacks the stage out until state.running flips back to true. A completed file-mode run freezes the view the same way, but there the example leaves the last frame visible (the status row drops back to idle) until the next start or reset.
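The approximate sync of the local clip (play when running, pause when paused, rewind when the source clears) can be sketched as a mapping from reduced state to an action, applied to a video element. The names ClipAction, clipAction, ClipElement, and applyClipAction are hypothetical, and how Stage.tsx actually wires this is not shown here; the element interface lists only standard HTMLVideoElement members.

```typescript
// What the stage should do with the local source clip.
type ClipAction = "play" | "pause" | "rewind";

// Hypothetical mapping from reduced state to a clip action.
function clipAction(state: { running: boolean; paused: boolean }, hasSource: boolean): ClipAction {
  if (!hasSource) return "rewind";
  if (state.running && !state.paused) return "play";
  return "pause";
}

// Structural stand-in for the HTMLVideoElement members the sketch uses.
interface ClipElement {
  currentTime: number;
  play(): Promise<void>;
  pause(): void;
}

function applyClipAction(el: ClipElement, action: ClipAction): void {
  if (action === "play") {
    // play() can reject (for example when autoplay is blocked); the sync is
    // approximate by design, so the rejection is swallowed.
    el.play().catch(() => {});
  } else {
    el.pause();
    if (action === "rewind") el.currentTime = 0;
  }
}
```

There is deliberately no seeking while playing, which matches the no-drift-correction behavior described above.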

Steering the prompt mid-stream

Prompts are editing instructions, not scene descriptions: “apply a Van Gogh oil painting style,” not “a Van Gogh painting of a room.” Prompt.tsx is one textarea, one Apply button, and a row of preset chips, and every path funnels into the same call:

app/components/Prompt.tsx

const { setPrompt, status } = useSanaStreaming();
const [text, setText] = useState("");

const apply = (prompt: string) => {
  if (status !== "ready") return;
  setPrompt({ prompt }).catch(console.error);
};

setPrompt works before start and at any point mid-stream; the model applies it at the next chunk boundary. A prompt is optional: start without one and the model streams a near-reconstruction until you set a prompt to steer it. The textarea’s placeholder (“Describe the edit. Changes apply live, about one chunk later.”) spells out the latency. The active-prompt readout under the button renders state.currentPrompt, so it reflects what the model is using rather than what was last typed.

The preset chips come from app/lib/examples.ts. They are deliberately short style tags (“Van Gogh oil painting, swirling brushstrokes, vivid colors”) that show how a whole-frame restyle reads against a live feed, leaning on the model’s default to carry everything else through. For surgical edits, a garment swap or a removal where you must spell out what stays fixed, write the fuller instructions the prompt guide lays out, with its anatomy and a recipe per edit type.
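For a feel of what “about one chunk later” means, here is some rough arithmetic from the figures given earlier (24-frame chunks, one roughly every 1 to 1.5 seconds). It is a back-of-the-envelope sketch, not a measurement: the helper names are hypothetical, and it assumes frames are counted from 0 and that chunks are contiguous.

```typescript
const CHUNK_FRAMES = 24; // frames per chunk, from the figures quoted earlier

// Effective output frame rate for a given seconds-per-chunk pace.
function framesPerSecond(secondsPerChunk: number): number {
  return CHUNK_FRAMES / secondsPerChunk;
}

// Frames left in the current chunk before the next boundary, assuming
// frames are counted from 0 and chunks never overlap (an assumption).
function framesUntilBoundary(frameIndex: number): number {
  return CHUNK_FRAMES - (frameIndex % CHUNK_FRAMES);
}

console.log(framesPerSecond(1)); // 24
console.log(framesPerSecond(1.5)); // 16
console.log(framesUntilBoundary(30)); // 18
```

So a prompt sent partway through a chunk waits for at most the remainder of that chunk before the model picks it up, on top of whatever transport delay the stream adds.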

Playback, seed, and reset

The Input panel (ModeInput.tsx) is phase-aware, driven by the model’s started flag. Before a run it shows the mode toggle, the active input (webcam or file picker), and the seed field; once started flips true it keeps the input slot mounted (so live mode keeps publishing) and swaps the setup controls for Playback.tsx: pause, resume, and reset. Each is a typed method off useSanaStreaming(), gated on the reduced state:

app/components/Playback.tsx

const { pause, resume, reset, status } = useSanaStreaming();
const notReady = status !== "ready";

{paused ? (
  <IconButton icon="play" label="Resume" disabled={notReady}
    onClick={() => resume().catch(console.error)} />
) : (
  <IconButton icon="pause" label="Pause" disabled={notReady}
    onClick={() => pause().catch(console.error)} />
)}
<IconButton icon="reset" label="Reset" tone="danger" disabled={notReady}
  onClick={() => reset().catch(console.error)} />;

The seed lives in SeedField.tsx, a setup-phase control that calls setSeed({ seed }) on blur. The model reads the seed at start, so the same source, prompt, and seed reproduce a run; the field is keyed to the model-reported seed, so a reset or an external setSeed refreshes it with the model’s value.

reset does the most work: it aborts the run and clears the model’s source, prompt, and progress, emitting generation_reset. The shell’s handler (from The model is the source of truth) mirrors that on the client, dropping the side-by-side source URL, blacking out the stage, and clearing the prompt draft and file selection so the UI matches the model.
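A seed field that commits on blur needs to turn free text into a number before calling setSeed. The sketch below shows one cautious way to do that. It is an assumption, not SeedField.tsx: parseSeed is a hypothetical helper, and it assumes seeds are non-negative integers, since the accepted range is not stated here.

```typescript
// Hypothetical parser for a seed text field. Assumes seeds are non-negative
// safe integers; the model's accepted range is not stated in this tutorial.
function parseSeed(text: string): number | null {
  const trimmed = text.trim();
  // Digits only: rejects empty input, signs, decimals, and exponents.
  if (!/^\d+$/.test(trimmed)) return null;
  const value = Number(trimmed);
  // Reject values too large to represent exactly as a JavaScript number.
  return Number.isSafeInteger(value) ? value : null;
}
```

On blur, a component could call setSeed({ seed }) only when parseSeed returns a number, and otherwise restore the model-reported seed in the field.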

What’s intentionally left out

The demo covers the connect + edit + steer + capture loop. Clip capture is a shared base-SDK feature, so Recordings covers it, including continuous recording, programmatic capture, and retention. A few other patterns are out of scope, and each is a small addition:

Feature	How to add it
Screen capture as the source	Swap `getUserMedia` for `getDisplayMedia` in `LiveInput.tsx` and publish the track to `camera` the same way. The `contentHint = "detail"` line still applies; any unhinted track can crash the live session.
Limiting drift on long runs	Surface `setAnchorInterval({ chunks })` from `useSanaStreaming()` as a control. It re-grounds the edit on the source every N chunks (`0` disables), trading a brief visible refresh for less accumulated drift.
Swapping clips between runs	The model latches its source at `start`, so call `reset`, then upload and `setVideo` the new clip. The demo’s UI guides users there by disabling clip picks while a run is in progress.

For the full design rationale and the patterns to follow when extending the app, including the typed-SDK surface and the manual camera publish, read skill/SKILL.md in the example repo.
