The StreamDiffusionV2 model is Reactor’s first Video-to-Video (V2V) transformation system that enables real-time editing of video streams with ultra-low latency. Connect any video source—whether it’s your webcam, a live stream, or any other video feed—and transform it in real-time based on your text prompts.

Key Features

  • Real-Time V2V: Transform live video streams in real-time with minimal latency
  • Dynamic Prompting: Change prompts on the fly and see instant transformations in your video feed
  • Flexible Input Sources: Connect webcams, streaming sources, or any video input for transformation

The model processes video frames continuously, applying AI-driven transformations based on your text prompts. Longer, more detailed prompts tend to produce the best results, and you can update prompts dynamically to see the video transformation change in real-time.

Quick Start

Get started with StreamDiffusionV2 in seconds:
npx create-reactor-app my-sd2-app stream-diffusion-v2

Video Input

StreamDiffusionV2 is a video-to-video (V2V) model that requires a video input stream to transform. You must provide a video source (typically from a webcam or screen capture) for the model to process.

Using WebcamStream Component

The easiest way to provide video input is using the built-in WebcamStream component:
import { 
  ReactorProvider, 
  ReactorView, 
  WebcamStream 
} from "@reactor-team/js-sdk";

export default function App() {
  return (
    <ReactorProvider
      modelName="stream-diffusion-v2"
      insecureApiKey={process.env.REACTOR_API_KEY!}
      autoConnect={true}
    >
      <div className="flex gap-4">
        {/* Your webcam input */}
        <WebcamStream 
          className="w-1/2 aspect-video rounded-lg"
          videoObjectFit="cover"
        />
        
        {/* AI-transformed output */}
        <ReactorView 
          className="w-1/2 aspect-video rounded-lg"
          videoObjectFit="cover"
        />
      </div>
    </ReactorProvider>
  );
}
The WebcamStream component automatically:
  • Requests camera permissions
  • Publishes the video stream when Reactor connects
  • Unpublishes when disconnected
  • Handles errors and cleanup

Manual Video Publishing

For more control, you can manually publish video streams using the imperative API:
import { Reactor } from "@reactor-team/js-sdk";

const reactor = new Reactor({
  modelName: "stream-diffusion-v2",
  insecureApiKey: process.env.REACTOR_API_KEY
});

// Capture webcam
const stream = await navigator.mediaDevices.getUserMedia({
  video: { width: 1280, height: 720 }
});

// Connect and publish
await reactor.connect();

reactor.on("statusChanged", async (status) => {
  if (status === "ready") {
    await reactor.publishVideoStream(stream);
  }
});
The model requires video input to function. Make sure to publish a video stream after connecting and before starting generation.

Getting Started

When you first connect to the StreamDiffusionV2 model, it will be ready to receive commands but won’t start processing video until you follow the proper initialization sequence:
  1. Set Initial Prompt: Before starting, you must set at least one prompt using set_prompt
  2. Start Generation: Once you have a prompt set, call start to begin the video transformation
  3. Dynamic Control: While running, you can change prompts in real-time or reset the system as needed
If you call start before setting an initial prompt, the command will be ignored and the model won’t begin processing.
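
The sketch below puts these steps together with the imperative API: publish a video source, set an initial prompt, then start generation. It assumes the start command is sent as a bare { type: "start" } message with no data payload; that shape is an assumption for illustration, not documented behavior.
import { Reactor } from "@reactor-team/js-sdk";

const reactor = new Reactor({
  modelName: "stream-diffusion-v2",
  insecureApiKey: process.env.REACTOR_API_KEY!
});

// Register the listener before connecting so the "ready" status isn't missed
reactor.on("statusChanged", async (status) => {
  if (status === "ready") {
    // Publish the video source the model will transform
    const stream = await navigator.mediaDevices.getUserMedia({
      video: { width: 1280, height: 720 }
    });
    await reactor.publishVideoStream(stream);

    // 1. Set an initial prompt before starting
    await reactor.sendMessage({
      type: "set_prompt",
      data: {
        prompt: "A serene watercolor scene with soft pastel colors and warm golden hour lighting"
      }
    });

    // 2. Start generation (assumed message shape: no data payload)
    await reactor.sendMessage({ type: "start" });
  }
});

await reactor.connect();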

Model Name

stream-diffusion-v2

Commands

Once connected, send commands using reactor.sendMessage() to control the video transformation process. Below are all available commands:
  • set_prompt
  • start
  • reset
  • set_denoising_step_list
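
All of these commands use the same reactor.sendMessage() envelope shown for set_prompt below. As a rough sketch, a command without parameters might be sent like this; the exact payload shapes for start, reset, and set_denoising_step_list are assumptions here rather than documented behavior:
// Hedged sketch: the payload for `reset` is assumed to be empty, not documented behavior
await reactor.sendMessage({ type: "reset" });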

set_prompt

Description: Set the prompt for video generation and transformation.
Parameters:
  • prompt (string, required): The text prompt describing the desired video transformation
Behavior:
  • Sets the active prompt that will be used to transform the incoming video stream
  • Can be called at any time to change the transformation style
  • Longer, more detailed prompts typically produce better results
  • Changes take effect immediately if generation is already running
Best Practices:
  • Describe the desired scene: Focus on what should be present in the final video, not the transformation process
  • Provide context and setting: Include details about the environment, lighting, atmosphere, and overall composition
  • Specify style and mood: Describe the artistic style, color palette, lighting conditions, and emotional tone
  • Be descriptive about elements: Instead of “a dog turns into a cat,” write “a cat is sitting in the scene”
  • Include scene details: Mention backgrounds, objects, textures, and visual elements that should be present
  • Use comprehensive descriptions: Longer, more detailed prompts typically produce better and more consistent results
Example:
// Set a detailed scene description prompt
await reactor.sendMessage({
  type: "set_prompt",
  data: {
    prompt: "A cyberpunk cityscape at night with towering skyscrapers covered in neon signs, rain-soaked streets reflecting purple and blue lights, flying cars moving between buildings, and a person in a futuristic coat walking through the scene with dramatic lighting and atmospheric fog"
  }
});

// Change to a different scene and style
await reactor.sendMessage({
  type: "set_prompt",
  data: {
    prompt: "A serene watercolor painting scene with a person sitting by a peaceful lake surrounded by cherry blossom trees, soft pastel colors throughout, gentle brushstroke textures, warm golden hour lighting, and mountains in the distant background"
  }
});

Credits

StreamDiffusionV2 is developed by Tianrui Feng, Zhi Li, Haocheng Xi, Muyang Li, Shuo Yang, Xiuyu Li, Lvmin Zhang, Kelly Peng, Song Han, Maneesh Agrawala, Kurt Keutzer, Akio Kodaira, and Chenfeng Xu (UC Berkeley, MIT, Stanford University, First Intelligence, UT Austin).
Project Page · View on GitHub