This page covers the technical details of how Reactor Runtime works. You’ll learn about the runtime architecture, model lifecycle, component orchestration, and best practices for implementation.

The Manifest File

The manifest.json file configures your model for the runtime:
  • What class implements your model
  • What arguments to pass during initialization
  • Which weights to download
  • Version compatibility information
Example manifest.json:
{
  "reactor-runtime": "0.0.0",
  "class": "model_longlive:LongLiveVideoModel",
  "model_name": "longlive",
  "model_version": "1.0.0",
  "args": {
    "fps": 30,
    "size": [480, 640]
  },
  "weights": ["LongLive", "Wan2_1_VAE", "WanVideo_comfy"]
}
Breaking it down:
  • reactor-runtime: The runtime version this model requires
  • class: Points to your VideoModel implementation (filename:ClassName)
  • model_name: Unique identifier for your model
  • model_version: Semantic version of your model
  • args: Arguments passed to your model’s __init__ method (see the sketch after this list)
  • weights: List of weight folders to download from Model Registry
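For example, with the manifest above, the runtime imports model_longlive.py, finds LongLiveVideoModel, and passes args as keyword arguments to __init__. A rough sketch of the mapping (not the runtime’s actual instantiation code):
from reactor_runtime import VideoModel

# Defined in model_longlive.py, matching "class": "model_longlive:LongLiveVideoModel"
class LongLiveVideoModel(VideoModel):
    def __init__(self, fps=30, size=(480, 640), **kwargs):
        # Receives the manifest "args": fps=30, size=[480, 640]
        ...

# The runtime effectively performs:
#   model = LongLiveVideoModel(fps=30, size=[480, 640])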

The VideoModel Class

The VideoModel class is an abstract base class that provides the interface between your model and the runtime:
from reactor_runtime import VideoModel, command

class YourModel(VideoModel):
    def __init__(self, fps=30, size=(480, 640), **kwargs):
        # Load YOUR existing model here
        self.my_model = load_my_existing_model()

    def start_session(self):
        # Called when a user connects
        # Run your generation loop here
        self.my_model.generate()
Your model file can import anything - your entire existing codebase, external libraries, custom modules. The VideoModel is just the thin wrapper that makes it work with Reactor.

How the Runtime Works

The Reactor Runtime orchestrates three key components:

1. Runtime Server (Port 8081)

The main FastAPI server that hosts your model. When started:
  • Instantiates your VideoModel class (calls __init__)
  • Loads all weights into GPU memory
  • Exposes API endpoints for session management
  • Keeps your model warm and ready for connections

2. Local Coordinator (Port 8080)

Manages session lifecycle and client connections:
  • Receives connection requests from frontends
  • Starts/stops sessions on the runtime
  • Handles WebSocket communication
  • Mimics production coordinator behavior for local testing

3. LiveKit Server (Port 7880)

Powers real-time video streaming:
  • Creates “rooms” for each session
  • Handles WebRTC connections between client and model
  • Streams video frames with minimal latency
  • Runs entirely locally during development
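All three services listen on localhost during development, so you can sanity-check that they are up by probing their ports. A minimal sketch (ports taken from the sections above; this script is not part of the runtime):
import socket

SERVICES = {
    "Runtime Server": 8081,
    "Local Coordinator": 8080,
    "LiveKit Server": 7880,
}

def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in SERVICES.items():
    print(f"{name} (port {port}): {'up' if is_listening(port) else 'down'}")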

The VideoModel Lifecycle

Understanding when methods are called is crucial for optimal performance:
RUNTIME STARTUP

__init__() called - HEAVY LOADING HERE

[Model stays in memory]

USER CONNECTS → start_session() called

[Session active, frames stream]

USER DISCONNECTS → cleanup in finally block

[Model stays in memory, ready for next user]

NEXT USER → start_session() called again (instant!)

Phase 1: Initialization (__init__)

When: Once at runtime startup, before any users connect
Purpose: Load everything heavy
Duration: Can take minutes (only happens once!)
def __init__(self, fps=30, size=(480, 640), **kwargs):
    """
    Called ONCE when runtime starts.
    Do ALL heavy lifting here:
    - Load model architectures
    - Download/load weights into GPU
    - Compile models
    - Initialize pipelines
    """
    self._device = torch.device("cuda")

    # Load weights from Model Registry
    weights_path = VideoModel.weights("YourModel-Weights")

    # Initialize your model (heavy operation)
    self.model = YourHeavyModel()
    self.model.load_state_dict(torch.load(weights_path / "model.pt"))
    self.model.to(self._device)
Critical: Everything expensive goes here. This instance stays in memory forever.

Phase 2: Session Start (start_session)

When: Every time a user connects
Purpose: Run your generation loop
Duration: Runs until the user disconnects or generation completes
def start_session(self):
    """
    Called when a user connects.
    Should be FAST to start (model already loaded).
    Run your generation loop here.
    """
    try:
        # Light session-specific setup
        noise = torch.randn(self._latent_shape, device=self._device)

        # Your generation loop
        for frame in self.model.generate(noise):
            # Emit frame to user in real-time
            get_ctx().emit_block(frame)

    except Exception:
        # Re-raise so the runtime is notified of the error;
        # the finally block below handles the model reset
        raise
    finally:
        # Cleanup session resources
        self.model.reset()
        torch.cuda.empty_cache()
Important: Keep this method lightweight. Heavy loading should be in __init__.

Phase 3: Session End

When: User disconnects or generation completes
Purpose: Clean up session-specific resources and reset the model
def start_session(self):
    try:
        # Generation loop
        pass
    finally:
        # Reset model to initial state for next user
        self.model.reset()

        # Lightweight cleanup
        torch.cuda.empty_cache()

        # DO NOT unload model weights!
        # Model stays loaded for next user

The Session Context (get_ctx())

During an active session, you can access the runtime context using get_ctx(). This gives you access to methods for communicating with the frontend and streaming video frames.
from reactor_runtime import VideoModel, get_ctx

class YourModel(VideoModel):
    def start_session(self):
        # Call methods directly on get_ctx()
        for frame in self.generate():
            get_ctx().emit_block(frame)
Important: get_ctx() can only be called during an active session (inside start_session() or any methods called from it). Calling it outside a session will raise a RuntimeError.
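This also means helper methods invoked during a session can use get_ctx() freely. A small sketch:
from reactor_runtime import VideoModel, get_ctx

class YourModel(VideoModel):
    def start_session(self):
        # get_ctx() is valid here...
        self._emit_status("session started")

    def _emit_status(self, message):
        # ...and in any method called from start_session()
        get_ctx().send({"type": "status", "message": message})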

Emitting Frames (emit_block)

Stream video frames to the connected client:
def start_session(self):
    # Emit a single frame
    frame = np.random.rand(480, 640, 3)  # (H, W, 3) in RGB
    get_ctx().emit_block(frame)
    
    # Emit multiple frames at once
    frames = np.random.rand(10, 480, 640, 3)  # (N, H, W, 3)
    get_ctx().emit_block(frames)
    
    # Send a black frame
    get_ctx().emit_block(None)
Frame Format:
  • Single frame: np.ndarray with shape (H, W, 3) in RGB color space
  • Multiple frames: np.ndarray with shape (N, H, W, 3) where N is the number of frames
  • Black frame: Pass None to display a black frame on the client
Don’t worry about FPS! You can emit frames as fast as your model generates them. The runtime automatically buffers and smooths frame delivery to maintain consistent playback, even if your generation rate varies. Just focus on producing frames - the runtime handles timing.
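To illustrate, the sketch below emits frames in uneven bursts; playback on the client stays smooth because the runtime buffers delivery (np.random.rand stands in for a real generator):
import time
import numpy as np
from reactor_runtime import get_ctx

def start_session(self):
    # Bursty emission: 8 frames at once, then a pause that simulates
    # a slow or variable generation step
    for _ in range(10):
        frames = np.random.rand(8, 480, 640, 3)  # stand-in frames, (N, H, W, 3)
        get_ctx().emit_block(frames)
        time.sleep(0.5)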

Sending Messages (send)

Send arbitrary data from your model to the frontend:
def start_session(self):
    # Send generation progress
    get_ctx().send({
        "type": "progress",
        "value": 0.5,
        "message": "Halfway done"
    })
    
    # Send model state updates
    get_ctx().send({
        "type": "state_changed",
        "new_state": "generating",
        "metadata": {"step": 42}
    })
    
    # Send any custom data structure
    get_ctx().send({
        "type": "custom_event",
        "data": {"key": "value"}
    })
Message Format:
  • Must be a Python dict
  • Will be automatically wrapped in an ApplicationMessage envelope by the runtime
  • The frontend receives this via WebSocket and can handle it in your React components
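A common pattern is to interleave send() with emit_block() so the frontend can show progress alongside the video. A sketch, assuming self.model.generate() yields a known number of frames:
def start_session(self):
    total = 100  # illustrative; depends on your generator
    for i, frame in enumerate(self.model.generate()):
        get_ctx().emit_block(frame)
        if i % 10 == 0:
            get_ctx().send({
                "type": "progress",
                "value": i / total,
                "message": f"Generated frame {i}/{total}",
            })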

Key Features

Automatic Weight Management

Weights are stored in the Model Registry (S3 bucket). When you run your model:
  1. Runtime checks the weights array in your manifest
  2. Downloads missing weights to ~/.cache/reactor_registry/
  3. Caches them for future runs
No manual Hugging Face downloads. No path configuration. Just works.
Don’t have AWS access? You can load weights normally from local paths during development. When you’re ready to deploy, the Reactor team will handle uploading your weights to S3 and updating your model to use the Model Registry.
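For example, you might toggle between a local checkpoint directory and the registry cache while developing (the environment variable and local path here are hypothetical):
import os
from pathlib import Path
from reactor_runtime import VideoModel

if os.environ.get("USE_LOCAL_WEIGHTS") == "1":
    # Local development path (assumption; adjust to your setup)
    weights_path = Path("./checkpoints/longlive")
else:
    # Resolves the folder downloaded to ~/.cache/reactor_registry/
    weights_path = VideoModel.weights("LongLive")  # name from the manifest "weights" list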

Built-in Networking

LiveKit handles all WebRTC complexity:
  • Bidirectional video streaming
  • Audio support
  • Data channels for commands
  • Adaptive bitrate streaming
You just emit frames. The runtime handles delivery.

Seamless Local → Production

The development workflow mirrors production exactly:
  • Local: All services run on your machine
  • Production: Services run in the cloud
Same code. Same behavior. Deploy with confidence.

Efficient Resource Usage

The VideoModel instance stays loaded throughout the runtime lifecycle:
  • Startup: Heavy weight loading happens once in __init__
  • Session Start: Lightweight setup when user connects
  • Session End: Quick reset when user disconnects
  • Next Session: Model already loaded, instant start
Users transition between sessions in milliseconds.

Development Flow

The complete local workflow:
# 1. Start runtime (starts all components)
reactor run

# 2. In another terminal, start your frontend
pnpm run dev

# 3. Connect from browser (ReactorProvider with local flag)
# → Frontend connects to runtime
# → Runtime starts your session
# → Video streams via LiveKit
No tokens. No credentials. No manual coordination.

Model Structure

Every Reactor model consists of:
your-model/
├── manifest.json              # Model configuration
├── model_yourmodel.py         # VideoModel wrapper
├── requirements.txt           # Dependencies
└── your_existing_model/       # Your original codebase (optional)
    ├── models.py
    ├── utils.py
    └── ...
The wrapper is thin - your existing code stays intact.

Next Steps

Now that you understand the core concepts, you’re ready to learn how to implement your own model.