When a session ends, your model needs to be ready for the next user. This means resetting any session-specific state while keeping your weights and pipeline loaded in memory. Done correctly, the next user connects to a clean, warm model with zero load time.

The Stop Signal

Reactor signals your model to stop via get_ctx().should_stop(). Check this regularly in your generation loop:
def start_session(self):
    while not get_ctx().should_stop():
        frames = self.generate()
        get_ctx().emit_block(frames)
But checking only at the loop boundary is not enough. If your forward pass takes a long time (say, 500ms per block), the model will not respond to stop signals until the current block finishes. For better reactivity, check should_stop() inside your generation logic as well:
def generate_block(self):
    for step in self.denoising_steps:
        if get_ctx().should_stop():
            return None  # Exit early

        self.latent = self.denoise_step(self.latent, step)

    return self.decode(self.latent)

def start_session(self):
    while not get_ctx().should_stop():
        frames = self.generate_block()
        if frames is None:
            return  # Stop was requested mid-generation
        get_ctx().emit_block(frames)
When should_stop() returns True, exit as quickly as possible. Do not raise an exception. Just return from start_session() and let the cleanup code run.
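To see the cooperative-stop flow end to end, here is a minimal runnable sketch. The StubCtx class, its request_stop() method, and the three-block cutoff are illustrative assumptions for offline testing, not Reactor APIs:

```python
import threading

class StubCtx:
    """Hypothetical stand-in for Reactor's get_ctx(), for illustration only."""
    def __init__(self):
        self._stop = threading.Event()
        self.blocks = []

    def should_stop(self):
        return self._stop.is_set()

    def emit_block(self, frames):
        self.blocks.append(frames)

    def request_stop(self):
        self._stop.set()

ctx = StubCtx()

def generate_block(steps=4):
    # Check the stop flag between denoising steps, not only between blocks
    for _ in range(steps):
        if ctx.should_stop():
            return None  # exit early, no exception
    return "frames"

def start_session():
    while not ctx.should_stop():
        frames = generate_block()
        if frames is None:
            return  # stop arrived mid-generation
        ctx.emit_block(frames)
        if len(ctx.blocks) == 3:
            ctx.request_stop()  # simulate Reactor signalling a stop

start_session()
print(len(ctx.blocks))  # 3 blocks emitted, then a clean exit
```

Note that the session exits by returning, never by raising: the stop signal is an expected event, not an error.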

Cleanup at the End

Perform cleanup at the end of your session flow, after the stop signal. This means resetting input fields, clearing conditioning, and preparing for the next user:
def start_session(self):
    try:
        # The pipeline runs the generation loop internally,
        # emitting blocks and reading inputs on the fly
        self.pipeline.inference()
    finally:
        # Always runs, even after stop signal or error
        self._reset_session_state()

def _reset_session_state(self):
    self.current_prompt = None
    self.mouse_action = [0, 0]
    self.keyboard_inputs = {}
    self.frame_count = 0
    # Clear any accumulated conditioning
    self.pipeline.reset()
Why cleanup at the end rather than the start? If you clean up at the start of a session, the incoming user waits for cleanup to complete before generation can begin. By cleaning up right after the previous user disconnects, the next user connects to a model that is already reset.

Weights Stay Loaded

Remember: cleanup resets session state, not model state. Your weights, pipeline objects, and GPU allocations persist across sessions. This is what makes Reactor efficient.
Reset Between Sessions         Keep Loaded
----------------------         -----------
User prompts                   Model weights
Conditioning tensors           Pipeline objects
Input state (mouse, keyboard)  VAE decoder
Frame counters                 Text encoder
Accumulated buffers            CUDA memory pools
The goal is to return your model to a “factory fresh” state for user inputs while keeping all the expensive-to-load components warm.
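One way to sketch this split in a model class. The class, attribute names, and placeholder weights below are illustrative, not Reactor's API:

```python
class SessionModel:
    def __init__(self):
        # Expensive state: loaded once at startup, never touched by reset
        self.weights = {"unet": "ckpt", "vae": "ckpt"}  # stand-in for real checkpoints
        self._reset_session_state()

    def _reset_session_state(self):
        # Cheap per-session state: wiped between users
        self.current_prompt = None
        self.mouse_action = [0, 0]
        self.frame_count = 0

model = SessionModel()

# ... a session runs and accumulates state ...
model.current_prompt = "a castle at dusk"
model.frame_count = 120

model._reset_session_state()
# model.weights is untouched; session fields are factory fresh again
print(model.weights["unet"], model.current_prompt, model.frame_count)
```

The dividing line is cost: anything slow to recreate lives in `__init__` and survives; anything user-specific lives in `_reset_session_state()` and is wiped.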

Error Handling

Errors during generation should be caught, logged, and re-raised so Reactor can report them. This is especially important in deployment where errors need to be tracked:
def start_session(self):
    try:
        # Pipeline runs the block-by-block loop internally
        self.pipeline.inference()
    except Exception:
        # Reset state even on error
        self._reset_session_state()
        # Bare raise preserves the original traceback for Reactor to report
        raise
    finally:
        # Always reset, whether normal exit or error
        self._reset_session_state()
The pattern is:
  1. Try: Run your generation loop
  2. Except: Reset state and re-raise the error (do not swallow it)
  3. Finally: Reset state (runs on both normal exit and error)
Note that _reset_session_state() runs twice on the error path: once in except and again in finally. This is intentional, so make the reset idempotent (safe to run more than once). The except block guarantees cleanup happens before the error propagates, and finally handles the normal stop-signal case.
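The double invocation on the error path is easy to verify. The sketch below uses a hypothetical call log in place of real state and traces both the normal and the failing path:

```python
calls = []

def reset():
    calls.append("reset")  # must be safe to run more than once

def start_session(fail):
    try:
        if fail:
            raise RuntimeError("generation error")
    except Exception:
        reset()  # cleanup before the error propagates
        raise
    finally:
        reset()  # normal exit and error both land here

start_session(fail=False)        # finally only -> one reset
try:
    start_session(fail=True)     # except + finally -> two resets
except RuntimeError:
    pass

print(calls.count("reset"))  # 3
```

Because the reset only assigns fresh values (None, empty dicts, zeroed counters), running it twice in a row is harmless.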

Complete Example

Here is a complete implementation showing both the VideoModel and pipeline structure:
class MyVideoModel(VideoModel):
    def start_session(self):
        try:
            # Pipeline runs the generation loop internally
            self.pipeline.inference()
        except Exception:
            self._reset_session_state()
            raise
        finally:
            self._reset_session_state()

    def _reset_session_state(self):
        self.pipeline.reset()

class MyPipeline:
    def inference(self):
        """Block-by-block generation with stop signal checking."""
        while not get_ctx().should_stop():
            # Read current inputs from class memory
            curr_action = self.mouse_action
            
            # Generate one block (single forward pass)
            frames = self.generate_block(cond=curr_action)
            
            if frames is None:
                return  # Stop was requested mid-generation
            
            get_ctx().emit_block(frames)

    def generate_block(self, cond):
        """Single forward pass with mid-generation stop checks."""
        for step in self.denoising_steps:
            if get_ctx().should_stop():
                return None
            self.latent = self.denoise_step(self.latent, step, cond)
        
        return self.decode(self.latent)

    def reset(self):
        """Clear session state for next user."""
        self.mouse_action = [0, 0]
        self.latent = None
The key points:
  • inference() runs the block-by-block loop, checking should_stop() each iteration
  • generate_block() produces a single block (one forward pass), checking should_stop() mid-generation
  • Inputs are read from class memory each iteration, allowing real-time changes
  • reset() clears session state but leaves weights loaded
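These pieces can be exercised offline with a small harness. StubCtx and its fixed block budget are testing assumptions, not Reactor behavior, and the arithmetic "latent" stands in for real tensors:

```python
class StubCtx:
    """Hypothetical replacement for get_ctx(): signals stop after max_blocks."""
    def __init__(self, max_blocks):
        self.blocks = []
        self.max_blocks = max_blocks

    def should_stop(self):
        return len(self.blocks) >= self.max_blocks

    def emit_block(self, frames):
        self.blocks.append(frames)

class Pipeline:
    def __init__(self, ctx):
        self.ctx = ctx
        self.reset()

    def inference(self):
        # Block-by-block loop, checking the stop signal each iteration
        while not self.ctx.should_stop():
            frames = self.generate_block(cond=self.mouse_action)
            if frames is None:
                return  # stop requested mid-generation
            self.ctx.emit_block(frames)

    def generate_block(self, cond):
        for _ in range(4):  # stand-in for the denoising steps
            if self.ctx.should_stop():
                return None
            self.latent = (self.latent or 0) + 1
        return f"block(latent={self.latent}, cond={cond})"

    def reset(self):
        # Session state only; "weights" would live elsewhere and persist
        self.mouse_action = [0, 0]
        self.latent = None

ctx = StubCtx(max_blocks=3)
pipe = Pipeline(ctx)
pipe.inference()   # emits 3 blocks, then should_stop() flips
pipe.reset()       # session state cleared for the next user

print(len(ctx.blocks), pipe.latent)  # 3 None
```

Swapping StubCtx for the real context is the only change needed to move from this harness to a deployed model, which is what makes the pattern easy to unit-test.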

Now that your model handles cleanup properly, you are ready to run it locally and see it in action.

Run Your Model Locally

Learn how to start your model with the HTTP runtime and connect to it from a frontend.