When a session ends, your model needs to be ready for the next user. This means resetting any
session-specific state while keeping your weights and pipeline loaded in memory. Done correctly,
the next user connects to a clean, warm model with zero load time.
Reactor signals your model to stop via get_ctx().should_stop(). Check this regularly in your
generation loop:
```python
def start_session(self):
    while not get_ctx().should_stop():
        frames = self.generate()
        get_ctx().emit_block(frames)
```
But checking only at the loop boundary is not enough. If your forward pass takes a long time (say,
500ms per block), the model will not respond to stop signals until the current block finishes. For
better reactivity, check should_stop() inside your generation logic as well:
```python
def generate_block(self):
    latent = self.latent
    for step in self.denoising_steps:
        if get_ctx().should_stop():
            return None  # Exit early
        latent = self.denoise_step(latent, step)
    return self.decode(latent)

def start_session(self):
    while not get_ctx().should_stop():
        frames = self.generate_block()
        if frames is None:
            return  # Stop was requested mid-generation
        get_ctx().emit_block(frames)
```
When should_stop() returns True, exit as quickly as possible. Do not throw an error. Just
return from start_session() and let the cleanup code run.
Perform cleanup at the end of your session flow, after the stop signal. This means resetting
input fields, clearing conditioning, and preparing for the next user:
```python
def start_session(self):
    try:
        # The pipeline runs the generation loop internally,
        # emitting blocks and reading inputs on the fly
        self.pipeline.inference()
    finally:
        # Always runs, even after stop signal or error
        self._reset_session_state()

def _reset_session_state(self):
    self.current_prompt = None
    self.mouse_action = [0, 0]
    self.keyboard_inputs = {}
    self.frame_count = 0
    # Clear any accumulated conditioning
    self.pipeline.reset()
```
Why clean up at the end rather than the start? If you clean up when a session begins, the incoming user waits for cleanup to complete. By cleaning up right after the previous user disconnects, the next user connects immediately to a ready model.
Remember: cleanup resets session state, not model state. Your weights, pipeline objects,
and GPU allocations persist across sessions. This is what makes Reactor efficient.
| Reset Between Sessions | Keep Loaded |
| --- | --- |
| User prompts | Model weights |
| Conditioning tensors | Pipeline objects |
| Input state (mouse, keyboard) | VAE decoder |
| Frame counters | Text encoder |
| Accumulated buffers | CUDA memory pools |
The goal is to return your model to a “factory fresh” state for user inputs while keeping all the
expensive-to-load components warm.
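To make the split concrete, here is a minimal sketch of where each kind of state lives. The class shape and the `load_weights` helper are illustrative placeholders, not part of the Reactor API: expensive components are loaded once in `__init__`, while session fields are (re)initialized by the same reset helper used between sessions.

```python
def load_weights():
    # Stand-in for an expensive one-time load (hypothetical helper)
    return {"layer0": [0.1, 0.2]}

class MyVideoModel:
    def __init__(self):
        # Load-once state: stays warm across all sessions
        self.weights = load_weights()
        # Session state: initialized here, wiped between users
        self._reset_session_state()

    def _reset_session_state(self):
        self.current_prompt = None
        self.mouse_action = [0, 0]
        self.keyboard_inputs = {}
        self.frame_count = 0
```

Calling `_reset_session_state()` after a session returns the input-facing fields to their initial values while `self.weights` is left untouched.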
Errors during generation should be caught, logged, and re-raised so Reactor can report them. This
is especially important in deployment where errors need to be tracked:
```python
def start_session(self):
    try:
        # Pipeline runs the block-by-block loop internally
        self.pipeline.inference()
    except Exception:
        # Reset state even on error
        self._reset_session_state()
        # Re-raise so Reactor can report the error
        raise
    finally:
        # Always reset, whether normal exit or error
        self._reset_session_state()
```
The session lifecycle: user connects → generation loop → stop signal or error → cleanup → ready for the next user.
The pattern is:

- **Try:** run your generation loop
- **Except:** reset state and re-raise the error (do not swallow it)
- **Finally:** reset state (runs on both normal exit and error)
Note that _reset_session_state() runs in both except and finally. This is intentional: the
except block ensures cleanup happens before the error propagates, and finally handles the
normal stop signal case.
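Because `_reset_session_state()` runs twice on the error path (once in `except`, once in `finally`), it must be idempotent: safe to call repeatedly. Here is a minimal sketch of that property; `SessionState` and `run_session` are illustrative names, not part of the Reactor API.

```python
class SessionState:
    """Illustrative session-state holder (not the actual Reactor API)."""
    def __init__(self):
        self.reset_calls = 0
        self.buffers = []
        self.latent = None

    def reset(self):
        # Idempotent: a second call leaves the same clean state
        self.reset_calls += 1
        self.buffers.clear()
        self.latent = None

def run_session(state, work):
    """Mirrors the try/except/finally pattern above."""
    try:
        work()
    except Exception:
        state.reset()  # cleanup before the error propagates
        raise
    finally:
        state.reset()  # also runs on a normal stop-signal exit
```

Because `reset()` only assigns and clears, the double call on the error path is harmless.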
Here is a complete implementation showing both the VideoModel and pipeline structure:
```python
class MyVideoModel(VideoModel):
    def start_session(self):
        try:
            # Pipeline runs the generation loop internally
            self.pipeline.inference()
        except Exception:
            self._reset_session_state()
            raise
        finally:
            self._reset_session_state()

    def _reset_session_state(self):
        self.pipeline.reset()


class MyPipeline:
    def inference(self):
        """Block-by-block generation with stop signal checking."""
        while not get_ctx().should_stop():
            # Read current inputs from class memory
            curr_action = self.mouse_action
            # Generate one block (single forward pass)
            frames = self.generate_block(cond=curr_action)
            if frames is None:
                return  # Stop was requested mid-generation
            get_ctx().emit_block(frames)

    def generate_block(self, cond):
        """Single forward pass with mid-generation stop checks."""
        for step in self.denoising_steps:
            if get_ctx().should_stop():
                return None
            self.latent = self.denoise_step(self.latent, step, cond)
        return self.decode(self.latent)

    def reset(self):
        """Clear session state for next user."""
        self.mouse_action = [0, 0]
        self.latent = None
```
The key points:

- `inference()` runs the block-by-block loop, checking `should_stop()` each iteration
- `generate_block()` produces a single block (one forward pass), checking `should_stop()` mid-generation
- Inputs are read from class memory each iteration, allowing real-time changes
- `reset()` clears session state but leaves weights loaded
Now that your model handles cleanup properly, you are ready to run it locally and see it in action.