A session is the period during which a single user is actively connected to your model. Sessions
start when a user connects and end when they disconnect. Your model loads once, then serves many
sessions without ever reloading weights.
The Taxi Analogy
Think of your model like a self-driving taxi:
- The car (your loaded model with weights in memory) stays running all day
- Passengers (users) get in and out throughout the day
- Each passenger has their own ride (session), independent from other passengers
- When one passenger exits, the car does not turn off. It just waits for the next passenger
- The car should be clean for each new passenger (no leftover luggage from previous riders)
Session Lifecycle
1. Model loads: __init__ runs once, weights are loaded
2. Idle: Model waits for a user to connect
3. Session starts: User connects, start_session() is called
4. Running: Model generates frames, handles commands
5. Session ends: User disconnects, cleanup runs
6. Back to idle: Model waits for the next user
The model remains loaded and ready. Steps 2-6 repeat for each user.
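The lifecycle above can be sketched as a toy simulation. Nothing here is the Reactor API: `FakeContext` imitates the session Context, the module-level `get_ctx()` and `run_session()` helpers imitate what the runtime would do, and `ToyModel` is an illustrative model class. All of these names are assumptions made for the example.

```python
_current_ctx = None  # hypothetical: the real runtime owns the context


class FakeContext:
    """Hypothetical stand-in for the session Context."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.frames = []

    def should_stop(self):
        # Pretend the user disconnects after max_frames frames.
        return len(self.frames) >= self.max_frames

    def emit_block(self, frame):
        self.frames.append(frame)


def get_ctx():
    return _current_ctx


class ToyModel:
    def __init__(self):
        # Step 1: runs once; the "weights" stay in memory for every session.
        self.weights = [0.1, 0.2, 0.3]
        self.frame_counter = 0

    def start_session(self):
        # Steps 3-4: called once per user; generate until told to stop.
        self.frame_counter = 0
        while not get_ctx().should_stop():
            get_ctx().emit_block(f"frame-{self.frame_counter}")
            self.frame_counter += 1


def run_session(model, max_frames):
    """Hypothetical runtime helper: one connect/disconnect cycle."""
    global _current_ctx
    _current_ctx = FakeContext(max_frames)
    model.start_session()
    frames = _current_ctx.frames
    _current_ctx = None  # back to idle
    return frames


model = ToyModel()              # loaded once
first = run_session(model, 3)   # first user
second = run_session(model, 2)  # second user, same loaded model
```

The second session reuses the same `ToyModel` instance, so the weights are never rebuilt, while the frame numbering restarts for each user.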
Session Independence
Each session is completely independent. User A should never see or experience anything from User
B’s session: prompts, control inputs, and conditioning do not carry over between users.
| Scoped to Session | Scoped to Model (Persistent) |
|---|---|
| User inputs and commands | Loaded weights |
| Generated frames | Pipeline objects |
| Prompt embeddings | GPU memory allocations |
| Conditioning latents | Configuration values |
| Frame counter | Static resources |
Session-scoped state must be reset between sessions. Model-scoped state persists across all
sessions.
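One way to keep the two scopes apart is to group everything session-scoped in a single object and replace it between sessions. This is a minimal sketch under that assumption; `SessionState`, `ToyModel`, and `reset_session_state()` are hypothetical names, not part of Reactor.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical container for the "Scoped to Session" column."""
    prompt_embeddings: list = field(default_factory=list)
    conditioning_latents: list = field(default_factory=list)
    frame_counter: int = 0


class ToyModel:
    def __init__(self):
        # Model-scoped: created once and never reset.
        self.weights = {"layer0": [0.5, -0.5]}
        # Session-scoped: replaced wholesale between sessions.
        self.session = SessionState()

    def reset_session_state(self):
        self.session = SessionState()


model = ToyModel()
weights_before = model.weights

# User A's session leaves state behind...
model.session.prompt_embeddings.append([1.0, 2.0])
model.session.frame_counter = 42

# ...which is discarded before user B connects.
model.reset_session_state()
```

Replacing the whole object, rather than clearing fields one by one, makes it harder to forget a field when new session-scoped state is added later.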
Starting a Session
When a user connects, Reactor calls your model’s start_session() method. Inside this method, you
use the session Context to communicate with the client: emit frames,
check for stop signals, and send messages.
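```python
def start_session(self):
    # Keep generating until the runtime signals that the session should end.
    while not get_ctx().should_stop():
        frame = self.generate_frame()
        # Send the generated frame to the connected client.
        get_ctx().emit_block(frame)
```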
Sessions end when the user disconnects or when get_ctx().should_stop() returns True. Your loop
should check this signal regularly. For details on structuring your generation loop, see
Pipeline.
When the session ends, reset any session-specific state so the next user gets a fresh experience.
For cleanup patterns, see Model Cleanup.
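As an illustration of resetting state when a session ends, one possible pattern is a `try`/`finally` around the generation loop, so the reset also runs if generation raises. This is a sketch, not the pattern from Model Cleanup: `FakeContext` and `ToyModel` are hypothetical, and `ctx` and `prompt` are passed as arguments only to keep the example self-contained.

```python
class FakeContext:
    """Hypothetical stand-in for the session Context."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.frames = []

    def should_stop(self):
        return len(self.frames) >= self.max_frames

    def emit_block(self, frame):
        self.frames.append(frame)


class ToyModel:
    def __init__(self):
        self.frame_counter = 0
        self.prompt = None

    def generate_frame(self):
        if self.prompt == "bad":
            raise ValueError("simulated generation failure")
        self.frame_counter += 1
        return f"{self.prompt}-{self.frame_counter}"

    def start_session(self, ctx, prompt):
        # Assumption: the documented method takes no arguments; ctx and
        # prompt are explicit here only for the sake of the sketch.
        self.prompt = prompt
        try:
            while not ctx.should_stop():
                ctx.emit_block(self.generate_frame())
        finally:
            # Runs on normal exit and on errors alike.
            self.frame_counter = 0
            self.prompt = None


model = ToyModel()
ctx_a = FakeContext(max_frames=2)
model.start_session(ctx_a, "cat")

try:
    model.start_session(FakeContext(max_frames=2), "bad")
except ValueError:
    pass  # the session failed, but the reset in finally still ran
```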
start_session() runs in a background thread, so a long-running generation loop inside it is fine. You do not need to worry about blocking.
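This page does not describe how the runtime manages that thread, so the following uses plain Python `threading` purely as an illustration. A `threading.Event` named `stop_event` stands in for `get_ctx().should_stop()`, and `generation_loop` is a hypothetical function. The sketch shows why a blocking loop is acceptable in a background thread, and why it still has to check the stop signal in order to exit.

```python
import threading


def generation_loop(stop_event, first_frame, frames):
    # stop_event plays the role of get_ctx().should_stop() in this sketch.
    while not stop_event.is_set():
        frames.append(len(frames))
        first_frame.set()
        stop_event.wait(0.001)  # pace the loop; returns early once stopped


stop_event = threading.Event()
first_frame = threading.Event()
frames = []

worker = threading.Thread(
    target=generation_loop, args=(stop_event, first_frame, frames)
)
worker.start()

# The calling thread stays free while the loop runs in the background.
got_frame = first_frame.wait(timeout=5)

stop_event.set()        # simulate the user disconnecting
worker.join(timeout=5)  # the loop notices the signal and returns
```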
Single Session at a Time
In the runtime, only one session can be active at a time. The session is decoupled from the media
stream: if a new video client connects while a session is running, the media stream switches to the
new user, but the session itself continues. The session only ends when explicitly terminated (via
API or when start_session() returns).
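The behaviour described above can be modelled with a small toy class. `ToyRuntime` and its methods are hypothetical and only mirror the rules stated in this section; they are not how the Reactor runtime is implemented.

```python
class ToyRuntime:
    """Hypothetical model of the single-session rule."""

    def __init__(self):
        self.session_active = False
        self.sessions_started = 0
        self.stream_target = None

    def connect(self, client_id):
        # The media stream always follows the most recent client...
        self.stream_target = client_id
        # ...but a new session starts only if none is running.
        if not self.session_active:
            self.session_active = True
            self.sessions_started += 1

    def terminate(self):
        # Explicit termination (or start_session() returning) ends the session.
        self.session_active = False
        self.stream_target = None


runtime = ToyRuntime()
runtime.connect("alice")
runtime.connect("bob")  # stream switches; the session keeps running
after_bob = (runtime.sessions_started, runtime.stream_target)

runtime.terminate()
runtime.connect("carol")  # now a fresh session starts
```

In this model, "bob" connecting changes who receives the stream but does not increment the session count; only the connection after `terminate()` starts a second session.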