- Stream video to any frontend at generation speed
- Accept real-time input from users while generating
- Load weights once, serve many users with zero cold start
- Auto-generate UI controls from your model’s command schema
How It Works
Your model loads its weights once, in __init__. When a user connects, start_session() runs your
generation loop and streams frames, while @command methods handle input on a separate thread. When
the user disconnects, session state resets and the model is immediately ready for the next user.
Weights stay loaded.
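
The lifecycle described above can be illustrated with a minimal, framework-free sketch. It uses only the Python standard library; ToyModel, set_prompt, run_one_user, and the (frames, stop) arguments to start_session are hypothetical stand-ins chosen for illustration, not the SDK's actual classes or signatures. Here the generation loop runs on a worker thread and the simulated user input arrives from the calling thread, which is one possible arrangement of the two threads.

```python
import queue
import threading
import time


class ToyModel:
    """Illustrative stand-in for a model class; not the framework's base class."""

    def __init__(self):
        # One-time setup. A real model would load its weights here.
        self.weights = object()
        self._lock = threading.Lock()
        self._reset_session()

    def _reset_session(self):
        # Per-user state only; the weights are deliberately left untouched.
        self.prompt = "default"

    def set_prompt(self, prompt):
        # Plays the role of an @command method: called from another thread
        # while the generation loop is running.
        with self._lock:
            self.prompt = prompt

    def start_session(self, frames, stop):
        # Generation loop: emit frames until the user disconnects.
        try:
            step = 0
            while not stop.is_set():
                with self._lock:
                    prompt = self.prompt
                frames.put((step, prompt))  # a "frame" is just a tuple here
                step += 1
                time.sleep(0.001)
        finally:
            # Session state resets on disconnect; weights stay loaded.
            self._reset_session()


def run_one_user(model, new_prompt):
    """Simulate one user: connect, send input mid-generation, disconnect."""
    frames = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(target=model.start_session, args=(frames, stop))
    worker.start()
    seen = [frames.get(timeout=5)]   # first frame uses the default prompt
    model.set_prompt(new_prompt)     # input arrives while frames are streaming
    while seen[-1][1] != new_prompt:
        seen.append(frames.get(timeout=5))
    stop.set()                       # user disconnects
    worker.join()
    return seen


model = ToyModel()
first_user = run_one_user(model, "a red fox")
second_user = run_one_user(model, "a snowy owl")
```

Both simulated users are served by the same ToyModel instance, so __init__ runs only once, and each session starts from the default prompt because the previous session's state was reset on disconnect.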
Your @command decorators define the input schema. Any frontend that speaks this schema can connect:
web apps, game engines, mobile apps. You define the API once.
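
To make the idea of a decorator-defined schema concrete, here is a rough sketch of how marked methods could be turned into a JSON description that any frontend can read. The command decorator and command_schema helper below are hypothetical, defined locally for illustration; the framework's real decorator and its schema format may look different.

```python
import inspect
import json


def command(func):
    """Hypothetical stand-in decorator: marks a method as user-callable."""
    func._is_command = True
    return func


def command_schema(cls):
    """Describe every marked method and its parameters as plain dicts."""
    schema = {}
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if not getattr(func, "_is_command", False):
            continue  # unmarked methods are not part of the public API
        params = {}
        for pname, p in inspect.signature(func).parameters.items():
            if pname == "self":
                continue
            entry = {"type": getattr(p.annotation, "__name__", str(p.annotation))}
            if p.default is not inspect.Parameter.empty:
                entry["default"] = p.default
            params[pname] = entry
        schema[name] = params
    return schema


class ToyModel:
    @command
    def set_prompt(self, prompt: str):
        ...

    @command
    def set_speed(self, speed: float = 1.0):
        ...

    def _internal_helper(self):
        ...


# Serialize the schema so a web app, game engine, or mobile app could read it.
print(json.dumps(command_schema(ToyModel), indent=2))
```

Because the schema is derived from the method signatures, adding a parameter or a new marked method changes what a client sees without any separate API definition. The same description is also enough to generate basic UI controls, for example a text field for a str parameter and a slider for a float.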
Quickstart
Install, run a model, and connect a frontend in 5 minutes.