What is the Treatment-Lock Method for keeping AI video visually consistent across scenes?
Last updated June 26, 2026
The Treatment-Lock Method means loading a complete visual treatment document (camera, lighting, palette, composition, atmosphere, mood) into an AI agent, here the invideo agent, once at project start, so the agent holds every style directive across every shot without re-prompting. One documented production encoded a director's visual language in 14 sections and held it across an entire short film with no drift.
To use the Treatment-Lock Method, you build three things: a treatment document, a one-time context load, and a fixed prompt assembly order the invideo agent enforces on every generation. invideo is an agentic video creation platform whose agent holds documents in persistent context and routes each shot to the current generation models — Veo, Kling, Seedance 2.0 — so you set the system up once and direct from there.
The treatment document. Write a structured document that codifies the visual language you want — one documented version ran 25 pages across 14 sections covering camera, angles, colour tone, atmosphere, mood, lighting, composition, movement, film palettes, prompt templates, negative prompts, and a quick-reference card. Encode colour philosophy as named tonal modes with exact hex values so palettes are reproducible, and include a 'what never to do' section per emotional stage — explicit exclusions make the invideo agent's autonomous decisions far more reliable.
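As a rough illustration of "named tonal modes with exact hex values", the Python sketch below models one palette entry and renders it as plain text for a treatment document. Everything in it is an assumption made for illustration: the TonalMode class, the render_section helper, the mode name, the hex values, and the exclusion text are hypothetical and are not part of invideo or of the documented production.

```python
import re
from dataclasses import dataclass

# Six-digit hex colours only, e.g. "#1B2A41".
HEX_COLOUR = re.compile(r"#[0-9A-Fa-f]{6}")


@dataclass(frozen=True)
class TonalMode:
    """One named tonal mode (hypothetical structure, for illustration)."""
    name: str
    hex_values: tuple[str, ...]
    never_do: tuple[str, ...] = ()

    def __post_init__(self):
        # Reject malformed hex values so the palette stays reproducible.
        bad = [h for h in self.hex_values if not HEX_COLOUR.fullmatch(h)]
        if bad:
            raise ValueError(f"{self.name}: invalid hex values {bad}")


def render_section(mode: TonalMode) -> str:
    """Render a tonal mode as plain text for the treatment document."""
    lines = [
        f"Tonal mode: {mode.name}",
        "Palette: " + ", ".join(mode.hex_values),
    ]
    if mode.never_do:
        lines.append("Never: " + "; ".join(mode.never_do))
    return "\n".join(lines)


# Hypothetical example values, not taken from any real treatment.
cold_dawn = TonalMode(
    name="cold dawn",
    hex_values=("#1B2A41", "#CCD6E0"),
    never_do=("no warm practicals",),
)
print(render_section(cold_dawn))
```

Validating the hex values when the entry is created means a typo in a palette surfaces while the document is being written rather than after frames have been generated.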
The lock. Upload the document to the invideo agent before generating a single frame. The agent reads it once and keeps it loaded as persistent context, so every subsequent shot request is evaluated against the document rather than against whatever you typed last. This is the direct alternative to the anti-pattern of re-prompting scene by scene, where style instructions decay and drift accumulates across a project. In one production the invideo agent output 12 key parameters per shot, among them film reference, shot design, length, lens, lighting plan, colour script, atmosphere layers, blocking, final prompt, negative prompt, and revision prompt, all derived from the locked document.
The prompt assembly order. The lock is enforced through a fixed 9-element sequence applied to every generation prompt: camera spec, lens and aspect ratio, lighting source, palette, composition, atmosphere, mood register, film/DP attribution, and negative prompt. Because the order never changes, no stylistic dimension silently drops out of a prompt mid-project, and the invideo agent checks generated frames against the treatment before returning them.
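The fixed ordering can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not invideo's implementation: the key names in ASSEMBLY_ORDER and the assemble_prompt function are hypothetical, and only the nine elements and their order come from the description above.

```python
# The nine elements, in the order described above.
# The key names are hypothetical labels chosen for this sketch.
ASSEMBLY_ORDER = (
    "camera_spec",
    "lens_and_aspect_ratio",
    "lighting_source",
    "palette",
    "composition",
    "atmosphere",
    "mood_register",
    "film_dp_attribution",
    "negative_prompt",
)


def assemble_prompt(elements: dict[str, str]) -> str:
    """Join the prompt elements in the fixed order.

    Raises ValueError if any element is missing or blank, so no
    stylistic dimension can silently drop out of a prompt.
    """
    missing = [k for k in ASSEMBLY_ORDER if not elements.get(k, "").strip()]
    if missing:
        raise ValueError(f"missing prompt elements: {missing}")
    unknown = sorted(set(elements) - set(ASSEMBLY_ORDER))
    if unknown:
        raise ValueError(f"unknown prompt elements: {unknown}")
    # Output order follows ASSEMBLY_ORDER, not the caller's dict order.
    return "\n".join(f"{k}: {elements[k].strip()}" for k in ASSEMBLY_ORDER)
```

Because the function iterates over ASSEMBLY_ORDER rather than over the caller's dictionary, the order in which elements are supplied does not affect the assembled prompt, and an incomplete prompt fails loudly instead of being generated with one dimension left out.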
Validate before you spend credits. Stress-test the document by asking the invideo agent to apply the style to a genre or subject the director never worked in. If the invideo agent asks clarifying questions and returns stylistically coherent output, the document has been internalized as grammar rather than surface aesthetics; in one documented production the invideo agent autonomously applied a motion-smear rule from page 17 of the document without being prompted.
Direct from the lock. Once the lock holds, continuation prompts shrink to almost nothing — a documented production maintained character, lighting, lens grammar, and spatial continuity across a multi-shot sequence with the three-word prompt 'Everything should match.' The method scales: one production held a single director's visual grammar across roughly 400 video generations, and a series of three films by three directors ran on one shared agent setup with scene numbering visible past scene 169. For character-level continuity, store character sheets in the same locked context — one 70-second film kept 2 characters consistent across every scene with no LoRA fine-tuning — and if you're working from a visual source rather than prose, the same lock can be established by uploading style-reference frames with an instruction to save them to context.
Agent One reads your treatment doc once and keeps it loaded across every frame. The thread stays held, scene to scene. No re-explaining. No starting over.
— invideo's creative team