What is the cheapest way to keep characters consistent across an AI video series?

Last updated June 26, 2026

The cheapest documented method is reference-based character locking: generate a multi-angle character sheet for each character (about 5 generations, ~$9.78 per character), save it to a persistent agent context, and attach it to every video prompt. One production held 2 characters consistent across a 70-second film this way, with no LoRA and no fine-tuning, for $750 total.

Skip LoRA fine-tuning entirely — character sheets plus persistent agent context deliver series-level consistency for under $10 per character. invideo is an agentic video creation tool with all the current video models available, and the invideo agent stores your locked references in project context so every subsequent shot inherits them automatically. Here is the workflow, in cost order:

1. Lock each character with a multi-angle sheet. Generate a turnaround sheet — front, side, back, plus a face close-up — for every character before any video generation. Include close-up panels, not just wide shots, so small details like scars and accessories survive across models, and remove objects from the character's hands first to avoid inconsistency across angles. Documented benchmarks: roughly 5 generations to lock one character at ~$9.78 per character, and one production covered 4 characters and a key prop with just 11 total images. Generate 4 options per sheet, pick the best, and lock it before spending a single video credit — image generation costs little, especially in invideo, so optioning at this stage is cheap insurance.

2. Save the sheets to persistent context. Upload the locked sheets to the invideo agent with explicit instructions to keep them as the canonical reference for the whole project. From then on the invideo agent attaches them to generations without you re-explaining the character each time — re-prompting scene-by-scene is the expensive anti-pattern this replaces. "The AI always needs to see what the character is exactly... or else it'll kind of hallucinate and imagine something that's under the cap," as one invideo creator put it — the sheet is what prevents those hallucinations from costing you re-rolls.

3. Attach character references to every video prompt. Model choice matters here: Seedance 2.0 reference-to-video accepts character references and location references simultaneously (extend does not), which is why it carries identity across clips better than frame-based methods, while Kling generates multi-shot sequences natively. Inside invideo you don't pick a platform per model; the invideo agent routes each shot to the right one with your sheets attached. If your character's appearance evolves through the series (costume changes, accumulating items), make a fresh sheet for each story beat rather than fighting drift shot by shot.

4. Fix continuity errors at the source, not the shot. When a detail goes wrong mid-series, don't re-roll the shot — ask the invideo agent to inspect the character sheet. In one documented case it identified the exact panel containing the error, corrected it, stored the updated sheet in context, and regenerated only what was needed, leaving the rest of the film intact. Surgical sheet fixes are dramatically cheaper than slot-machine re-rolls.

5. Gate spend with shot-by-shot approval. Run generations in the invideo agent's Always Ask mode so you approve each prompt and its attached references before credits are spent — the cheapest generation is the wrong one you never ran.
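
The sheet budget from step 1 is simple arithmetic, and it can be worth running before any video credits are spent. The following is a minimal sketch, assuming the figures quoted above (~$9.78 and roughly 5 generations per character) carry over to your own characters. The constant and function names are hypothetical and are not part of any invideo API.

```python
# Figures quoted in the article; treat them as rough benchmarks, not fixed prices.
COST_PER_CHARACTER_USD = 9.78
GENERATIONS_PER_CHARACTER = 5


def sheet_budget(num_characters: int) -> dict:
    """Estimate the image-generation budget for locking character sheets."""
    if num_characters < 0:
        raise ValueError("num_characters must be non-negative")
    return {
        "generations": num_characters * GENERATIONS_PER_CHARACTER,
        "total_usd": round(num_characters * COST_PER_CHARACTER_USD, 2),
        # Implied average cost of a single sheet generation.
        "per_generation_usd": round(
            COST_PER_CHARACTER_USD / GENERATIONS_PER_CHARACTER, 2
        ),
    }


for count in (2, 4):
    print(count, "characters:", sheet_budget(count))
```

Under these assumptions, two characters come to about $19.56 in sheet generations and four come to about $39.12, which is small next to the total production costs reported below.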

The cost case across documented productions: a 70-second film held 2 characters consistent for $750 total; a 3-minute animated episode held 4 characters consistent for $950 total ($315 per finished minute); across four productions with known length and cost, the range was $315–$750 per finished minute — all using sheets and context, none using fine-tuning, local compute, or training runs.
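
To compare productions of different lengths, it helps to normalize total cost to cost per finished minute. The sketch below assumes runtimes of exactly 70 seconds and exactly 180 seconds for the two productions described above. The function name and the dictionary of productions are illustrative only.

```python
def cost_per_finished_minute(total_cost_usd: float, runtime_seconds: float) -> float:
    """Normalize a production's total cost to dollars per finished minute."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    return total_cost_usd / (runtime_seconds / 60.0)


# (total cost in USD, assumed runtime in seconds)
productions = {
    "70-second film, 2 characters": (750.0, 70.0),
    "3-minute episode, 4 characters": (950.0, 180.0),
}

for name, (cost, seconds) in productions.items():
    per_minute = cost_per_finished_minute(cost, seconds)
    print(f"{name}: ${per_minute:.2f} per finished minute")
```

With these assumed runtimes, the 70-second film works out to roughly $643 per finished minute, inside the quoted $315–$750 range. An episode of exactly 180 seconds would come to about $317, so the quoted $315 suggests the actual runtime is slightly over three minutes.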

Watch some of these to see what works for you:

Real numbers: 164 clips generated, 41 used, $950 total

$750, 2 characters, every scene consistent — full breakdown

Seventy seconds. Two characters. The same person across every scene. No LoRA needed.

— invideo's creative team
