How do you create consistent AI video characters with the same face and outfit across scenes?
Last updated June 26, 2026
Lock the character before you generate a single video clip. Build a multi-angle character sheet (front, 3/4, side, back, plus a face close-up), lock the outfit in the same sheet with no held props, write a fixed descriptor glossary you reuse verbatim in every prompt, then attach the sheet as a reference to every shot.
Start in the invideo agent, an agentic video tool that holds project context and routes each shot to whichever model fits (Recraft, Nano Banana, or GPT-Image-2 for stills; Seedance 2.0, Kling, or Veo for video), so the character sheet you build once travels with every generation.
1. Generate the character portrait first. Use Recraft at 4K for the face; it renders pores, lines, and stubble that read as a real person rather than plastic. Generate four options per character, pick one, and lock it before moving on.
2. Build a multi-angle character sheet from that portrait. Hand the locked portrait to Nano Banana (Pro where available) and generate a 360° turnaround at 4K: front, 3/4, side profile, back, plus a face close-up and a mid-angle. Remove every held prop before generating — objects in hands break consistency across angles. Include close-up panels for small details (scars, jewelry, stitching) so the model has them to look at, not infer. Generate four sheet variations and pick one. In one documented production, eleven images total covered four characters and one prop using exactly this approach.
3. Lock the outfit on the same sheet. The character sheet IS the costume sheet — same garment, same color, same layering, shown from every angle. If the character changes outfit or accumulates accessories across scenes, build a separate sheet per beat (one production made a new sheet for each city its character passed through because a trinket was added each time). Don't try to describe outfit changes in prompts; show them in sheets.
4. Write a locked descriptor glossary and reuse it verbatim. Pin a short paragraph for each character (hair with exact color and cut, eyes, skin, build, distinguishing marks, and a garment-by-garment outfit description with colors) and paste that identical block into every shot prompt. Drift starts where wording drifts. Add a negative-prompt line too: "no live-action, no photorealistic" if you're animating, and "no outfit changes, no accessories not on sheet" in every case.
5. Attach the sheet to every video generation. In the invideo agent, set shot-by-shot approval mode and attach the character sheet plus the descriptor block to every prompt — every prompt, no exceptions. For multi-shot continuity, Seedance 2.0 reference-to-video carries character context across clips by ingesting the full prior clip plus the character sheet; Kling 3.0 accepts up to four reference elements for multi-subject scenes; the invideo agent picks the model that fits the shot. All these models are available inside invideo, so you build the sheet once and route it everywhere.
6. Know where drift shows up and audit for it. Faces drift first at the edges — hairline, jaw, ears. Outfits drift next at stitching, layering, and small accessories. The overall silhouette holds longest. After each generation, scan those zones specifically. If something's off, don't re-roll the shot — ask the invideo agent to inspect the character sheet, identify which panel has the error (it can locate the exact one), correct it there, and re-attach the fixed sheet. The fix propagates; the rest of the film stays intact.
7. For multi-character scenes, give the cast visual contrast. Different silhouettes, different palettes, different hair shapes — models confuse characters that look similar. Build a separate sheet per character and attach both (or all) to multi-character shots.
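Steps 4, 5, and 7 come down to one rule: every shot prompt carries the same descriptor text and the same sheet for each character in frame. The Python sketch below shows one way to enforce that in a script of your own. It is not the invideo agent's API. The `Character` class, the `build_shot` function, the file path, and the sample descriptor are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """A locked character: one sheet file plus one verbatim descriptor block."""
    name: str
    sheet_path: str   # hypothetical path to the character sheet image
    descriptor: str   # the locked glossary paragraph, never edited per shot


# Negative-prompt line reused on every shot (wording taken from step 4).
NEGATIVE = "no outfit changes, no accessories not on sheet"


def build_shot(action: str, cast: list[Character]) -> dict:
    """Assemble one shot: action text, every descriptor, every sheet."""
    if not cast:
        raise ValueError("every shot needs at least one locked character")
    # Descriptors are pasted unchanged so wording cannot drift between shots.
    blocks = [f"{c.name}: {c.descriptor}" for c in cast]
    prompt = "\n".join([action, *blocks, f"Negative: {NEGATIVE}"])
    # One sheet per character in frame, attached to every generation.
    return {"prompt": prompt, "references": [c.sheet_path for c in cast]}


# Illustrative character; the descriptor text is a made-up example.
mara = Character(
    name="Mara",
    sheet_path="sheets/mara_v1.png",
    descriptor="auburn chin-length bob, green eyes, navy wool coat, grey scarf",
)
shot = build_shot("Mara crosses a rainy street at night.", [mara])
```

Because `Character` is frozen, the descriptor cannot be edited per shot by accident. Changing the look means creating a new `Character` with a new sheet, which mirrors the advice to build a separate sheet per beat.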
Hridaye, invideo's creative director, on what the sheet does for you across a film: "Seventy seconds. Two characters. The same person across every scene. No LoRA needed." That production ran on character sheets and persistent context inside the invideo agent — no fine-tuning, ~$750 total, two days.
Expect roughly 5 generations to lock one character (~$9.78 in credits in one tracked run), and budget ~3 generations per usable shot once the sheet is locked. The locking pass is where consistency is won or lost: spend generously there and you save on every shot downstream.
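The arithmetic behind that budget can be sketched in a few lines of Python. The three constants come from the figures above. The function name and the example inputs (2 characters, 20 shots) are assumptions for illustration. Only the locking pass is priced, because the source gives no per-generation cost for video shots.

```python
LOCK_GENERATIONS_PER_CHARACTER = 5   # roughly 5 generations to lock one character
GENERATIONS_PER_USABLE_SHOT = 3      # about 3 generations per usable shot
LOCK_COST_PER_CHARACTER_USD = 9.78   # credits spent in one tracked run


def estimate_budget(characters: int, shots: int) -> dict:
    """Rough generation count for a project, plus the locking-pass cost."""
    lock_gens = characters * LOCK_GENERATIONS_PER_CHARACTER
    shot_gens = shots * GENERATIONS_PER_USABLE_SHOT
    return {
        "lock_generations": lock_gens,
        "shot_generations": shot_gens,
        "total_generations": lock_gens + shot_gens,
        # Video generations are not priced here: no per-shot cost is given.
        "lock_cost_usd": round(characters * LOCK_COST_PER_CHARACTER_USD, 2),
    }


# Hypothetical project: 2 characters, 20 usable shots.
plan = estimate_budget(characters=2, shots=20)
```

Treat the result as a planning floor, not a quote: the per-character figure comes from a single tracked run, and difficult shots can take more than three attempts.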