What do you need to decide before generating AI video to prevent character drift across scenes?
Last updated June 26, 2026
Before generating AI video, lock five decisions: your character's exact appearance as a multi-angle character sheet, a visual reference for every recurring entity, prop design, whether the character's look changes mid-film, and your production order (frames first, then video). One 70-second production kept 2 characters consistent across every scene this way, without LoRA fine-tuning.
Lock each of these decisions as a selected, approved reference before spending a single video credit — every later shot then inherits the same anchors, which is what prevents drift. invideo is an agentic video creation tool with all the current image and video models available, and the invideo agent holds these locked references in persistent context across your whole project.
1. Your character's exact appearance — locked as a character sheet. Generate a multi-angle turnaround — front, side, back, plus face and mid-angle close-ups. Close-up panels matter: they carry small details like scars and accessories across shots and models. Remove objects from the character's hands before generating turnarounds; held items create inconsistency across angles. Generate several options and select one before moving on: one documented production generated 4 variations of every character sheet and environment reference and locked the best before any video generation; another locked each character in roughly 5 generations at ~$9.78 per character. A 70-second short film held 2 characters visually consistent across every scene using character sheets and agent context alone — no LoRA fine-tuning.
2. A reference for everyone and everything that recurs, not just the lead. Antagonists, secondary characters, and creatures need the same lock as the protagonist: anything the model can't see on a sheet, it invents, and it invents it differently in every scene. In one documented session the invideo agent itself refused to build assets until four questions were answered: what the character looks like, what the antagonist's reference is, what the prop is, and what the deliverable format is. These are the four things that affect every frame.
3. Prop design. Decide what each story-critical prop looks like before generation and give it its own reference pass; one production's full reference set was just 11 images covering 4 characters and 1 prop. If a prop reads as lifeless, iterate on it using story logic the way you would a character, because an unconvincing prop breaks the believability of the character holding it.
4. Whether the character's appearance changes mid-film. If costume or accessories evolve, decide the beats upfront and create a distinct character sheet per beat — one production needed a separate sheet for every sequence because the character picked up a new trinket in each location. Without per-beat sheets, the model averages the looks and drifts.
5. Your production order: frames first, then video. Approve static frames to final quality before generating any motion, so appearance errors are caught and corrected in low-cost stills rather than repeated across video clips. At the video stage, a model that accepts character references directly (for example, Seedance 2.0 reference-to-video) carries the lock into motion.
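The five decisions above can be treated as a pre-flight checklist. The sketch below is one way to model that gate in plain Python, purely as an illustration: `Reference`, `Plan`, `missing_locks`, and `ready_for_video` are hypothetical names, the character and file names are made up, and none of this is an invideo API or data format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Reference:
    """One recurring entity and the sheet selected for it (hypothetical model)."""
    name: str
    kind: str                             # e.g. "character", "antagonist", "prop"
    approved_sheet: Optional[str] = None  # file of the chosen option; None = not locked


@dataclass
class Plan:
    """Hypothetical pre-production plan; not an invideo data structure."""
    references: list = field(default_factory=list)
    # Look changes decided upfront: (character, beat) -> sheet file, or None if missing.
    look_beats: dict = field(default_factory=dict)
    frames_approved: bool = False         # static frames signed off before any motion


def missing_locks(plan: Plan) -> list:
    """Return a readable list of decisions that are still open."""
    problems = []
    if not any(ref.kind == "character" for ref in plan.references):
        problems.append("no character sheet defined")
    for ref in plan.references:
        if ref.approved_sheet is None:
            problems.append(f"{ref.kind} '{ref.name}' has no approved reference")
    for (character, beat), sheet in plan.look_beats.items():
        if sheet is None:
            problems.append(f"'{character}' needs a sheet for beat '{beat}'")
    if not plan.frames_approved:
        problems.append("static frames not approved yet")
    return problems


def ready_for_video(plan: Plan) -> bool:
    """True only when every decision is locked, i.e. nothing is missing."""
    return not missing_locks(plan)


# Example with made-up names: the prop and one look change are still open.
plan = Plan(
    references=[
        Reference("Mara", "character", "mara_sheet.png"),
        Reference("Lantern", "prop"),
    ],
    look_beats={("Mara", "market"): None},
)
for problem in missing_locks(plan):
    print(problem)
```

The point of the gate is ordering: video generation is attempted only once `ready_for_video` returns `True`, so every open decision surfaces while it is still cheap to fix.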
Finally, decide where these decisions live: store every locked sheet and reference in the invideo agent's persistent context rather than re-pasting them scene by scene — re-prompting per scene is the anti-pattern that produces drift. The lock also pays off mid-production: if a continuity error appears, fix it once at the source in the character sheet and every subsequent shot inherits the correction.
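The "fix it once at the source" idea is easiest to see when shots point at a reference by ID instead of carrying their own pasted copy. The sketch below illustrates that principle only; it does not describe how the invideo agent implements persistent context, and `ReferenceStore`, `Shot`, `build_request`, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


class ReferenceStore:
    """Single source of truth: reference ID -> currently locked sheet file."""

    def __init__(self):
        self._sheets = {}

    def lock(self, ref_id: str, sheet_path: str) -> None:
        # Locking again under the same ID replaces the sheet for all later lookups.
        self._sheets[ref_id] = sheet_path

    def resolve(self, ref_id: str) -> str:
        if ref_id not in self._sheets:
            raise KeyError(f"reference '{ref_id}' is not locked yet")
        return self._sheets[ref_id]


@dataclass
class Shot:
    description: str
    ref_ids: list = field(default_factory=list)  # IDs only, never pasted copies


def build_request(shot: Shot, store: ReferenceStore) -> dict:
    """Assemble a generation request, looking sheets up at build time."""
    return {
        "prompt": shot.description,
        "references": [store.resolve(ref_id) for ref_id in shot.ref_ids],
    }


store = ReferenceStore()
store.lock("hero", "hero_sheet_v1.png")

shots = [
    Shot("Hero enters the market", ["hero"]),
    Shot("Hero lifts the lantern", ["hero"]),
]

# A continuity error is found: correct the sheet once, at the source.
store.lock("hero", "hero_sheet_v2.png")

# Every request built from here on resolves to the corrected sheet.
for shot in shots:
    print(build_request(shot, store))
```

By contrast, re-pasting a description into each scene's prompt creates one copy per scene, and a correction then has to be repeated in every copy.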
The AI always needs to see what the character is exactly, right? Or else it'll kind of hallucinate and imagine something that's under the cap. So, we don't want to do that. We always want the character to be seen as we see it on the character sheet.
— invideo's creative team