How do you lock characters, costumes, and world visuals in AI film pre-production in one day?
Last updated June 26, 2026
Lock characters, costumes, and world visuals in one day by initializing a creative producer agent with your full script, answering four foundational questions before any generation, then running casting, costume, and world sub-agents in parallel lanes: generate 4 options per asset, select one, and lock it into the agent's context. A 3-person team completed all four deliverables (cast, costumes, look-and-feel, and world images) this way in a single day.
Start by initializing a creative producer agent with your full script, shot breakdown, and character details — this agent holds the vision and grounds every sub-agent you spin up afterward. invideo is an agentic video creation tool with all the current image and video models available, so every lane below runs in the same workspace. Before generating a single asset, have the invideo agent surface the foundational answers that change every frame: what each character looks like, what the antagonist or entity references, what the key props are, and what format you're delivering in. If you already have a visual treatment document, load it here once — the invideo agent holds it across every subsequent generation.
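The four foundational answers can be tracked as a simple checklist before any generation starts. The sketch below is a hypothetical planning aid, not part of invideo: the `ProductionBrief` class, its field names, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProductionBrief:
    """Hypothetical container for the four foundational answers."""
    character_looks: dict[str, str] = field(default_factory=dict)
    antagonist_reference: str = ""
    key_props: list[str] = field(default_factory=list)
    delivery_format: str = ""

    def missing(self) -> list[str]:
        """Return the foundational answers that are still empty."""
        gaps = []
        if not self.character_looks:
            gaps.append("character looks")
        if not self.antagonist_reference:
            gaps.append("antagonist reference")
        if not self.key_props:
            gaps.append("key props")
        if not self.delivery_format:
            gaps.append("delivery format")
        return gaps


# Sample values are invented for the example.
brief = ProductionBrief(
    character_looks={"lead": "weathered face, grey stubble"},
    delivery_format="16:9",
)
print(brief.missing())
```

An empty list from `missing()` is the signal that the brief is complete enough to hand to the sub-agents.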
Run casting, costume, and world as parallel lanes, not a sequence. Spin up a casting agent, a costume designer agent, and a production designer agent on separate project pages so feedback to one never contaminates another. Parallelism is what compresses pre-production into a day: one director ran 6 agents simultaneously on a single film, and a 3-person team working this way finished cast, costumes, look-and-feel, and world images in a single day.
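As a rough illustration of lanes running in parallel with isolated feedback, the sketch below uses Python's standard `concurrent.futures`. The `run_lane` function is a placeholder assumption standing in for whatever work happens on each project page; it does not call any invideo API.

```python
from concurrent.futures import ThreadPoolExecutor

LANES = ["casting", "costume", "world"]


def run_lane(lane: str, script_summary: str) -> dict:
    """Placeholder for one lane's work (hypothetical, not an invideo call)."""
    # Each lane gets its own notes list, so feedback given to one lane
    # cannot leak into another.
    return {"lane": lane, "context": script_summary, "notes": []}


def run_all_lanes(script_summary: str) -> dict[str, dict]:
    """Start every lane at once and collect the results by lane name."""
    with ThreadPoolExecutor(max_workers=len(LANES)) as pool:
        results = pool.map(lambda lane: run_lane(lane, script_summary), LANES)
        return {result["lane"]: result for result in results}


lanes = run_all_lanes("full script and shot breakdown")
lanes["casting"]["notes"].append("older, more tired eyes")
print(lanes["costume"]["notes"])  # costume lane is unaffected by casting feedback
```

Every lane receives the same shared context from the creative producer step, but keeps its own notes, which mirrors the separate-project-pages setup described above.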
Casting lane — lock faces and bodies first. Generate portraits in Recraft, which renders pores, lines, and stubble that make a face read as an actual face, then run the same character prompt on two image models simultaneously — Nano Banana Pro and GPT-Image-2, both inside invideo — and pick the aesthetic you prefer rather than iterating one model serially. Build a multi-angle character sheet (front, side, back, plus face and mid-angle close-ups); remove objects from characters' hands before generating turnarounds, and include close-up panels so small details like scars and accessories survive across models. Budget around 5 generations to lock one character — roughly $9.78 per character in one documented production using Seedance 2.0 turnarounds — and expect the totals to stay small: another production covered 4 characters and a hero prop with 11 reference images.
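For planning, the quoted figures can be turned into a quick estimate. This is a back-of-the-envelope sketch that assumes the roughly $9.78 and 5 generations per character from one documented production carry over to your own film, which may not hold for other models or pricing.

```python
# Figures quoted in the text for one documented production; treat them as
# planning placeholders, not a price list.
COST_PER_CHARACTER_USD = 9.78
GENERATIONS_PER_CHARACTER = 5


def casting_budget(num_characters: int) -> dict:
    """Estimate generations and spend needed to lock a cast of a given size."""
    return {
        "generations": num_characters * GENERATIONS_PER_CHARACTER,
        "cost_usd": round(num_characters * COST_PER_CHARACTER_USD, 2),
    }


print(casting_budget(4))
```

Under those assumptions, a four-character cast works out to about 20 generations and just under $40.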
Costume lane — direct by mood when you lack a spec. Give the costume designer agent the emotional feel of the character instead of exact garment descriptions and it returns multiple concrete options to choose from; one director without a clear costume description for a character got several viable options in a single pass, and another left the invideo agent generating seven costume variations while stepping away. If an option feels unexpectedly bold, treat that as a signal to lock it rather than revise it.
World lane — batch references, then convert grids into anchors. Feed the production designer agent references batched by theme — spatial logic in one batch, screen function in another, color theory in a third — and state explicitly what to take and what to ignore from each batch. Generate image grids rather than single frames (one production requested 3 grids per round), iterate on the grids you like, then extract the best individual panels: those extracted panels replace your original references and become the continuity anchors for every scene generation that follows. Once you lock one world element, the invideo agent extracts the remaining angles — wide, close, side — without being asked, and it can scout real-world landmark images from the internet when you need location plates. If your film matches an existing aesthetic, upload a large batch of style frames in one message — one production used 64 frames — with an explicit instruction to save the style to context for all further generations.
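One way to keep the take-and-ignore instruction explicit for each batch is to write it down alongside the files before sending anything. The sketch below is a hypothetical note-keeping structure; `ReferenceBatch`, its fields, and the file names are assumptions for illustration and not an invideo format.

```python
from dataclasses import dataclass


@dataclass
class ReferenceBatch:
    """One themed batch of references plus what to take and ignore from it."""
    theme: str
    files: list[str]
    take: str
    ignore: str

    def instruction(self) -> str:
        """Build the plain-language note that accompanies the batch."""
        return (
            f"Batch: {self.theme} ({len(self.files)} images). "
            f"Take: {self.take}. Ignore: {self.ignore}."
        )


# File names and notes are invented for the example.
batches = [
    ReferenceBatch("spatial logic", ["corridor.png", "atrium.png"],
                   "room layout and scale", "color and lighting"),
    ReferenceBatch("color theory", ["dusk.png"],
                   "palette", "composition"),
]
for batch in batches:
    print(batch.instruction())
```

Writing the note per batch makes it harder to send a pile of mixed references with no stated intent.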
The lock gate — 4 options per asset, select, lock, then stop. For every character sheet and environment reference, generate 4 variations, pick the best, and lock it into the invideo agent's context before any video generation begins; frames locked first, motion second, is the order that prevents consistency problems across the rest of the film. This is what replaces fine-tuning: one 70-second production kept 2 characters visually identical across every scene with no LoRA, using only locked character sheets and agent context. If your day runs out before everything is approved, the invideo agent can continue generating options overnight and present them for selection the next morning.
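The gate itself is simple bookkeeping: four options in, one selection locked, and no video work until every asset has a locked choice. The sketch below models that rule under stated assumptions; `LockGate` and the asset names are hypothetical and only track decisions, they do not generate anything.

```python
OPTIONS_PER_ASSET = 4  # the "4 options per asset" rule from the text


class LockGate:
    """Tracks which assets have a locked selection (hypothetical helper)."""

    def __init__(self, asset_names: list[str]):
        self.options: dict[str, list[str]] = {name: [] for name in asset_names}
        self.locked: dict[str, str] = {}

    def propose(self, asset: str, options: list[str]) -> None:
        """Record one round of options; exactly four are required."""
        if len(options) != OPTIONS_PER_ASSET:
            raise ValueError(f"expected {OPTIONS_PER_ASSET} options, got {len(options)}")
        self.options[asset] = list(options)

    def lock(self, asset: str, choice: str) -> None:
        """Lock one of the proposed options for an asset."""
        if choice not in self.options[asset]:
            raise ValueError(f"{choice!r} was not proposed for {asset!r}")
        self.locked[asset] = choice

    def ready_for_video(self) -> bool:
        """True only when every asset has a locked selection."""
        return set(self.locked) == set(self.options)


gate = LockGate(["hero_sheet", "street_environment"])
gate.propose("hero_sheet", ["v1", "v2", "v3", "v4"])
gate.lock("hero_sheet", "v3")
print(gate.ready_for_video())  # the environment is still unlocked
```

Checking `ready_for_video()` before starting motion work enforces the frames-first, motion-second order described above.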
Watch some of these to see what works for you:
To really set up the context for the agent, I normally start off with the creative producer agent. That's where I'll give the script, or the shot breakdown, along with the characters. That's the main agent that sort of holds the understanding and the vision of the entire film.
— invideo's creative team