How do you compare multiple AI image models side by side to find the best look for a character?
Last updated June 26, 2026
Run the same character prompt through 2–3 image models in parallel inside one casting sub-agent, generate a 4-option grid per model, compare on identity, skin realism and style fit, then re-test the winner with reference images to confirm the look holds. Lock that model's best panel as the character sheet anchor before any video work.
Start by spinning up a casting sub-agent inside the invideo agent — invideo holds the full image-model roster (Recraft, Nano Banana, GPT-Image-2) plus the video models you'll route to later, so you compare without leaving one tool. Give the casting agent the character's written description, the reference batch (films, photos, palettes), and an explicit instruction to run the identical prompt on two or three models simultaneously rather than sequentially. In one documented production, the casting agent was instructed to run the same character prompt on two image models in parallel and the team picked the preferred aesthetic before building character sheets — that parallel pass is what compresses casting iteration.
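The parallel pass described above can be sketched in a few lines of Python. This is an illustration only: the article does not document a programmatic interface, so `generate_grid` is a hypothetical stand-in that returns fake panel IDs, and the character description is a made-up example. Only the model names come from the text.

```python
from concurrent.futures import ThreadPoolExecutor

# Model names are taken from the article; everything else here is hypothetical.
MODELS = ["Recraft", "Nano Banana", "GPT-Image-2"]

def generate_grid(model: str, prompt: str, options: int = 4) -> list[str]:
    """Placeholder for a real image-generation call: returns fake panel IDs."""
    return [f"{model}/panel-{i + 1}" for i in range(options)]

def run_casting_pass(prompt: str, models: list[str]) -> dict[str, list[str]]:
    """Submit the identical prompt to every model at once, then collect grids."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m: pool.submit(generate_grid, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}

# Example character description (invented for illustration).
grids = run_casting_pass("weathered fisherman, 60s, grey stubble", MODELS)
```

All requests are submitted before any result is awaited, so total wait time is roughly that of the slowest model rather than the sum of all three.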
Hold the variables constant across models so the comparison is real. Use the same prompt text, the same aspect ratio (whatever your film uses) and the same framing brief (head-to-toe and headshot), and request a 4-option grid per model rather than single images. Image generation is cheap inside invideo, and grids give every model a fair shot at its best output. Across documented productions, four options per asset is the standard locking unit: generate four, pick one, move on. For 4 characters and 1 prop in one production this came to 11 reference images total: enough to cast, not so many that selection stalls.
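One way to keep the variables constant is to define the brief once as an immutable object and derive every per-model request from it. The sketch below assumes a hypothetical request shape (`CastingBrief`, `build_requests` and the `16:9` ratio are illustrative names and values, not part of any documented tool).

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)  # frozen: the brief cannot be changed between models
class CastingBrief:
    prompt: str
    aspect_ratio: str
    framing: tuple[str, ...]
    options_per_model: int = 4  # the 4-option grid from the text

def build_requests(brief: CastingBrief, models: list[str]) -> list[dict]:
    """Produce one request per model; only the 'model' field differs."""
    return [{"model": m, **asdict(brief)} for m in models]

brief = CastingBrief(
    prompt="weathered fisherman, 60s, grey stubble",  # invented example
    aspect_ratio="16:9",                              # assumed film ratio
    framing=("head-to-toe", "headshot"),
)
model_requests = build_requests(brief, ["Recraft", "Nano Banana"])
total_panels = len(model_requests) * brief.options_per_model
```

With two models and four options each, the review set is eight panels, which is the quantity to watch as you add models.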
Review the grids on three axes, in this order. First, identity clarity: does the face read as the same person across panels and angles? Second, skin and texture realism. Recraft is specifically chosen for portraits because it renders the pores, lines and stubble that make a face read as an actual face, while Nano Banana variants lean cleaner and more stylized; Nano Banana Pro has stronger prompt adherence but can drift toward a stock-photo feel. Third, style fit against your film's reference batch. Tell the agent what to take from the references and what to ignore, because exclusion prompting is as load-bearing as inclusion.
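If you record the review as scores, the ordering of the three axes can be encoded as a sort key. The sketch below treats "in this order" as strict priority (identity first, then skin realism, then style fit), which is an interpretation rather than something the text spells out. The 1 to 5 scale, the panel labels and the scores are all invented for illustration.

```python
from typing import NamedTuple

class PanelScore(NamedTuple):
    panel: str
    identity: int       # 1-5, reviewer's judgment (assumed scale)
    skin_realism: int   # 1-5
    style_fit: int      # 1-5

def rank_panels(scores: list[PanelScore]) -> list[PanelScore]:
    """Sort best-first; later axes only break ties on earlier ones."""
    return sorted(
        scores,
        key=lambda s: (s.identity, s.skin_realism, s.style_fit),
        reverse=True,
    )

# Made-up review scores for three panels.
scores = [
    PanelScore("A", identity=4, skin_realism=5, style_fit=3),
    PanelScore("B", identity=5, skin_realism=3, style_fit=3),
    PanelScore("C", identity=5, skin_realism=3, style_fit=4),
]
ranked = rank_panels(scores)
```

Here panel A has the best skin realism but still ranks last, because it loses on identity, the first axis.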
De-bias the pick with a blind pass. Ask the casting agent to lay out the best panel from each model side by side without labels, and rank them on identity and style fit before you check which model produced which. As Hridaye, invideo's creative director, puts it: "If you feel like it's too off, then it means we should lock it in." Unexpectedness is signal, not noise, when a panel reads as a person rather than a type.
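The blind pass amounts to shuffling the candidates, replacing model names with neutral labels, and keeping the label-to-model key hidden until the ranking is done. A minimal sketch, with hypothetical file names and a hypothetical `make_blind_lineup` helper:

```python
import random

def make_blind_lineup(best_panels: dict[str, str], seed=None):
    """best_panels maps model name -> its best panel (e.g. a file path).

    Returns (lineup, key): lineup maps a neutral label to a panel for the
    reviewer; key maps the same label back to the model and stays hidden
    until the ranking is finished.
    """
    rng = random.Random(seed)
    entries = list(best_panels.items())
    rng.shuffle(entries)  # random order so position gives nothing away
    labels = [f"Option {chr(ord('A') + i)}" for i in range(len(entries))]
    lineup = {label: panel for label, (_, panel) in zip(labels, entries)}
    key = {label: model for label, (model, _) in zip(labels, entries)}
    return lineup, key

# Placeholder file names; strip model names from real files before review.
best_panels = {
    "Recraft": "img_0412.png",
    "Nano Banana": "img_0977.png",
    "GPT-Image-2": "img_0153.png",
}
lineup, key = make_blind_lineup(best_panels)
```

Rank the entries in `lineup` first, then look up the winner's label in `key` to see which model produced it.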
Then run the reference-anchor test on the winner. Re-prompt the winning model with the selected panel as an input reference and ask for three new angles (three-quarter, profile, back) plus a close-up. If identity holds — same face shape, same eye spacing, same skin character at distance and in close-up — that's your character sheet model. If it drifts, drop back to the runner-up and repeat. Remove any objects from the character's hands before the multi-angle pass; props in hand are a known source of turnaround inconsistency.
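The fallback logic of the reference-anchor test can be written as a simple loop: try the winner on every angle, and move to the runner-up only if identity drifts on any of them. In this sketch `identity_holds` stands in for the human or agent judgment, and the model names and drift results are invented.

```python
# The three new angles plus the close-up named in the text.
ANGLES = ["three-quarter", "profile", "back", "close-up"]

def pick_sheet_model(ranked_models, identity_holds):
    """Return the first model whose identity holds on every angle.

    ranked_models: model names, winner first, runner-up second, and so on.
    identity_holds: callable (model, angle) -> bool, supplied by the reviewer.
    Returns None if every candidate drifts.
    """
    for model in ranked_models:
        if all(identity_holds(model, angle) for angle in ANGLES):
            return model
    return None

# Made-up review outcome: the first-ranked model drifts in profile.
observed_drift = {("model_a", "profile")}

def identity_holds(model: str, angle: str) -> bool:
    return (model, angle) not in observed_drift

chosen = pick_sheet_model(["model_a", "model_b"], identity_holds)
```

A `None` result means no candidate survived the test, which is a cue to revisit the prompt or references rather than force a pick.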
Lock and propagate. Generate a 4-angle 360 turnaround at 4K plus face and mid-angle close-ups on the chosen model, save it to the invideo agent's context as the canonical character sheet, and tell every downstream agent — DOP agent, storyboard agent — to attach it to every prompt. Across a 70-second short film with 2 characters, this exact pattern (sheets + persistent agent context, no LoRA) held identity across every scene. Once the model is locked and the sheet is canon, fine-tuning becomes optional rather than required — and if you do want a LoRA later, you now have a coherent, model-consistent dataset to train on.
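In the workflow above the sheet lives in the agent's context, not in code, but the rule "attach it to every prompt" is easy to picture as a wrapper that no downstream request can bypass. The payload shape, the file names and the shot prompts below are hypothetical, and "Recraft" is only an example of a locked model.

```python
# Hypothetical record of the locked character sheet.
CHARACTER_SHEET = {
    "model": "Recraft",  # example only: whichever model won your comparison
    "images": ["turnaround_4k.png", "face_closeup.png", "mid_closeup.png"],
}

def attach_sheet(prompt: str, sheet: dict) -> dict:
    """Build a request that always carries the canonical sheet images."""
    return {"prompt": prompt, "reference_images": list(sheet["images"])}

# Invented shot prompts standing in for DOP or storyboard agent output.
shot_prompts = ["wide shot, harbour at dawn", "close-up, hands on rope"]
payloads = [attach_sheet(p, CHARACTER_SHEET) for p in shot_prompts]
```

Routing every prompt through one function keeps the sheet from being dropped on a single shot, which is where identity drift tends to creep in.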
One practical note on the comparison itself: cap it at 3 models. Beyond that, reviewing the grids, not making the casting decision, becomes the bottleneck.