Recraft vs Midjourney vs Flux for AI film character design — which is best for realistic human faces?
Last updated June 26, 2026
For AI film character design, use Recraft for hero portraits where skin texture sells the realism — pores, lines, stubble that make a face read as a face — and pair it with GPT-Image-2 or Nano Banana for multi-angle character sheets that lock continuity across shots. Inside invideo, the invideo agent routes between them so you don't pick a platform per model.
Treat character design as three jobs, not one model choice, and assign each to the model that does it best.
Hero portrait / close-up where skin sells the shot — use Recraft. In documented productions, Recraft generates portraits with the imperfections that make a face look real: "pores, lines, stubble, like all the little stuff that makes a face look like an actual face." That's the slot where smoother, glossier outputs from other models start to read as plastic — the same plasticky quality teams report fixing in post on Seedance 2.0 video output. Generate the hero portrait at 4K and use it as the anchor for everything downstream.
Multi-angle character sheet for cross-shot consistency: use GPT-Image-2 or Nano Banana, not your portrait model. Build a 4-angle turnaround (front, three-quarter, profile, back) plus a face close-up and a mid-angle, at 4K. Include close-up panels, not just wides, so small details (scars, accessories, a necklace) survive across shots. Remove objects from the character's hands before generating turnarounds so the angles stay consistent. In one 3-minute animated episode, 11 reference images covering 4 characters plus a prop were enough to carry character identity across 164 generated clips. Locking one character took roughly 5 generations, or about $9.78 per character lock, with no LoRA and no fine-tuning.
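The cost figures quoted for that episode can be turned into a rough budget estimate. This is a minimal sketch, not a pricing model: it assumes the $9.78 lock cost splits evenly across the 5 generations and that every character costs about the same to lock, neither of which the production notes confirm. The function names are illustrative only.

```python
# Figures quoted in the text; all are approximate.
LOCK_COST_USD = 9.78        # cost to lock one character
GENERATIONS_PER_LOCK = 5    # generations needed per lock
REFERENCE_IMAGES = 11       # covers 4 characters plus a prop
GENERATED_CLIPS = 164


def cost_per_generation(lock_cost: float = LOCK_COST_USD,
                        generations: int = GENERATIONS_PER_LOCK) -> float:
    """Average cost of one generation, assuming an even split (assumption)."""
    return lock_cost / generations


def cast_lock_budget(characters: int, lock_cost: float = LOCK_COST_USD) -> float:
    """Budget to lock a cast, assuming equal cost per character (assumption)."""
    return characters * lock_cost


def clips_per_reference(clips: int = GENERATED_CLIPS,
                        references: int = REFERENCE_IMAGES) -> float:
    """Average number of generated clips supported by each reference image."""
    return clips / references


if __name__ == "__main__":
    print(f"Per generation: ${cost_per_generation():.2f}")
    print(f"Four-character cast: ${cast_lock_budget(4):.2f}")
    print(f"Clips per reference image: {clips_per_reference():.1f}")
```

Under those assumptions, one generation averages about $1.96, locking the four characters comes to about $39.12, and each reference image supported roughly 15 clips. The prop is left out of the budget because its lock cost is not stated.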
Concept mood-board and stylized look development: use Recraft, whose design-forward output suits batched concept grids and palette exploration. Generate grids of 3 options per round rather than single images; image generation is cheap inside invideo, so spend the budget on optionality, the way a real director wants options on set.
Choosing between them for faces: Recraft V4 currently tops the Hugging Face Text-to-Image Arena overall and is the strongest of the three for textured, photoreal skin in close-up. GPT-Image-2 has the highest prompt adherence of the three, which is useful when the character description is precise and the wardrobe, props, and ethnicity have to land exactly as written. Nano Banana is the workhorse for character-sheet generation and for accepting hand-drawn references when text prompting fails on complex physical arrangements.
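The three-job split and the per-model strengths described above can be written down as a small lookup. This is a sketch of the decision logic only, not invideo's routing or any model's API. The names `Job` and `pick_image_model` are hypothetical, and the tie-break order for character sheets (hand-drawn reference first, then prompt adherence, then Nano Banana as the default) is an assumption drawn from the descriptions in the text.

```python
from enum import Enum


class Job(Enum):
    HERO_PORTRAIT = "hero_portrait"
    CHARACTER_SHEET = "character_sheet"
    CONCEPT_MOODBOARD = "concept_moodboard"


def pick_image_model(job: Job,
                     exact_prompt_adherence: bool = False,
                     hand_drawn_reference: bool = False) -> str:
    """Return the model the article recommends for a given design job."""
    if job is Job.CHARACTER_SHEET:
        # Assumed tie-break: a hand-drawn reference wins, because the text
        # names Nano Banana as the model that accepts sketches.
        if hand_drawn_reference:
            return "Nano Banana"
        # GPT-Image-2 is described as having the highest prompt adherence.
        if exact_prompt_adherence:
            return "GPT-Image-2"
        # Nano Banana is described as the character-sheet workhorse.
        return "Nano Banana"
    # Hero portraits and concept mood-boards both go to Recraft.
    return "Recraft"


if __name__ == "__main__":
    print(pick_image_model(Job.HERO_PORTRAIT))
    print(pick_image_model(Job.CHARACTER_SHEET, exact_prompt_adherence=True))
```

Keeping the choice in one function makes the reasoning explicit, so a change in model ranking means editing a single place rather than every prompt.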
Lock the sheet before any video generation. Generate 4 options per asset, pick the best, and freeze it — this is the single step that prevents continuity drift later. When a continuity error does appear in a shot, don't re-roll the shot: ask the invideo agent to inspect the character sheet, identify the exact panel with the error, fix it there, and store the corrected sheet in context so every subsequent shot inherits the fix.
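The lock-then-fix-at-the-source idea can be modeled as an immutable record: the sheet is frozen once chosen, and a correction produces a new version that later shots reference. This is a minimal sketch under assumptions. `CharacterSheet`, `lock_sheet`, `fix_panel`, the panel names, and the file names are all hypothetical, and it does not represent how the invideo agent stores context.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CharacterSheet:
    character: str
    panels: tuple          # ((panel_name, image_ref), ...)
    version: int = 1

    def fix_panel(self, panel_name: str, new_image_ref: str) -> "CharacterSheet":
        """Return a corrected copy; the original sheet stays untouched."""
        if panel_name not in dict(self.panels):
            raise KeyError(f"no panel named {panel_name!r}")
        updated = tuple(
            (name, new_image_ref if name == panel_name else ref)
            for name, ref in self.panels
        )
        return replace(self, panels=updated, version=self.version + 1)


def lock_sheet(character: str, options: list, chosen_index: int) -> CharacterSheet:
    """Freeze the chosen option out of the generated candidates."""
    chosen = options[chosen_index]   # each option maps panel name -> image ref
    return CharacterSheet(character=character, panels=tuple(chosen.items()))


if __name__ == "__main__":
    # Four candidate sheets, as the text suggests generating per asset.
    options = [
        {"front": f"front_{i}.png", "back": f"back_{i}.png",
         "face_closeup": f"closeup_{i}.png"}
        for i in range(4)
    ]
    sheet = lock_sheet("vampire", options, chosen_index=2)
    fixed = sheet.fix_panel("face_closeup", "closeup_fixed.png")
    print(sheet.version, fixed.version)
```

Because the dataclass is frozen, a shot cannot silently mutate the sheet. Every fix goes through `fix_panel`, which mirrors the advice to correct the panel rather than re-roll the shot.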
The film-workflow bridge other comparisons miss. A still portrait is not the deliverable — the video shot is. Inside invideo, the invideo agent holds your locked character sheet as persistent context and routes reference-to-video through Seedance 2.0, Kling, or Veo depending on the shot: Seedance 2.0 reference-to-video carries character and location references simultaneously for continuous takes; Kling handles multi-shot sequences natively; Veo is strong on cinematic atmosphere. Every roster model lives inside invideo, so you don't adopt a second platform to move from Recraft portrait to finished shot.
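The shot-to-model mapping in that paragraph can be sketched as a shot plan in which every shot carries the same locked character sheet. The shot-type labels, `plan_shots`, and the sheet identifier are hypothetical names chosen for illustration. The real routing is done by the invideo agent and is not shown here.

```python
# Shot types and the video model the text associates with each.
VIDEO_MODEL_BY_SHOT_TYPE = {
    "continuous_take": "Seedance 2.0",     # character + location references together
    "multi_shot_sequence": "Kling",        # handles multi-shot sequences natively
    "cinematic_atmosphere": "Veo",         # strong on cinematic atmosphere
}


def plan_shots(shots: list, sheet_id: str) -> list:
    """Attach a video model and the locked sheet to each (name, type) shot."""
    plan = []
    for name, shot_type in shots:
        if shot_type not in VIDEO_MODEL_BY_SHOT_TYPE:
            raise ValueError(f"unknown shot type: {shot_type!r}")
        plan.append({
            "shot": name,
            "model": VIDEO_MODEL_BY_SHOT_TYPE[shot_type],
            "character_sheet": sheet_id,   # same sheet for every shot
        })
    return plan


if __name__ == "__main__":
    shots = [
        ("alley_walk", "continuous_take"),
        ("rooftop_chase", "multi_shot_sequence"),
        ("dawn_reveal", "cinematic_atmosphere"),
    ]
    for entry in plan_shots(shots, sheet_id="vampire_sheet_v2"):
        print(entry)
```

The point of the structure is that the model varies per shot while the character sheet reference does not, which is what keeps identity stable across the cut.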
When models get stuck on character composition. Multi-character contact (a character carrying another, ropes, props pressed against bodies) breaks image models faster than almost anything else. Hand-sketch the arrangement, upload the drawing as a reference, and let the invideo agent feed that drawing into Nano Banana to produce the fused character sheet. One production cracked a vampire-carrying-juicebox sheet exactly this way after text prompting failed.
As Hridaye, invideo's creative director, frames the model-choice question: "The better move was to have the agent read the colours and textures of them and prompt for that instead." Models are inputs to a direction process, not the direction itself.
"Recraft actually gives you those imperfections like pores, lines, stubble, like all the little stuff that makes a face look like an actual face."
— Hridaye, invideo's creative director