Should I give a vague mood or exact costume specs when prompting AI to design characters?
Last updated June 26, 2026
Anchor with exact specs, layer mood as a modifier. Lock costume identity in precise, repeatable language — garment, cut, material, color (hex if it matters), accessories, negative constraints — then use mood-based direction only for controlled variation (exploring options, beat-to-beat shifts). Vague mood alone drifts; specs alone go stiff. The anchor-then-layer split is what holds across shots.
Start by writing the costume the way you'd hand it to a costume designer on set: one stable identity line per character — garment + cut + material + color (named or hex) + accessories — plus a short "never" list (no logos, no modern sneakers, no color shifts). That identity line is the anchor and it goes on every prompt, every shot, unchanged. Mood is a separate slot you swap in and out: "weathered after the fight," "rain-soaked," "formal register" — it modifies the look without rewriting it.
Use mood-first direction in exactly one situation: when you don't yet know what the costume should be and you want options. Brief the costume sub-agent with the character's emotional register and let it return four or five concrete looks to pick from. invideo's creative director used this exact move on a vampire character he had no spec for: he gave the agent a feel and got multiple costumes in one pass. The moment you pick one, write its spec down and it becomes the anchor. Mood stops being the brief and becomes a modifier.
Lock the costume visually before any video generation. Generate four variations of the chosen look as a character sheet (front, side, back, face close-up), pick the strongest, and store it as the reference image attached to every downstream shot. Across a documented 70-second short with two characters and a documented 3-minute animated episode, character identity held across every scene this way with no fine-tuning. On the animated episode, locking one character took roughly 5 generations at about $9.78 per character. Specs in writing plus a locked sheet are what carry consistency; mood prompts alone do not hold up as the shot count grows.
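One way to keep the sheet prompts consistent is to generate them from the same identity line, so only the view changes. The sketch below is a minimal illustration, not invideo's API: the function name `character_sheet_prompts`, the prompt wording, and the sample identity line are all assumptions made for the example, and it assumes each view is requested as its own prompt.

```python
# Hypothetical helper: builds one prompt per character-sheet view.
# The four views come from the text above; everything else is illustrative.
SHEET_VIEWS = ("front", "side", "back", "face close-up")


def character_sheet_prompts(identity_line: str, views=SHEET_VIEWS) -> list[str]:
    """Return one prompt per view, each starting with the unchanged identity line."""
    return [f"{identity_line} Character sheet view: {view}." for view in views]


if __name__ == "__main__":
    # Placeholder identity line, invented for this example.
    identity = (
        "Courier, long double-breasted coat, waxed canvas, #2F4F4F, "
        "brass goggles. Never: logos, modern sneakers, color shifts."
    )
    for prompt in character_sheet_prompts(identity):
        print(prompt)
```

Because the identity line is passed in once and never edited per view, any difference between the four images comes from the view instruction alone, which makes it easier to judge which variation to lock.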
invideo is an agentic video tool with all the current image and video models routed through one agent, so this split lives naturally in the workflow: a costume sub-agent for the mood-to-options pass (Nano Banana or GPT-Image-2 for character work, Recraft when you need skin-level portrait realism), the locked sheet stored in the project's context, and every shot prompt assembled with the identity line + scene-specific mood line. Tell the agent explicitly what to take from references and what to ignore — inclusion is half the prompt, exclusion is the other half.
As Hridaye, invideo's creative director, put it: "I did not have a clear description of the sort of costume for Sylvia, who is our female vampire. But I always knew the sort of feel I want from her costume. So agent 1 was able to give me multiple options in the same zoom." Mood opened the search; the spec closed it.
A workable prompt skeleton to copy:
- Identity (locked): "[Character], [garment + cut], [material], [color / hex], [accessories]. Never: [exclusions]."
- Reference: attach the locked character sheet.
- Mood (per shot): "[emotional register], [condition — dry/wet/dusty/torn], [lighting register from the scene]."
- Scene: "[camera, lens, blocking]."
These are the trade-offs in one place — use mood to discover, use specs to ship.
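The skeleton can also be kept as a small data structure so the identity line is written once and reused verbatim. The following is a sketch under stated assumptions: the class `CostumeIdentity`, the function `build_shot_prompt`, and the sample costume are hypothetical names and values invented for illustration. Attaching the locked character sheet depends on the tool, so it is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the identity cannot be edited after it is locked
class CostumeIdentity:
    """Hypothetical container for the locked identity line."""
    character: str
    garment: str      # garment + cut
    material: str
    color: str        # named color or hex
    accessories: str
    never: tuple[str, ...] = ()  # the short exclusion list

    def line(self) -> str:
        text = (f"{self.character}, {self.garment}, {self.material}, "
                f"{self.color}, {self.accessories}.")
        if self.never:
            text += " Never: " + ", ".join(self.never) + "."
        return text


def build_shot_prompt(identity: CostumeIdentity, mood: str, scene: str) -> str:
    """Assemble one shot prompt: fixed identity, then per-shot mood and scene."""
    return "\n".join([
        f"Identity (locked): {identity.line()}",
        f"Mood (per shot): {mood}",
        f"Scene: {scene}",
    ])


if __name__ == "__main__":
    # Placeholder character and costume, invented for this example.
    courier = CostumeIdentity(
        character="Courier",
        garment="long double-breasted coat",
        material="waxed canvas",
        color="#2F4F4F",
        accessories="brass goggles",
        never=("logos", "modern sneakers", "color shifts"),
    )
    print(build_shot_prompt(courier, "rain-soaked, low-key lighting",
                            "wide shot, 35mm, walking toward camera"))
```

Only the `mood` and `scene` arguments change between shots; the identity line is regenerated from the same frozen fields every time, which mirrors the anchor-then-layer split.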