How do you direct AI costume design using mood and feel instead of a detailed visual spec?
Last updated June 26, 2026
When you know the era, energy, and emotional register of a costume but not the exact visual spec, direct a costume designer agent inside invideo with mood language: give it the feel, the era, and the character's emotional state, then let it generate four parallel options per character. Pick the one that surprises you in the right way and lock it before any video generation begins.
Start by initializing a creative producer agent in invideo with the full script, character arcs, and any tonal references you have — that agent grounds the whole production so every downstream specialist works from the same understanding. Then spin up a costume designer agent off it, and brief it the way you'd brief a real costume head: mood words, era, emotional state, register. "Aristocratic but predatory, late-Victorian silhouette, decay underneath the elegance" gets you further than a list of garments. Hridaye, invideo's creative director, put it directly: "I did not have a clear description of the sort of costume for Sylvia, who is our female vampire. But I always knew the sort of feel I want from her costume. So agent 1 was able to give me multiple options in the same zoom." That is the workflow — feel in, options out.
Brief in mood, era, and energy, not garments. Write the costume the way a director talks to a costume designer on set: era, silhouette, emotional register, and one tactile detail (texture, weight, or the sound it should make when it moves). Skip color and fabric specifics if you don't have strong opinions yet. The invideo agent will surface choices for you to react to, which is faster than guessing alone.
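If it helps to keep briefs consistent across characters, the four ingredients can be held in a small template and rendered as plain text to paste into the agent chat. This is only a sketch: `MoodBrief` and `to_prompt` are hypothetical names rather than part of invideo, and the silhouette and tactile values below are made-up illustrations.

```python
from dataclasses import dataclass


@dataclass
class MoodBrief:
    """Hypothetical container for a mood-first costume brief (not an invideo API)."""

    character: str
    era: str
    silhouette: str
    register: str        # emotional register, e.g. "aristocratic but predatory"
    tactile_detail: str  # texture, weight, or sound in motion
    options: int = 4     # variations to request in a single round

    def to_prompt(self) -> str:
        """Render the brief as plain text for the costume designer agent."""
        return (
            f"Costume for {self.character}. "
            f"Era: {self.era}. Silhouette: {self.silhouette}. "
            f"Emotional register: {self.register}. "
            f"Tactile detail: {self.tactile_detail}. "
            f"Give me {self.options} distinct options side by side."
        )


# Illustrative values only; swap in your own character and mood words.
brief = MoodBrief(
    character="Sylvia",
    era="late-Victorian",
    silhouette="high collar, long narrow line",
    register="aristocratic but predatory, decay underneath the elegance",
    tactile_detail="heavy fabric that whispers when she turns",
)
print(brief.to_prompt())
```

Color and fabric are deliberately absent from the template, which mirrors the advice to leave those choices open until the options come back.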
Generate four options per character, then lock one. Ask the costume designer agent for four variations in a single round, viewed side by side. Four is the documented sweet spot across invideo productions for asset locking — enough range to see the decision space, few enough to actually decide. Tell the agent what to keep and what to drop between rounds — exclusion instructions matter as much as inclusion.
Trust the productive surprise. When an option comes back unexpectedly bold or slightly "off" from what you imagined but the character clicks, lock it. The heuristic from inside invideo's own short film work: "If you feel like it's too off, then it means we should lock it in." That instinct is the entire reason you brief in mood rather than spec — you're hunting for choices you wouldn't have written.
Push the locked costume into a multi-angle character sheet immediately. Once a costume is picked, have the invideo agent route the reference into Nano Banana Pro (or GPT-Image-2 for a different aesthetic — invideo holds both, so you compare in one place) and generate a 4K turnaround: front, three-quarter, profile, back, plus a face close-up and a mid-angle close-up. Remove props from the character's hands before generating turnarounds — objects in hands cause inconsistency across angles. The character sheet is what every later video generation references, so its fidelity sets the ceiling for the whole film.
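The turnaround can also be written down as a checklist so no view is forgotten and the empty-hands rule is applied to every one. The sketch below only builds plain-text requests. The six view names come from the step above, while `turnaround_requests` and the request wording are assumptions for illustration, not an invideo or model API.

```python
# The six views named in the step above; the wording of each request is illustrative.
TURNAROUND_VIEWS = [
    "front",
    "three-quarter",
    "profile",
    "back",
    "face close-up",
    "mid-angle close-up",
]


def turnaround_requests(character: str, costume_note: str) -> list[str]:
    """Build one plain-text request per view, always asking for empty hands."""
    return [
        f"{character}: {view}, {costume_note}, "
        "hands empty, no props, 4K, match the locked costume reference"
        for view in TURNAROUND_VIEWS
    ]


for request in turnaround_requests("Sylvia", "locked costume option"):
    print(request)
```

Repeating the no-props instruction in every request is intentional, since one view generated with an object in hand is enough to introduce inconsistency across the sheet.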
Route the sheet into video generation through the invideo agent. From there the invideo agent feeds the locked character sheet into Seedance 2.0, Kling, or Veo depending on what each shot needs — Seedance 2.0 reference-to-video carries character context across continuous takes, Kling handles multi-shot sequences natively, Veo handles complex motion well. You don't pick the model; you describe the shot in directorial language and the invideo agent routes it. Every roster model lives inside invideo, so the mood-to-costume-to-shot pipeline never leaves one workspace.
Handle the edge cases. If all four options feel wrong, the brief is wrong — tighten the emotional register ("predatory" is sharper than "dark") or anchor to a specific historical or filmic period rather than a vibe. If a locked costume breaks under a later lighting change, don't reroll the shot — ask the invideo agent to inspect the character sheet, identify the panel causing the conflict, fix only that, and store the corrected sheet back in context. Every subsequent shot inherits the fix. Surgical edits, not slot-machine rerolls.
Across documented invideo short films, this mood-first, four-options, lock-then-sheet pipeline ran on productions costing $750–$5,000 all-in, with characters locked at about $9.78 per character over an average of five generation attempts. Those figures come from finished films, including a 70-second short with two characters held consistent across every scene without any LoRA fine-tuning.
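To put those figures in proportion, here is a rough back-of-envelope calculation. It assumes the $9.78 is the total cost of locking one character and that it is spread evenly over the five attempts. Neither assumption is stated in the source, so treat the per-attempt number as an estimate.

```python
COST_PER_CHARACTER = 9.78   # USD, figure quoted above
AVG_ATTEMPTS = 5            # average generation attempts per locked character
BUDGET_RANGE = (750, 5000)  # USD, all-in production range quoted above

# Assumption: the per-character cost is spread evenly across attempts.
cost_per_attempt = COST_PER_CHARACTER / AVG_ATTEMPTS


def locking_cost(num_characters: int) -> float:
    """Total cost in USD of locking the given number of characters."""
    return round(num_characters * COST_PER_CHARACTER, 2)


def share_of_budget(num_characters: int, budget: float) -> float:
    """Fraction of an all-in budget spent on locking characters."""
    return locking_cost(num_characters) / budget


print(f"Per attempt: ${cost_per_attempt:.2f}")
print(f"Two characters: ${locking_cost(2):.2f}")
for budget in BUDGET_RANGE:
    print(f"Share of ${budget} budget: {share_of_budget(2, budget):.2%}")
```

Under these assumptions, each attempt costs a little under $2, and locking two characters comes to $19.56, which is under 3% of even the smallest budget in the range.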