How do you make an AI short film using an agent-based workflow from script to final cut?
Last updated June 26, 2026
Run the film as a crew of agents on one canvas: a creative producer agent that holds the script, a storyboard agent that breaks it into shots, casting and production design agents that lock characters and world, DOP agents that direct generation per scene, and a maker-checker pass before you cut in your NLE.
Start by spinning up a creative producer agent and giving it the full script, a shot breakdown, and character notes — this agent holds the vision and grounds every other agent you start after it. invideo is an agentic video creation tool with every major video and image model wired in, so you build this crew inside one project instead of stitching tools together. As Hridaye, invideo's creative director, puts it: "To really set up the context for the agent, I normally start off with the creative producer agent. That's where I'll give the script, or the shot breakdown, along with the characters. That's the main agent that sort of holds the understanding and the vision of the entire film."
Before any pixels, force four pre-production answers: who the protagonist is, what the antagonist/entity looks like, what the key prop is, and what the deliverable format is. These four answers "change every frame" and unlock the rest of the pipeline cleanly.
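One way to make the four answers a hard gate is to keep them in a small structure and refuse to start generation while any of them is blank. This is a minimal sketch, not an invideo feature: the class name, field names, and example values below are all hypothetical and only illustrate the checklist described above.

```python
from dataclasses import dataclass, fields


@dataclass
class PreProductionBrief:
    """The four pre-production answers (field names are illustrative)."""
    protagonist: str = ""
    antagonist_look: str = ""
    key_prop: str = ""
    deliverable_format: str = ""


def missing_answers(brief: PreProductionBrief) -> list[str]:
    """Return the names of any answers that are still blank."""
    return [f.name for f in fields(brief) if not getattr(brief, f.name).strip()]


# Made-up example values: two answers filled in, two still open.
brief = PreProductionBrief(
    protagonist="night-shift security guard",
    deliverable_format="16:9, 90 seconds",
)

open_questions = missing_answers(brief)
if open_questions:
    print("Answer these before generating anything:", ", ".join(open_questions))
```

The filled-in brief is the kind of thing you would paste into the creative producer agent alongside the script, so every downstream agent inherits the same four decisions.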
1. Script and treatment load. Upload the complete screenplay so the producer agent has full narrative context — arcs, themes, motifs. If you have a directorial style reference (a treatment doc, a visual-language guide, batches of style frames), load it once here and instruct the agent to save it to context for all downstream generations. One documented horror short ran a 25-page style guide as a permanent system prompt; the agent then enforced its rules autonomously across every scene.
2. Casting and world lock. Start a casting sub-agent and have it run the same character prompt across two image models in parallel: Recraft for photoreal portraits with pores and stubble, Nano Banana for character sheets. Pick the better aesthetic, then generate four options per asset and lock the winner. Generate four-angle character sheets with face and mid close-ups so every later model sees the same person. In parallel, a production design sub-agent builds world references: request grids of three options per round rather than single images, then extract the winning panels and use them in place of your originals as the continuity anchors.
3. Shot list and storyboard. A storyboard agent visualizes each shot from the locked script before any DOP direction goes in — this becomes the visual brief the DOP agents work against. A director's assistant sub-agent sequences the shot order so the crew knows what cuts to what before generation starts.
4. Generation, scene by scene. Assign a DOP agent per scene and direct it in plain on-set language: which character to hold on, when to cut, what the lens is doing. Different scenes want different eyes, so don't share one DOP across the film. The invideo agent routes each shot to the right video model, so you never have to platform-hop: Seedance 2.0 reference-to-video for shots that need to carry character and location context across clips, Kling or Veo where their strengths fit, and Runway where you want its motion behavior. Generate in your film's chunk size and aspect ratio, run the agent in shot-by-shot approval mode so credits only burn on prompts you've signed off, and plan for roughly 3 generations per usable shot. Across documented productions, only about 25% of generated clips made the final cut, and many final shots are stitched from the best seconds of two or more generations. Overgeneration is a budget line, not waste.
5. Continuity review. When a continuity error shows up (wrong earpiece, costume drift), don't re-roll the shot — ask the agent to inspect the source character sheet, find the panel with the error, fix it there, and store the corrected sheet in context. Subsequent shots inherit the fix automatically.
6. Maker-checker before the cut. Assemble a rough cut and send it back to the producer agent with an open "what's working, what's not" prompt. In one documented production this caught an entity-reveal shot running at the wrong emotional stage — a structural error a human editor missed.
7. Hand-off to the NLE. Before export, have a named upscaling sub-agent (call it "upscale artist") batch-run Topaz Astra on invideo. Then export your selects and cut the final in DaVinci Resolve or Premiere, finishing with a light blur, grain, and grade pass that moves the footage closer to live action.
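To turn the overgeneration guidance in step 4 into a number you can budget against, a rough calculation helps. The sketch below uses the two figures quoted in that step (about 3 generations per usable shot, and about 25% of generated clips surviving to the final cut); the 20-shot film is a made-up example, and the function names are hypothetical. The two figures give slightly different estimates, so treat the result as a range rather than a target.

```python
import math


def generations_from_rule_of_thumb(final_shots: int, per_usable_shot: float = 3.0) -> int:
    """Clips to plan for if each usable shot takes about `per_usable_shot` tries."""
    return math.ceil(final_shots * per_usable_shot)


def generations_from_keep_rate(final_shots: int, keep_rate: float = 0.25) -> int:
    """Clips to plan for if only `keep_rate` of generated clips make the cut."""
    return math.ceil(final_shots / keep_rate)


final_shots = 20  # hypothetical shot count, not from any documented production

low = generations_from_rule_of_thumb(final_shots)
high = generations_from_keep_rate(final_shots)
print(f"Plan for roughly {low} to {high} generations for {final_shots} final shots")
```

Shots stitched from two or more generations push the real count toward the upper end, which is one more reason to run in shot-by-shot approval mode.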
Documented productions running this kind of agent-crew pipeline have landed at $750 for a 70-second short over 2 days, $950 for a 3-minute animated episode (2 people, 2 days), $870 for a ~90-second horror short, and $1,500 for a 2-minute brand promo that a director estimated would have cost $100k–$500k traditionally. That works out to roughly $315–$750 per finished minute, depending on team and approach. Operators in those runs kept between 6 and 8 agents running in parallel.
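The per-minute range follows directly from the quoted totals and runtimes. Here is a short sketch of that arithmetic; the labels are shorthand for this example, and the horror short is treated as exactly 90 seconds even though the source gives it as approximate.

```python
# (label, total cost in USD, runtime in seconds), as quoted in the text
productions = [
    ("70-second short", 750, 70),
    ("3-minute animated episode", 950, 180),
    ("~90-second horror short", 870, 90),  # runtime is approximate in the source
    ("2-minute brand promo", 1500, 120),
]


def cost_per_minute(total_usd: float, runtime_seconds: float) -> float:
    """Cost per finished minute of runtime."""
    return total_usd / (runtime_seconds / 60)


rates = {label: cost_per_minute(cost, secs) for label, cost, secs in productions}

for label, rate in rates.items():
    print(f"{label}: ${rate:,.0f} per finished minute")

print(f"range: ${min(rates.values()):,.0f} to ${max(rates.values()):,.0f}")
```

The animated episode sits at the low end (about $317 per minute) and the brand promo at the high end ($750 per minute), which matches the rounded range quoted above.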