Should you upload a full script document or prompt scene by scene when using AI for film production?
Last updated June 26, 2026
Do both — in that order. Upload the full script once so the agent holds story, characters, and tone as persistent context, then prompt scene by scene (or shot by shot) for generation. Full-document loading wins on consistency and character recall; scene-level prompting wins on control, iteration, and cost per call. Combining them is what serious AI film workflows actually do.
Start by loading the complete screenplay into a creative producer agent before any generation begins. The invideo agent is an agentic video creation tool with every current video and image model and upscaler available inside it, so the script you load once becomes the context every downstream sub-agent inherits. With the whole script in context, the agent knows character arcs, themes, props, and tonal shifts — which means later scene-level prompts can stay short because the agent already understands what each scene is doing in the larger film. One documented 70-second short used a 25-page treatment plus the full script as the system prompt and produced 12 key parameters per shot from minimal scene-level requests; another production ran a three-word continuation prompt — "Everything should match" — and the agent maintained character, lighting, lens grammar, and spatial continuity across multiple shots because the document context was already loaded.
Then prompt scene by scene for generation itself. Once the script and treatment are in, decompose into per-scene direction: send the agent the specific scene with shot intent, lens, lighting note, and any reference attachments — not the whole script again. This is where you get granular control, cheaper iterations, and clean approvals before credits burn. Use the invideo agent's always-ask mode so each scene-level prompt gets shot-by-shot approval before generation; in one Arcane-style production, the team generated 164 clips at 15 seconds each and kept 41 in the final cut (~25% selection rate), so scene-level approval is what kept overgeneration deliberate rather than wasteful.
A practical rule for where the line sits: full-document up front for setup, character locking, and style; scene-level prompts for shot generation; act-by-act batching when the film is long. Working act by act — fully storyboard, generate, and approve one act before starting the next — prevents the agent from losing context on multi-scene projects, and it's the discipline behind a 7-minute animated short produced in 25% increments. For very dense scenes the agent itself will flag model limits and recommend splitting; in one production an 18-cut, 15-second bathroom scene got split in two on the agent's recommendation before any credits were spent.
Structure the scene-level prompts as direction, not as parameters. "I want to stay on the feral guy when we run this scene. No back and forth cutting. Hold on him right up till he lunges" — that style of prompt, fed scene by scene against a fully-loaded script and treatment, is what unlocks consistent, intentional output. The agent has the story; you give it the shot. Across documented productions this hybrid runs at $315–$750 per finished minute (Arcane-style episode at $315/min, 70-second short at ~$643/min, horror short at ~$580/min, 2-minute promo at $750/min) over 2–5 day timelines with 1–4 person teams — costs that depend on team and approach, not on whether you uploaded the script up front (you should).
One note on multi-agent setups: when you deploy a crew — a creative producer agent, a storyboard agent, a DOP agent, costume, production designer — the creative producer agent holds the full script and grounds every other agent in the same understanding. The downstream agents then receive scene-level direction from you. That's the same full-document-then-scene-level pattern, applied across roles instead of across time.
Watch some of these to see what works for you:
Just think about it as all the information you want your crew to have as you start building with them. So if you want them to have all the thoughts that are in your head, just put them down in an organized fashion and upload them onto the agent and watch the magic after that.
— Hridaye, invideo's creative director