AI Filmmaking

How should I structure a film treatment to use as a persistent AI system prompt?

Last updated June 26, 2026

Structure your treatment in 14 sections the invideo agent can lock into context on the first upload: logline, tone anchors, camera, lens & aspect, lighting grammar, color modes with hex values, composition, movement, atmosphere, mood register, film/DP attribution, a 9-element prompt assembly order, negative prompts, and a quick-reference card. Load it once; the agent holds every directive across every shot.

Open the document with a logline and tone anchors so the agent has the story spine before any style rules — uploading the full script alongside gives it character arcs, themes, and motifs to reason against. The invideo agent is an agentic video tool that reads a treatment once and keeps it loaded across every frame, so the treatment is the single source of truth for the whole production.

1. Logline + three-act spine. One paragraph of story, then a beat list. This is what makes the agent reason three scenes ahead instead of frame-by-frame.

2. Visual language — camera, lens, aspect. Specify shooting format the way a DP would: "spherical, 2.40:1 hard matte, 35mm equivalent" — and be precise, because the agent will catch and apply distinctions like circular bokeh and the absence of horizontal lens flares. Avoid hard-coding clip-level format rules; describe the film's aspect ratio, not a generation chunk size.

3. Lighting grammar as a ratio, not an adjective. Encode it like "85:15 dark-to-light, warm yellow from practicals only," with a "never do this" line per mood. Generic "warm lighting" prompts drift; sourced lighting holds.

4. Color modes with hex. Name each tonal mode ("Mode A — split-toned amber and emerald") with exact hex values. Named modes are reproducible; vibes are not.

5. Composition + movement. Document the director's recurring grammer of substitution rules, doorway holds, subliminal dollies, and slow-shutter smears, and label each principle with the page it lives on. The invideo agent has been observed pulling a named principle from page 12 and applying it to a scene type the document never specifically addressed; pages and labels make that retrieval possible.

6. Atmosphere + mood register, staged. If the film escalates, break it into emotional stages (a five-stage horror architecture is one working pattern) and give each stage locked rules for camera, lighting, and sound, plus a "what never to do" line per stage. That "never" line is what lets the agent make autonomous decisions without drifting.

7. Film and DP attribution. Name the references the agent should treat as canonical so it can pull structurally correct citations, not just stylistically similar ones.

8. The 9-element prompt assembly order. Write the exact sequence the agent must use when assembling every shot prompt: camera spec → lens & aspect → lighting source → palette → composition → atmosphere → mood register → film/DP attribution → negative prompt. Locking the order is what enforces consistency across hundreds of generations.

9. Negative prompts. Write them per stage and per style block. For a stylized look, the list must explicitly prohibit live-action and photorealistic outputs; for realism, it must prohibit the plasticky, over-sharp tells. Put them in writing so the agent applies them without being re-reminded.

10. Sound architecture. Most treatments skip this; include it. Half of what makes horror land is what you hear before what you see, and the agent will use sound logic to inform visual choices ("hard material, so it makes a horrible sound when it falls").

11. Pre-production unlock questions. Bake in the four questions the agent must have answered before it generates any asset: who is the protagonist, what is the antagonist/entity, what is the key prop, and what is the deliverable format. These four answers "unlock everything" downstream.

12. Prompt templates + a quick-reference card. Give the agent fill-in-the-blank shot prompt templates and a one-page cheat sheet of the entire system. This is what makes "Everything should match" work as a three-word continuation prompt mid-production.

13. Exceptions and adaptations. Separate the director's outliers into their own directive so the agent does not misapply general rules. The Fincher protocol, for example, isolates the period-piece and assassin-genre adaptations from the default grammar.

14. Per-shot evaluation parameters. Close with the parameters the agent should output for every shot request: film reference, shot design, length, style interpretation, emotional register, lens, lighting plan, color script, atmosphere layers, blocking, final prompt, negative prompt, revision prompt. This turns generation from a guess into a decision.
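Sections 4, 8, and 11 above lend themselves to a quick consistency check before the treatment is uploaded. What follows is a minimal sketch in Python, not anything the invideo agent exposes. The function names, the dictionary keys, the hex values, and the ` | negative: ` separator are all illustrative assumptions. It checks that every color mode carries real hex codes, that the four unlock questions are answered, and that a shot prompt is always assembled in the fixed 9-element order.

```python
import re

# Section 8: the fixed order in which every shot prompt is assembled.
ASSEMBLY_ORDER = (
    "camera_spec",
    "lens_and_aspect",
    "lighting_source",
    "palette",
    "composition",
    "atmosphere",
    "mood_register",
    "film_dp_attribution",
    "negative_prompt",
)

# Section 11: the four answers that must exist before any asset is generated.
UNLOCK_QUESTIONS = (
    "protagonist",
    "antagonist_or_entity",
    "key_prop",
    "deliverable_format",
)

HEX_PATTERN = re.compile(r"#[0-9A-Fa-f]{6}")


def invalid_color_modes(color_modes):
    """Return the names of modes whose values are not all #RRGGBB hex codes."""
    return [
        name
        for name, hex_values in color_modes.items()
        if not hex_values or not all(HEX_PATTERN.fullmatch(v) for v in hex_values)
    ]


def unanswered_unlock_questions(answers):
    """Return the unlock questions that have no non-empty answer yet."""
    return [q for q in UNLOCK_QUESTIONS if not answers.get(q, "").strip()]


def assemble_shot_prompt(elements):
    """Join the nine elements in the locked order; fail loudly if one is missing."""
    missing = [key for key in ASSEMBLY_ORDER if not elements.get(key, "").strip()]
    if missing:
        raise ValueError(f"missing prompt elements: {missing}")
    *positive_keys, negative_key = ASSEMBLY_ORDER
    positive = ", ".join(elements[key].strip() for key in positive_keys)
    # The separator below is an assumed convention, not an invideo format.
    return f"{positive} | negative: {elements[negative_key].strip()}"


# Placeholder data: the hex values and most shot details are illustrative.
color_modes = {"Mode A, split-toned amber and emerald": ["#FFBF00", "#50C878"]}
answers = {
    "protagonist": "night-shift archivist",
    "antagonist_or_entity": "the presence in the stacks",
    "key_prop": "brass desk lamp",
    "deliverable_format": "70-second short, 2.40:1",
}
shot = {
    "camera_spec": "spherical",
    "lens_and_aspect": "2.40:1 hard matte, 35mm equivalent",
    "lighting_source": "85:15 dark-to-light, warm yellow from practicals only",
    "palette": "Mode A (#FFBF00, #50C878)",
    "composition": "doorway hold",
    "atmosphere": "dust hanging in the lamp throw",
    "mood_register": "stage 2 of 5, unease",
    "film_dp_attribution": "<canonical film / DP from section 7>",
    "negative_prompt": "no horizontal lens flares, no plasticky over-sharp detail",
}

assert not invalid_color_modes(color_modes)
assert not unanswered_unlock_questions(answers)
print(assemble_shot_prompt(shot))
```

Raising an error on a missing element, rather than silently skipping it, mirrors the point of locking the order: a prompt that drops its lighting source or negative prompt is exactly the kind of drift the treatment is meant to prevent.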

Validate before you generate. Stress-test the document by asking the agent to apply the style to a genre the director never worked in — a courtroom thriller through a horror lens, for example. If the agent asks intelligent clarifying questions (era, nature of threat) and returns stylistically coherent output, the doc has been internalized as grammar, not surface aesthetics. If it just mirrors superficially, tighten the sections it failed on.

One documented production loaded a 25-page treatment into the invideo agent before generating a single frame. The agent then evaluated every scene request against the per-shot parameters, sequenced a six-shot ending the director couldn't write, and caught a slow-shutter motion smear cue from page 17 without being prompted. The result was a 70-second short delivered in 2 days for $750 (3,000 credits), with full character consistency and no fine-tuning. The upfront document investment is the highest-leverage act in the workflow: the more clarity you bring at the start, the more sharply the agent holds it across the project.
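For budgeting, the figures quoted for that production reduce to a few unit costs. The sketch below only restates the arithmetic on the stated numbers ($750, 3,000 credits, 70 seconds). Using the results as a planning rate for other projects is an assumption, since this is a single data point and credit pricing may differ by plan.

```python
# Figures quoted for the documented production.
TOTAL_USD = 750
TOTAL_CREDITS = 3_000
RUNTIME_SECONDS = 70

usd_per_credit = TOTAL_USD / TOTAL_CREDITS
credits_per_second = TOTAL_CREDITS / RUNTIME_SECONDS
usd_per_second = TOTAL_USD / RUNTIME_SECONDS

print(f"${usd_per_credit:.2f} per credit")                         # $0.25 per credit
print(f"{credits_per_second:.1f} credits per finished second")     # 42.9
print(f"${usd_per_second:.2f} per finished second")                # $10.71
```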

Where model choice matters, the invideo agent routes shots across the current roster — Seedance 2.0 for reference-to-video continuity, Kling and Veo where their grammar fits — so the treatment doesn't need to specify a model per shot, only the visual intent the model should hit.

Watch some of these to see what works for you:

Watch a 25-page treatment doc run an entire AI short film from first frame to final cut
See how a 91-page director's treatment doc gets stress-tested before a single frame is generated
How a treatment doc lets the invideo agent catch lighting errors and build reverse angles solo

This is the core reason why I insist you take your own sweet time while building the production doc in the beginning, because the more clarity you bring to the project, the more sharply Agent One will hold it for you across the project.

— Hridaye, invideo's creative director
