
What is Director-Level Prompting for AI video and how do you do it?

Last updated June 26, 2026

Director-Level Prompting is writing prompts in cinematography language — shot type, lens, camera move, lighting source, palette, mood, film/DP reference — instead of plain scene descriptions. Video models trained on cinematic data respond to that vocabulary with specific rendering behavior. You do it by briefing the model like a crew: intent first, then the technical layers, in a fixed order.

Start by replacing the scene description with a stacked cinematographic brief. The reliable assembly order, used across documented productions, is nine elements in sequence: camera spec → lens & aspect ratio → lighting source → palette → composition → atmosphere → mood register → film or DP attribution → negative prompt. Hold that order shot to shot and the model stops guessing the look.

The contrast is concrete. Basic prompt: "a man walks into a room." Director-level prompt: "ECU on weathered hands pushing a heavy oak door, slow dolly back revealing a dim interior, spherical 35mm in your film's aspect ratio, single motivated practical lamp, Rembrandt key on the face, atmospheric haze, dread-before-dialogue register, in the grammar of [DP/film reference]; negative: flat lighting, plastic skin, lens flare." Same beat, completely different render, because each layer maps to behavior the model learned from cinematic footage.
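One way to keep the nine layers in the same order from shot to shot is to store them as named fields and render them in declaration order. The following is a minimal sketch, not a tool any platform ships: the `ShotBrief` class, its field names, and the `; negative:` separator are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ShotBrief:
    """Nine prompt layers, declared in the assembly order described above.

    Hypothetical helper: the field names are illustrative, not a model API.
    """
    camera: str = ""
    lens_and_aspect: str = ""
    lighting: str = ""
    palette: str = ""
    composition: str = ""
    atmosphere: str = ""
    mood: str = ""
    attribution: str = ""
    negative: str = ""

    def render(self) -> str:
        # fields() yields fields in declaration order, so the layer order
        # cannot drift between shots. Empty layers are skipped.
        positive = [
            getattr(self, f.name).strip()
            for f in fields(self)
            if f.name != "negative" and getattr(self, f.name).strip()
        ]
        prompt = ", ".join(positive)
        if self.negative.strip():
            prompt += f"; negative: {self.negative.strip()}"
        return prompt


brief = ShotBrief(
    camera="ECU on weathered hands pushing a heavy oak door, slow dolly back",
    lens_and_aspect="spherical 35mm",
    lighting="single motivated practical lamp",
    atmosphere="atmospheric haze",
    mood="dread-before-dialogue register",
    negative="flat lighting, plastic skin, lens flare",
)
print(brief.render())
```

Because the order lives in the class definition rather than in each prompt, passing the layers as keyword arguments in any order still produces the same sequence.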

The vocabulary that actually triggers behavior. Shot type: ECU, MCU, wide, low-angle, top-down, OTS, POV. Camera move: dolly in/out, tracking, crane, orbit, slow push, handheld, locked-off. Lens: spherical vs anamorphic, 24/35/50/85mm, shallow vs deep focus. Lighting: Rembrandt, hard side key, three-point, practical-only, golden hour, motivated source, an explicit dark-to-light ratio (e.g. 85:15 for high-contrast horror). Palette: name the tonal mode and, where it matters, hex values. Attribution: name the film or DP whose grammar you want as the reference — models recognize the canon.
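A quick way to check whether a draft prompt uses this vocabulary is to scan it for terms from each category. This is a rough sketch under stated assumptions: the `VOCAB` table is copied from the list above and is far from exhaustive, and `missing_layers` is a hypothetical helper, not part of any video tool.

```python
import re

# Hypothetical term lists, taken from the vocabulary paragraph above.
VOCAB = {
    "shot type": ["ECU", "MCU", "wide", "low-angle", "top-down", "OTS", "POV"],
    "camera move": ["dolly", "tracking", "crane", "orbit", "slow push",
                    "handheld", "locked-off"],
    "lens": ["spherical", "anamorphic", "24mm", "35mm", "50mm", "85mm",
             "shallow focus", "deep focus"],
    "lighting": ["Rembrandt", "hard side key", "three-point", "practical",
                 "golden hour", "motivated"],
}


def missing_layers(prompt: str) -> list[str]:
    """Return the vocabulary categories with no matching term in the prompt."""
    missing = []
    for layer, terms in VOCAB.items():
        # Whole-term, case-insensitive match, so "ECU" is not found inside
        # an unrelated longer word.
        found = any(
            re.search(rf"(?<!\w){re.escape(term)}(?!\w)", prompt, re.IGNORECASE)
            for term in terms
        )
        if not found:
            missing.append(layer)
    return missing


print(missing_layers("a man walks into a room"))
print(missing_layers("ECU on hands, slow dolly back, spherical 35mm, Rembrandt key"))
```

A check like this only confirms that a layer is mentioned; it says nothing about whether the chosen terms suit the shot.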

Direct the intent, don't specify every parameter. Once the cinematographic layers are loaded, talk to the model the way you'd talk to a DOP on set: "I want to stay on the feral guy when we run this scene. No back-and-forth cutting. We hold on him right up till he lunges." That sentence carries shot length, coverage decision, and emotional beat — the model resolves the technicals from the standing brief. Hridaye, invideo's creative director, puts the principle this way: "The thing that made it possible wasn't prompting. It was directing. Agent One didn't feel like a tool — it felt like crew."

Lock the visual language once, not per shot. The highest-leverage move is to write your style as a treatment document (camera, lens, lighting, palette, composition, atmosphere, mood, negative prompts) and load it as persistent context so every subsequent prompt inherits the grammar. A documented 70-second short used a 25-page treatment with 12 evaluated parameters per shot for $750 total, or roughly $643 per finished minute; a 3-minute animated episode locked its style by uploading 64 reference frames and ran every generation thereafter against that block, finishing at $315 per finished minute, or about $945 in all. Once the doc is loaded, three words, "Everything should match," are enough to continue a sequence with character, lighting and lens grammar intact.
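If you are assembling prompts yourself rather than relying on a tool's persistent context, the same idea can be approximated by prefixing every shot with one standing style block. This is a minimal sketch: the `TREATMENT` values and the `with_treatment` helper are hypothetical placeholders, and a real treatment would be far longer.

```python
# Hypothetical treatment block. The keys mirror the layers named above;
# the values are placeholders, not taken from any documented production.
TREATMENT = {
    "camera": "locked-off or slow push only",
    "lens": "spherical 35mm, shallow focus",
    "lighting": "practical-only, motivated source",
    "palette": "desaturated teal and amber",
    "negative": "flat lighting, plastic skin, lens flare",
}


def with_treatment(shot_direction: str, treatment: dict[str, str]) -> str:
    """Prefix one shot's direction with the standing style block."""
    style = "\n".join(f"{layer}: {spec}" for layer, spec in treatment.items())
    return (
        "STYLE TREATMENT (applies to every shot)\n"
        f"{style}\n\n"
        "SHOT\n"
        f"{shot_direction}"
    )


shots = [
    "He crosses to the window.",
    "Everything should match. She enters behind him.",
]
prompts = [with_treatment(shot, TREATMENT) for shot in shots]
print(prompts[1])
```

Every prompt carries an identical style header, so the per-shot text can stay as short as a line of on-set direction.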

Route per model — invideo holds all of them. invideo is an agentic video tool with all current generation models and upscalers available, and the invideo agent routes each shot to the right one. Practical defaults: Veo handles dense cinematic prompts and complex lighting language well; Kling 3.0 takes multi-shot beats natively, so explicit shot-by-shot framing pays off; Seedance 2.0 reference-to-video rewards stacked references (character + location + last frame) over text-heavy prompts; Runway rewards tight motion verbs and brevity. You write one director-level brief; the agent sends each beat to the model that renders it best.
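The practical defaults above can be written down as a rule-of-thumb lookup. This sketch is illustrative only and does not describe how the invideo agent actually routes: the `Beat` class, the `suggest_model` function, and the thresholds (two references, fifteen words) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    """One beat of the brief (hypothetical structure for this sketch)."""
    text: str
    reference_images: int = 0  # character, location, last-frame references
    shot_count: int = 1        # distinct shots described in this beat


def suggest_model(beat: Beat) -> str:
    """Apply the defaults listed above as simple ordered rules."""
    if beat.reference_images >= 2:
        return "Seedance 2.0"  # stacked references over text-heavy prompts
    if beat.shot_count > 1:
        return "Kling 3.0"     # explicit multi-shot beats
    if len(beat.text.split()) <= 15:
        return "Runway"        # brief prompts led by motion verbs
    return "Veo"               # dense cinematic and lighting language


print(suggest_model(Beat("slow push in on the door")))
print(suggest_model(Beat("hallway chase", reference_images=3)))
```

The rule order is itself a design choice: references are checked first here because, per the defaults above, they matter more than prompt text for that model.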

Build it as a crew, not a single prompt window. For anything beyond a single shot, initialize a creative producer agent first with the full script, shot breakdown and characters — it becomes the vision holder. Then spin up specialist sub-agents alongside it: a storyboard agent to visualize before you direct, a DOP agent (one per scene if scenes need different eyes) for cinematography, a costume and production design agent. You direct in plain on-set language; each agent translates into the cinematographic layers behind the scenes.
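The crew structure can be pictured as a small data model: one producer that holds the full brief, plus specialists created after it. The sketch below is a hypothetical illustration, not an invideo API. In particular, having each specialist start from a copy of the producer's context is an assumption about how the vision gets shared.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A crew member and the context it starts with (hypothetical)."""
    role: str
    context: list[str] = field(default_factory=list)


def build_crew(script: str, shot_breakdown: str, characters: str,
               scenes: list[str]) -> dict[str, Agent]:
    """Create the producer first, then the specialists alongside it."""
    producer = Agent("creative producer", [script, shot_breakdown, characters])
    crew = {"producer": producer}
    # Assumption: specialists start from a copy of the producer's context.
    crew["storyboard"] = Agent("storyboard", list(producer.context))
    crew["design"] = Agent("costume and production design",
                           list(producer.context))
    for scene in scenes:
        # One DOP agent per scene, so each scene can get its own eye.
        crew[f"dop:{scene}"] = Agent(f"DOP for {scene}",
                                     list(producer.context) + [scene])
    return crew


crew = build_crew("full script", "shot breakdown", "character sheets",
                  ["scene 1", "scene 2"])
print(list(crew))
```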

If you've directed on set, this is your unfair advantage — shot lists, DOP briefings, blocking, lighting plans are exactly the inputs the models reward. You're not learning a new craft; you're transposing the one you have.

Watch some of these to see what works for you:

Watch a director's bible turn into a complete AI horror short film
Full unedited session: feeding a director's bible to the invideo agent
14 Fincher directives loaded once; the invideo agent holds them throughout
