What shot size terms should I use when prompting AI video tools?

Name the size explicitly: extreme close-up, close-up, medium shot, wide, or extreme wide. Also describe what is in frame and what is deliberately withheld.

Which camera movement terms work best in AI video prompts?

Use standard terms like dolly in, dolly out, pan, tilt, crane, handheld, and whip pan. Also include static hold or action-tied hold instructions, which prompts most often miss.

How should I describe lighting when prompting AI video tools?

Name the light source rather than using adjectives. For example, "warm yellow from the lamps only" performs better than "warm lighting." Use motivated and practical lighting as terms, and quantify ratios where possible.

Should I use negative prompts when generating AI video?

Yes. A negative prompt states what the shot must never be in concrete style terms. Place it as the final element in a fixed prompt sequence so no visual layer is accidentally dropped.

Filmmaking Terms to Use When Prompting AI Video Tools

How do I maintain consistent style across multiple AI-generated shots?

Load recurring camera, lighting, and palette directives into the invideo agent's context once so they carry across every shot automatically instead of being retyped per prompt.

Prompt AI video tools in the vocabulary you'd give a crew: shot size and framing, camera movement (dolly, pan, static hold), angle (low, top-down, reverse), lens behavior (spherical vs anamorphic, shallow depth of field), light source and ratio, and palette — assembled in a fixed order and paired with a negative prompt stating what the shot must never be.

Write your prompts as directorial intent in film language rather than technical parameter strings — documented productions consistently get better results prompting an AI model "like a director prompts his crew." invideo is an agentic video creation tool with all the current models available, so the invideo agent accepts this vocabulary conversationally and translates it into each model's prompt format — Veo, Kling, or Seedance 2.0 — per shot.

Shot size and framing. Name the size explicitly: extreme close-up, close-up, medium shot, wide, extreme wide. Then say what is in frame and what is withheld. One director's visual-language system, encoded into an AI agent in 14 sections, gave camera, angles, and composition a discrete section each.

Camera movement. Use the standard set — dolly in, dolly out, pan, tilt, crane, handheld, whip pan — plus the two terms prompts most often miss: a static hold (the camera does not move) and a hold instruction tied to action ("hold on him until he lunges, no cutting"). Slow, near-imperceptible moves work too: one encoded director system used "subliminal dollies" as a named directive.

Camera angle. Name the angle: low angle, high angle, eye-level, Dutch/canted, bird's-eye, top-down. Coverage vocabulary matters as much as single-shot vocabulary: ask for "the reverse on [character]" or "the compositionally opposite angle of the last shot" to build matched pairs for editing. In one documented production, a complex top-down shot landed on the first generation once it was directed in this language instead of being prompted manually.

Lens and format. Specify lens behavior, not just focal length: wide-angle, telephoto, macro, shallow depth of field. Spherical versus anamorphic is a meaningful distinction — spherical glass produces circular bokeh and no horizontal flares — and precision here is worth checking: in one production the AI agent had logged "anamorphic" for a director who shoots spherical, and corrected itself when challenged ("35mm, 2.40:1 hard matte — widescreen by extraction, not optics"). State the aspect ratio in your film's delivery format as part of the camera spec.

Lighting. Name the source, not the adjective. "Warm yellow from the lamps only, like all the refs" produces more accurate results than "warm lighting." Use motivated and practical lighting as terms, and quantify where you can — one director's lighting grammar was encoded as an 85:15 dark-to-light ratio the AI agent applied across every shot.

Palette and grade. Encode color as named tonal modes with exact hex values ("Mode A — split-toned amber and emerald") rather than loose adjectives, and add a film or DP attribution ("shot like [film/DP]") to anchor the overall look — both are reproducible vocabulary, not mood words.

Assembly order and negative prompts. Put the terms in a fixed sequence so no layer gets dropped: one production held a 9-element order across every frame — camera spec, lens & aspect ratio, lighting source, palette, composition, atmosphere, mood register, film/DP attribution, negative prompt. The negative prompt is vocabulary too: state what the shot must never be, in concrete style terms — a documented animated production's style block read "not live action, not photorealistic" with every surface required to feel hand-painted, and every subsequent prompt started with it. Another production went further, outputting 12 parameters per shot including emotional register, blocking, atmosphere layers, and a revision prompt.
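The fixed sequence above can be sketched as a small helper for hand-built prompts. This is a minimal illustration, not any tool's actual API: the function name assemble_prompt, the key names, and the example shot values are all hypothetical. Only the nine-element order comes from the production described above.

```python
# Nine-element order from the text; the key names are illustrative assumptions.
ELEMENT_ORDER = (
    "camera_spec",
    "lens_and_aspect_ratio",
    "lighting_source",
    "palette",
    "composition",
    "atmosphere",
    "mood_register",
    "film_dp_attribution",
    "negative_prompt",
)


def assemble_prompt(elements):
    """Join the elements in the fixed order and refuse to drop a layer."""
    missing = [key for key in ELEMENT_ORDER if not elements.get(key, "").strip()]
    if missing:
        raise ValueError("missing prompt elements: " + ", ".join(missing))
    # Normalise each element to end in exactly one full stop.
    return " ".join(elements[key].strip().rstrip(".") + "." for key in ELEMENT_ORDER)


# Hypothetical shot, deliberately supplied out of order.
shot = {
    "negative_prompt": "Not handheld, not flat daylight",
    "lighting_source": "Warm yellow from the lamps only",
    "camera_spec": "Medium shot, static hold",
    "lens_and_aspect_ratio": "35mm spherical, 2.40:1",
    "palette": "Split-toned amber and emerald",
    "composition": "Subject frame left, the doorway withheld",
    "atmosphere": "Light haze in the room",
    "mood_register": "Quiet dread",
    "film_dp_attribution": "Shot like [film/DP]",
}

prompt = assemble_prompt(shot)
print(prompt)
```

Because the order lives in one tuple, the negative prompt always lands last regardless of how the shot is written down, and a forgotten layer raises an error instead of silently disappearing.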

Beyond the vocabulary itself: where a term alone won't land a shot — POV and multi-character contact are documented examples — add a visual reference rather than rewording, and load recurring camera, lighting, and palette directives into the invideo agent's context once so they carry across shots instead of being retyped per prompt.
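If you are building prompts by hand rather than through an agent, the "state it once, reuse it everywhere" idea can be approximated as below. This is only a sketch: the StyleContext class, its fields, and the sample shots are hypothetical, and it does not describe how the invideo agent stores context internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleContext:
    """Recurring directives written once and reused for every shot (illustrative)."""

    camera: str
    lighting: str
    palette: str
    negative: str

    def prompt_for(self, shot: str) -> str:
        # Shot-specific direction first, shared style after, negative prompt last.
        parts = (shot, self.camera, self.lighting, self.palette, self.negative)
        return " ".join(part.strip().rstrip(".") + "." for part in parts)


# Hypothetical style values, borrowing vocabulary from the sections above.
style = StyleContext(
    camera="35mm spherical, 2.40:1",
    lighting="Warm yellow from the lamps only",
    palette="Split-toned amber and emerald",
    negative="Not handheld, not flat daylight",
)

shots = [
    "Close-up, hold on him until he lunges, no cutting",
    "Top-down wide of the table",
]

prompts = [style.prompt_for(shot) for shot in shots]
for p in prompts:
    print(p)
```

Only the shot-specific line changes between prompts, so a correction to the lens or lighting is made in one place and applies to every later shot.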

Watch some of these to see what works for you:

Full session: feeding a director's bible to AI and correcting lens, lighting, props live

Wong Kar-wai style guide as AI system prompt — how vocabulary locks across every shot

When AI vocabulary fails: phone reference and sketches fix POV and multi-character shots

Pretty much exactly like how I would talk to my DOP on set or how I would talk to my DA on set.

— invideo's creative team, on how on-set filmmaking vocabulary maps to directing AI agents

What filmmaking terms and camera vocabulary should you use when prompting AI video tools?

More on AI Filmmaking
