AI Filmmaking

How do positive and negative prompts work together for consistent AI video scene generation?

Last updated June 26, 2026

Positive prompts define what every frame must contain — camera, lens, lighting, palette, composition, mood — while negative prompts state what must never appear, such as drift toward live action or photorealism in a stylized project. Consistency comes from locking both as one reusable block and attaching that exact pair to every generation, so each scene inherits identical constraints.

Write the two halves as one paired instruction set. The positive prompt steers the model toward the attributes you describe; the negative prompt pushes generation away from the attributes you list. The positive half therefore carries your scene's identity (light source, palette, composition, atmosphere), and the negative half blocks the specific failure modes the model drifts toward (wrong medium, wrong realism level, common artifacts like blur or distortion).

Keep a fixed assembly order so every prompt is structurally identical. One documented production held a 9-element prompt order across every frame: camera spec, lens and aspect ratio, lighting source, palette, composition, atmosphere, mood register, film/DP attribution, and the negative prompt last. When every shot's prompt is built in the same sequence with the same closing exclusions, scene-to-scene variance drops because the model receives the same constraint structure every time.
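A fixed assembly order can be enforced mechanically rather than by habit. The following is a minimal Python sketch, not the documented production's tooling: the slot names, the `assemble_prompt` helper, the "Negative prompt:" label, and the camera, lens, mood, and attribution values are all hypothetical. Only the nine-slot order comes from the description above.

```python
from typing import Mapping

# The nine slots in the order described above. Slot names are hypothetical labels.
PROMPT_ORDER = (
    "camera_spec",
    "lens_and_aspect_ratio",
    "lighting_source",
    "palette",
    "composition",
    "atmosphere",
    "mood_register",
    "film_dp_attribution",
    "negative_prompt",  # always last
)


def assemble_prompt(elements: Mapping[str, str]) -> str:
    """Build a prompt in the fixed slot order, rejecting incomplete or extra input."""
    missing = [k for k in PROMPT_ORDER if not elements.get(k, "").strip()]
    extra = [k for k in elements if k not in PROMPT_ORDER]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    # Positive slots are joined in order; the negative prompt closes the block.
    positive = ", ".join(elements[k].strip() for k in PROMPT_ORDER[:-1])
    return f"{positive}\nNegative prompt: {elements['negative_prompt'].strip()}"


# Illustrative values only; substitute your own project's wording.
example = {
    "camera_spec": "35mm film camera",
    "lens_and_aspect_ratio": "50mm lens, 16:9",
    "lighting_source": "warm yellow light from the practical lamps only",
    "palette": "muted desaturated palette",
    "composition": "static medium shot",
    "atmosphere": "soft atmospheric haze",
    "mood_register": "quiet, melancholic",
    "film_dp_attribution": "in the style of the project's reference film",
    "negative_prompt": "no harsh daylight, no oversaturated color, no handheld shake, no lens flare",
}

shot_prompt = assemble_prompt(example)
```

Because the order lives in `PROMPT_ORDER` rather than in the input, the elements can be supplied in any order and the assembled text still comes out structurally identical for every shot.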

Lock the pair once, then attach it to every generation. Rewriting positive and negative language per scene is the anti-pattern that causes drift. invideo is an agentic video creation tool with all the current models available, and the invideo agent holds your locked prompt pair in persistent context so you never retype it. A documented 2-person production did exactly this: they uploaded 64 style-reference frames, instructed the invideo agent to save the style to context, and wrote a style block whose negative half explicitly prohibited live-action and photorealistic output. They then started every prompt with that block across 164 generated clips, finishing a 3-minute animated episode in 2 days for ~$950 (~$315 per finished minute). Another production had the invideo agent output 12 parameters per shot, with the final prompt, negative prompt, and revision prompt as three of them, which made the exclusion list a standing deliverable of every shot rather than an afterthought.
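If you script generations yourself, the same lock-once idea can be sketched with an immutable object. This is an illustration under stated assumptions, not the invideo agent's interface: the `StyleBlock` class, its `apply` method, the request dictionary shape, and the scene descriptions are all hypothetical, and it assumes a backend that accepts the negative prompt as a separate field.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)  # frozen: the pair cannot be edited after it is locked
class StyleBlock:
    positive: str
    negative: str

    def apply(self, scene_description: str) -> Dict[str, str]:
        """Start the scene prompt with the locked block and attach the locked negative."""
        return {
            "prompt": f"{self.positive}\n{scene_description.strip()}",
            "negative_prompt": self.negative,
        }


# Written once per project; wording here paraphrases the style block quoted in this article.
STYLE = StyleBlock(
    positive="Painterly animation look, hand-painted brushstroke texture on every surface.",
    negative="live action, photorealistic",
)

# Hypothetical scene descriptions; only these vary from shot to shot.
scenes = [
    "A woman reads a letter at a cluttered desk.",
    "Close-up of a desk lamp flickering.",
]

requests = [STYLE.apply(scene) for scene in scenes]
```

Only the scene description changes between requests. Any attempt to reassign `STYLE.positive` or `STYLE.negative` mid-project raises an error, which is the point of freezing the pair.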

Write the negative prompt against your project's specific drift, not a generic artifact list. Quality exclusions (blurry, distorted, low quality) are the floor; the consistency payoff comes from excluding the exact contamination your style invites. For a muted interior scene, a working pair looks like this. Positive: "warm yellow light from the practical lamps only, muted desaturated palette, static medium shot, soft atmospheric haze". Negative: "no harsh daylight, no oversaturated color, no handheld shake, no lens flare". Be equally specific on the positive side: naming "warm yellow from the lamps only, like all the refs" produces more accurate results than a generic "warm lighting" descriptor. The same include/exclude logic extends to reference images if you use them: state what to adopt and what to ignore. You also don't need to manage how each video model ingests exclusions, since the invideo agent applies your locked pair in whatever form the model it routes your shot to expects.
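One way to keep project-specific exclusions in front of the generic quality floor is to store them as two separate lists and merge them. The sketch below is a hypothetical helper, not a feature of any particular tool. The `build_negative_prompt` name, the comma-separated output format, and the choice to put project drift first are all assumptions.

```python
from typing import Iterable, Sequence

# Generic quality exclusions: the floor named above.
QUALITY_FLOOR = ("blurry", "distorted", "low quality")


def build_negative_prompt(
    project_drift: Sequence[str],
    floor: Iterable[str] = QUALITY_FLOOR,
) -> str:
    """Merge project-specific drift terms with the quality floor, dropping duplicates."""
    if not project_drift:
        # A floor-only list is the generic artifact list this approach is meant to avoid.
        raise ValueError("list at least one project-specific drift term")
    seen = set()
    terms = []
    for term in (*project_drift, *floor):
        key = term.strip().lower()
        if key and key not in seen:  # case-insensitive de-duplication, first spelling wins
            seen.add(key)
            terms.append(term.strip())
    return ", ".join(terms)


# Drift terms taken from the muted-interior example above.
negative = build_negative_prompt(
    ["harsh daylight", "oversaturated color", "handheld shake", "lens flare"]
)
```

Refusing an empty drift list forces the exclusions to be written from what you actually observe in early generations, instead of falling back to the floor alone.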

These are the core mechanics. The right exclusion list depends on your film's style, so build it from the drift you actually see in your first generations.


This MUST look and feel like Arcane animation — not live action, not photorealistic. Every surface has hand-painted brushstroke texture. Every element in frame must feel painterly and handcrafted like a moving Arcane frame.

— invideo's creative team, from a documented production's locked style block
