AI Filmmaking

How do you use AI to generate visuals in James Wan's horror style?

Last updated June 26, 2026

Encode Wan's style as a written visual-language document that the invideo agent holds as persistent context across every shot: an 85:15 dark-to-light lighting ratio, a 2.40:1 hard matte frame, a sickly desaturated palette, spherical-lens grammar (circular bokeh, no horizontal flares), and five escalating emotional stages with locked camera, lighting, and sound rules per stage.

Start by writing a director's bible — a structured treatment document that captures Wan's grammar, not just his look. invideo is an agentic video tool where you upload that document once and a creative producer agent holds it as context across every generation, so you stop re-prompting style on each shot. The bible that produced a documented ~90-second Wan-style horror short ran 9 steps for shot design and 8 steps for color, and the agent applied them autonomously frame by frame.

What to put in the Wan visual-language document

Write it in sections the agent can pull from by name:

  • Lighting grammar: an 85:15 dark-to-light ratio, practical sources only (lamps, bulbs, candles, flashlights), hard shadow falloff, and a single warm key against deep negative fill. When you correct the agent, name the light source explicitly: "warm yellow from the lamps only," not a generic "warm lighting."
  • Lens and frame: spherical lenses (circular bokeh, no horizontal flares) with a 2.40:1 hard matte, so the widescreen frame comes from extraction rather than anamorphic optics. Challenge the agent if it writes "anamorphic"; in one documented session the invideo agent noted anamorphic, was questioned, and self-corrected to spherical.
  • Palette: desaturated greens, greys, dirty whites, and jaundiced ambers; clinical neutrals broken by one warm practical. Encode named tonal modes with hex values so the palette is reproducible across shots.
  • Camera movement: slow subliminal push-ins on stillness, locked-off holds before the scare, Dutch tilts on threshold beats, and a fast whip only on the lunge.
  • Atmosphere and withholding: write "Fear lives in what the audience cannot fully see, cannot fully hear, and cannot fully understand" as a rule, with negative-prompt language that bans full-reveal lighting in pre-reveal stages.
  • Sound architecture: give audio its own full section. Half of what makes Wan land is what you hear before what you see, so specify diegetic prop sound ("hard material, so it makes a horrible sound when it falls") inside the visual briefs.
  • Five emotional stages: Stage A through Stage E, each with locked rules for camera, lighting, and sound, plus a "what never to do" line. The stage rules are what let the agent make autonomous decisions instead of guessing.
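If you draft the bible in a structured form before exporting it as text, you can catch gaps before the agent ever sees it. The sketch below is one possible way to do that in Python. It is not an invideo format or API: the class names, the problems() check, the render() layout, and the hex values are all illustrative assumptions, and the stage rule strings are placeholders for your own wording.

```python
import re
from dataclasses import dataclass, field

STAGES = ("A", "B", "C", "D", "E")
HEX = re.compile(r"^#[0-9A-Fa-f]{6}$")


@dataclass
class StageRules:
    camera: str
    lighting: str
    sound: str
    never: str  # the "what never to do" line for this stage


@dataclass
class Bible:
    sections: dict = field(default_factory=dict)  # section name -> rule text
    palette: dict = field(default_factory=dict)   # tonal mode name -> hex value
    stages: dict = field(default_factory=dict)    # "A".."E" -> StageRules

    def problems(self):
        """List what is missing or malformed before the document is uploaded."""
        found = [f"bad hex for {mode}: {value}"
                 for mode, value in self.palette.items() if not HEX.match(value)]
        found += [f"Stage {s} has no rules" for s in STAGES if s not in self.stages]
        return found

    def render(self):
        """Flatten the bible into plain text with one named heading per section."""
        out = []
        for name, text in self.sections.items():
            out += [name.upper(), text, ""]
        out += ["PALETTE"] + [f"{m}: {v}" for m, v in self.palette.items()] + [""]
        for s in STAGES:
            if s in self.stages:
                r = self.stages[s]
                out += [f"STAGE {s}", f"Camera: {r.camera}", f"Lighting: {r.lighting}",
                        f"Sound: {r.sound}", f"Never: {r.never}", ""]
        return "\n".join(out)


bible = Bible(
    sections={
        "Lighting grammar": "85:15 dark-to-light ratio, practical sources only.",
        "Lens and frame": "Spherical lenses, 2.40:1 hard matte, no horizontal flares.",
    },
    # Hex values are placeholders, not taken from the documented bible.
    palette={"jaundiced amber": "#B8923A", "clinical grey": "#8A8F8C"},
    stages={"C": StageRules(camera="(your Stage C camera rule)",
                            lighting="(your Stage C lighting rule)",
                            sound="(your Stage C sound rule)",
                            never="(your Stage C never-do line)")},
)

print(bible.problems())  # lists the stages that still need rules
print(bible.render())
```

Named headings in the rendered text are what let you refer to a section by name later ("apply the Lighting grammar section"), and the problems() list is a cheap reminder that all five stages need rules before you upload.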

Validate the document before you generate a frame

Stress-test it by asking the agent to apply Wan's grammar to a genre Wan never shot — a courtroom thriller, a kitchen drama. If it returns coherent shots and asks clarifying questions (era, nature of threat) instead of pattern-matching surface horror, the bible has been internalized as grammar. In one documented test, the agent pulled the Stage A rule mid-generation, flagged that shadows were leaning blue-green instead of neutral grey, and offered a warmer pass — without being asked.
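The shadow check the agent ran in that test can also be approximated by hand on an exported frame. The sketch below is a rough illustration, not how the invideo agent works internally. It assumes the 85:15 ratio can be read as the share of pixels below a luminance threshold, and the threshold (64) and neutrality tolerance (6) are arbitrary values you would tune. Pixels are plain (R, G, B) tuples in the 0-255 range, so no image library is required.

```python
def luminance(rgb):
    """Weighted brightness of an (R, G, B) pixel using Rec. 709 luma weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def dark_fraction(pixels, threshold=64):
    """Share of pixels darker than the threshold (assumed reading of 85:15)."""
    dark = sum(1 for p in pixels if luminance(p) < threshold)
    return dark / len(pixels)


def shadow_cast(pixels, threshold=64, tolerance=6):
    """Classify the average colour of the dark pixels."""
    shadows = [p for p in pixels if luminance(p) < threshold]
    if not shadows:
        return "no shadows"
    n = len(shadows)
    r = sum(p[0] for p in shadows) / n
    g = sum(p[1] for p in shadows) / n
    b = sum(p[2] for p in shadows) / n
    if max(r, g, b) - min(r, g, b) <= tolerance:
        return "neutral"
    if r < g and r < b:
        return "blue-green"  # red is the weakest channel, so shadows lean cyan
    return "other"


# Synthetic 100-pixel "frames": 85 shadow pixels plus 15 warm practical pixels.
neutral_frame = [(20, 20, 20)] * 85 + [(230, 200, 140)] * 15
tinted_frame = [(12, 30, 34)] * 85 + [(230, 200, 140)] * 15

print(dark_fraction(neutral_frame), shadow_cast(neutral_frame))
print(dark_fraction(tinted_frame), shadow_cast(tinted_frame))
```

A check like this is only a sanity pass on numbers; whether a frame actually reads as Wan is still a judgment you make by eye.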

Lock the four pre-production answers

Before any image or video runs, force the agent through four questions that change every frame: who is the protagonist (look, era), what is the entity (Bathsheba-adjacent? something newer?), what is the prop (doll, ball, locket — and what sound does it make), and what is the deliverable (frames first, then video). Generate four options per character sheet and environment plate, pick one, lock it. This is the step that prevents Wan-style continuity from drifting across a film.
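If you track the project outside the tool, the four answers and the pick-one-of-four step can be written down as a small checklist. This is a minimal sketch of that bookkeeping in Python; the function names and the example answers are hypothetical and have nothing to do with invideo's own interface.

```python
QUESTIONS = {
    "protagonist": "Who is the protagonist (look, era)?",
    "entity": "What is the entity?",
    "prop": "What is the prop, and what sound does it make?",
    "deliverable": "What is the deliverable (frames first, then video)?",
}


def open_questions(answers):
    """Return the questions that still have no answer."""
    return [q for key, q in QUESTIONS.items() if not answers.get(key, "").strip()]


def lock_choice(options, pick):
    """Lock one of exactly four generated options (index 0-3)."""
    if len(options) != 4:
        raise ValueError("generate four options before locking one")
    return options[pick]


# Example answers are invented for illustration only.
answers = {
    "protagonist": "night-shift nurse, 1970s",
    "entity": "",
    "prop": "porcelain doll; cracks sharply on a hard floor",
    "deliverable": "frames first, then video",
}

print(open_questions(answers))  # the entity is still undecided
locked_sheet = lock_choice(["sheet 1", "sheet 2", "sheet 3", "sheet 4"], pick=2)
print(locked_sheet)
```

Nothing should be generated while open_questions() returns anything, and a locked choice is the one you reuse for every later shot.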

Frames first, then video — using the right model per stage

Wan's style lives in faces and textures, so build static frames before motion. invideo hosts current image and video models, and the invideo agent routes each shot to the right one:

  • Recraft for character portraits at 4K. It renders pores, lines, and stubble, the skin imperfections that make a clinically lit face read as real.
  • Nano Banana (Pro where prompt adherence matters) for 360-degree character sheets: four angles plus face and mid close-ups.
  • Seedance 2.0 for final video with slow pushes and locked holds.

Remove props from hands before generating turnarounds so the sheet stays consistent across angles.
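The agent does this routing for you, so the following is only a planning aid for your own shot list. It records the task-to-model pairings named above and sorts stills ahead of video. The task keys, the function names, and the shot dictionaries are assumptions made for this sketch; none of it calls or mirrors an invideo API.

```python
ROUTES = {
    "character_portrait": "Recraft",
    "character_sheet": "Nano Banana",
    "final_video": "Seedance 2.0",
}


def model_for(task, strict_prompt=False):
    """Look up the model for a task; use the Pro variant for strict character sheets."""
    if task not in ROUTES:
        raise KeyError(f"no route recorded for task: {task}")
    if task == "character_sheet" and strict_prompt:
        return "Nano Banana Pro"
    return ROUTES[task]


def frames_first(shots):
    """Order a shot list so every still is planned before any video shot."""
    # sorted() is stable, so stills and videos keep their relative order.
    return sorted(shots, key=lambda shot: shot["task"] == "final_video")


shots = [
    {"name": "slow push on the doorway", "task": "final_video"},
    {"name": "protagonist portrait", "task": "character_portrait"},
    {"name": "protagonist turnaround", "task": "character_sheet"},
]

for shot in frames_first(shots):
    print(shot["name"], "->", model_for(shot["task"]))
```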

Direct, don't prompt

Once the document is loaded, talk to the invideo agent the way you'd talk to a DOP on a Wan set: "hold on him until he lunges, no cuts," "reverse on Marcus, near wall is concrete with a single bulb," "Stage C — withhold the entity's face, silhouette only." The agent surfaces gaps ("that near wall doesn't exist yet — what should it be?") rather than inventing them. As Hridaye, invideo's creative director, puts it: "You direct. It remembers."

The documented Wan-style horror short runs ~90 seconds and took ~400 video generations and 30 image generations across 2 days, for $870 (4,100 credits). That is a useful benchmark for budgeting an AI horror short before you start.
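To turn that benchmark into planning numbers, divide it out. The sketch below uses only the figures quoted above, treating the approximate ones (~90 seconds, ~400 video generations) as exact. The per-generation figure averages image and video generations together, and estimate_usd() assumes cost scales linearly with runtime, which is a crude assumption rather than a documented pricing rule.

```python
COST_USD = 870
CREDITS = 4100
VIDEO_GENS = 400  # approximate in the source
IMAGE_GENS = 30
RUNTIME_S = 90    # approximate in the source


def benchmark():
    """Unit costs derived from the documented short."""
    total_gens = VIDEO_GENS + IMAGE_GENS
    return {
        "usd_per_credit": round(COST_USD / CREDITS, 2),
        "usd_per_generation": round(COST_USD / total_gens, 2),
        "usd_per_finished_second": round(COST_USD / RUNTIME_S, 2),
        "video_gens_per_finished_second": round(VIDEO_GENS / RUNTIME_S, 1),
    }


def estimate_usd(runtime_s):
    """Rough budget for another runtime, assuming cost scales linearly."""
    return round(COST_USD * runtime_s / RUNTIME_S)


print(benchmark())
print(estimate_usd(60))
```

The ratio of roughly four to five video generations per finished second is the number to watch: most of the budget goes into retakes, not into the shots that make the cut.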

Watch some of these to see what works for you:

the invideo agent directed a full James Wan horror short — watch how
full unedited session: directing the invideo agent with a Wan bible
2 days, $870, James Wan style — the invideo agent breakdown

Fear lives in what the audience cannot fully see, cannot fully hear, and cannot fully understand.

— Hridaye, invideo's creative director, citing the Wan director's bible
