How do you use AI to generate visual options for abstract or dream sequences in a film?
Last updated June 26, 2026
For abstract or dream sequences, generate options before you commit: ask the invideo agent for 5+ distinct visual interpretations of the beat, then pick one as the canonical reference for the scene. Pair that with image grids over single shots, semantic variation prompts (shift one element at a time), and color-and-texture extraction from any non-photographic references.
Start by telling the invideo agent what the sequence should FEEL like, not what it should look like. Abstract beats (a hallucination, a dream, a psychedelic interior) don't have one correct image, so ask for several distinct visual interpretations of the same moment and pick the one that lands. On one documented short film, the team did exactly this for a psychedelic hallucination sequence: the invideo agent generated 5 variations before one was selected as the canonical reference for the rest of that scene. That selected frame then becomes the anchor, and every subsequent shot in the sequence references it for continuity.
Generate grids, not single images. Image generation is cheap inside invideo, so use that to your advantage: ask for 3 grid options per round (4 panels each, so 12 candidate frames per round), iterate on the grid you like, then extract the strongest panels and use THOSE as your scene anchors going forward. As one director put it: "Every director in real life always wants options." Grids give you that optionality at a fraction of the cost of one-at-a-time generation, and they let you see range across lighting, palette, and composition in a single look.
When you have references that are illustrated, animated, or otherwise non-photographic (a painting for a dream, a frame from an animated film for a vision), don't drop them into a video prompt directly — that breaks. Instead, instruct the invideo agent to read the COLORS and TEXTURES of the reference and prompt for those qualities in your film's own grammar. One director reported: "The gens came back hyper-realistic with the exact colour temperature I was looking for" — the agent understood creative intent from the reference rather than copying it.
Use semantic variation prompts to expand your options library. Once you have a base prompt that's close, generate 5–10 variants by shifting ONE element per prompt — lighting source, texture density, motion speed, palette tilt, atmosphere layer. This gives you a controlled spread of options rather than wild re-rolls, and it's how you find the unexpected reading that makes the sequence feel like a dream instead of a regular scene. A useful heuristic from the directors who do this: "If you feel like it's too off, then it means we should lock it in" — abstract sequences reward the choice that surprises you.
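The one-element-at-a-time idea can be sketched in a few lines of Python. This is only an illustration of the technique, not an invideo API: the subject line, the base values, the alternative values, and the helper names (render, one_element_variants) are all hypothetical, and the output is plain prompt text you would paste into the agent yourself.

```python
# Hypothetical sketch: build prompt variants that each change exactly ONE
# element of a base prompt. Nothing here calls invideo; it only produces text.

SUBJECT = "a hallway that keeps getting longer"  # placeholder subject

# Base prompt, broken into the elements named in the article.
BASE = {
    "lighting source": "a single bare bulb overhead",
    "texture density": "fine film grain",
    "motion speed": "slow drift",
    "palette tilt": "cool teal",
    "atmosphere layer": "thin haze",
}

# Made-up alternative values to try for each element.
ALTERNATIVES = {
    "lighting source": ["light leaking from under a door", "a flickering TV screen"],
    "texture density": ["heavy, clumped grain"],
    "motion speed": ["near-frozen", "sudden lurches"],
    "palette tilt": ["warm amber"],
    "atmosphere layer": ["dense fog", "floating dust"],
}


def render(elements):
    """Turn a dict of elements into one prompt string."""
    details = "; ".join(f"{name}: {value}" for name, value in elements.items())
    return f"{SUBJECT}. {details}"


def one_element_variants(base, alternatives):
    """Return (changed_element, elements) pairs, one per alternative value."""
    variants = []
    for name, options in alternatives.items():
        for option in options:
            changed = dict(base)    # copy so the base prompt stays untouched
            changed[name] = option  # shift exactly one element
            variants.append((name, changed))
    return variants


variants = one_element_variants(BASE, ALTERNATIVES)
for changed_name, elements in variants:
    print(f"[{changed_name}] {render(elements)}")
```

Labeling each variant with the element that changed makes it easier to tell afterward which shift produced the reading you liked, so the next round can push further in that one direction.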
Batch your references by theme, not by mood board. For a surreal sequence pulling from multiple sources, separate references into thematic batches — spatial logic in one batch, color theory in another, screen/light behavior in a third — and tell the invideo agent explicitly what to take from each and what to ignore. That exclusion instruction matters as much as the inclusion: "I told it what to take and just as importantly, what to leave out." This is how you avoid the agent averaging your references into something generic.
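One way to keep the take-this, ignore-that instructions organized is to write them down per batch before prompting. The sketch below is an assumption-laden illustration, not an invideo feature: the ReferenceBatch structure, the file names, and the take/ignore wording are all invented for the example, and the result is just instruction text to hand to the agent.

```python
# Hypothetical sketch: describe thematic reference batches and turn each one
# into an explicit "take this, ignore that" instruction. Plain text only.
from dataclasses import dataclass
from typing import List


@dataclass
class ReferenceBatch:
    theme: str             # what this batch is for
    references: List[str]  # placeholder file names
    take: List[str]        # qualities the agent should pull from the batch
    ignore: List[str]      # qualities the agent should leave out


def batch_instruction(batch):
    """Render one batch as a single instruction sentence pair."""
    refs = ", ".join(batch.references)
    take = ", ".join(batch.take)
    ignore = ", ".join(batch.ignore)
    return f"Batch '{batch.theme}' ({refs}): take {take}. Ignore {ignore}."


batches = [
    ReferenceBatch(
        theme="spatial logic",
        references=["ref_01.jpg", "ref_02.jpg"],
        take=["impossible geometry", "scale shifts"],
        ignore=["the color palette", "the characters"],
    ),
    ReferenceBatch(
        theme="color theory",
        references=["ref_03.jpg"],
        take=["the palette", "the color temperature"],
        ignore=["the composition"],
    ),
    ReferenceBatch(
        theme="screen/light behavior",
        references=["ref_04.jpg", "ref_05.jpg"],
        take=["how light spills and flickers"],
        ignore=["the subject matter"],
    ),
]

instructions = [batch_instruction(b) for b in batches]
print("\n".join(instructions))
```

Writing the ignore list explicitly for every batch is the point of the exercise: an empty ignore field is a quick signal that a batch may get averaged into the others.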
For the model layer, dream and abstract beats are where Seedance 2.0 reference-to-video earns its keep — it carries character and atmosphere context across clips, so a hallucination sequence stays coherent across multiple generations rather than resetting each clip. invideo has every current video model (Runway, Veo, Kling, Seedance 2.0) and the invideo agent routes each shot to the right one, so you're not picking a platform per look — you're picking a sensibility per shot and letting the agent handle the routing.
Lock the chosen visuals before you move to motion. Once you've picked your hero frame from the variations, ask the invideo agent to save it to context as the canonical reference for that sequence, and have it generate any wider/closer/side angles off that lock. Locking one element prompts the agent to extract every angle around it autonomously, which keeps the dream-logic consistent as the camera moves through the sequence.
The better move was to have Agent 1 read the colours and textures of them and prompt for that instead.
— invideo's creative team, on using non-photographic references for AI generation