What are the four foundational questions to answer before starting AI film pre-production?
Last updated June 26, 2026
Before you generate a single frame, lock answers to four foundational questions: 1) Who is the character? 2) What is the antagonist or entity? 3) What is the prop? 4) What is the deliverable format — frames first, then video, or straight to video? invideo's creative director calls these the four things that "will change every frame," and answering them up front is what unlocks consistent output across the entire pipeline.
invideo is an agentic video creation platform where you direct a crew of AI agents (creative producer, DOP, costume, storyboard) instead of hand-writing prompts — and these four questions are the inputs that ground that crew before any asset work begins. Walk them in order:
1. The character: who they are, what era, what they look like head-to-toe. This answer feeds your character sheet: front, side, profile, back, plus face and mid-angle close-ups. Without it, the model hallucinates details under hats, behind props, and across angles. In one 70-second short film, locking two characters via sheets held consistency across every scene with no LoRA required; in a 3-minute animated episode, the team needed roughly 5 generations to lock each character, at about $9.78 per character. Answering this question also unlocks parallel model casting: you can have the invideo agent run the same character prompt on two image models (e.g. Recraft for photoreal portraits with pores and stubble, Nano Banana for stylized sheets) and pick the stronger result.
2. The antagonist or entity: its reference, its register, how clearly it should read. This is where the question "closer to Bathsheba?" lives: name a concrete visual anchor, decide whether the entity reveals progressively or all at once, and decide where it sits emotionally in the film's arc. In a 90-second horror short film, getting this wrong meant the entity's reveal shot was pitched at the wrong emotional register for its stage in the arc, a structural error caught only on review. Lock the entity reference and its reveal logic now, and every shot featuring it inherits the right grammar.
3. The prop: what it is, what it's made of, how it behaves. Props are narrative objects, not decoration. Specify the object ("doll, ball, something else?"), the material ("hard material, so it makes a horrible sound when it falls"), and the story logic. A lifeless prop breaks audience investment regardless of how well the character renders, so generate 4 options per prop, select on story logic, and lock the choice before video. In one production, the team generated 11 reference images covering 4 characters and 1 prop before any video work began.
4. The deliverable format: frames first then video, or straight to video. This answer decides your entire pipeline order. Frames-first means you direct static images to approved quality (portraits in Recraft, character sheets and 360° turnarounds in Nano Banana / GPT-Image-2), then move to Seedance 2.0 or Kling for motion. This is the recommended order because visual consistency is locked before generation costs scale. Straight-to-video skips that gate and is only viable for short, simple shots. Across documented productions, frames-first workflows delivered tighter consistency: one 3-minute episode hit a 25% editorial selection rate (41 of 164 clips usable), and that yield depended on the sheets being locked first.
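The yield and cost figures quoted in the list above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch in Python: the variable names are illustrative, and the per-generation cost is only a rough average derived from two approximate figures in the text, not a number the source reports.

```python
# Figures quoted in the text; variable names are illustrative.
usable_clips = 41
generated_clips = 164
selection_rate = usable_clips / generated_clips

cost_per_character = 9.78      # USD, approximate, as quoted
generations_per_character = 5  # approximate, as quoted

# Derived value (an assumption: cost spread evenly across generations).
cost_per_generation = cost_per_character / generations_per_character

print(f"Editorial selection rate: {selection_rate:.0%}")          # 25%
print(f"Clips generated per usable clip: {generated_clips / usable_clips:.1f}")  # 4.0
print(f"Rough cost per character generation: ${cost_per_generation:.2f}")      # $1.96
```

Read the other way round, a 25% selection rate means planning for about four generated clips per clip that reaches the edit.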
Once those four answers are in context, the invideo agent stops guessing and starts asking — it surfaces the gaps ("that near wall doesn't exist yet, what should it be?") instead of inventing them. Hand these four answers to a creative producer agent loaded with your script, and downstream agents — DOP, costume designer, storyboard artist — all build on the same locked foundation. That is the unlock: every later decision routes through these four anchors.
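One way to picture "four answers in context" is as a small brief that reports its own gaps before asset work starts. The sketch below is purely illustrative: `PreProductionBrief`, its field names, and the `frames_first` / `straight_to_video` labels are hypothetical and not part of any invideo API, and the sample answers are placeholders invented for the example.

```python
from dataclasses import dataclass, fields

# Hypothetical structure for illustration; not part of any invideo API.
DELIVERABLE_FORMATS = ("frames_first", "straight_to_video")


@dataclass
class PreProductionBrief:
    character: str    # who they are, era, head-to-toe look
    entity: str       # visual anchor and reveal logic
    prop: str         # object, material, story logic
    deliverable: str  # one of DELIVERABLE_FORMATS

    def missing(self) -> list[str]:
        """Return the names of questions that are still unanswered."""
        gaps = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        # A non-empty but unrecognised format also counts as a gap.
        if self.deliverable.strip() and self.deliverable not in DELIVERABLE_FORMATS:
            gaps.append("deliverable")
        return gaps

    def is_locked(self) -> bool:
        """True only when all four questions have usable answers."""
        return not self.missing()


# Placeholder answers, invented for this example.
brief = PreProductionBrief(
    character="",  # not answered yet
    entity="reference closer to Bathsheba, revealed progressively",
    prop="doll made of a hard material",
    deliverable="frames_first",
)

print(brief.missing())    # ['character']
print(brief.is_locked())  # False
```

The point of the design is that an unanswered question is reported as a gap to resolve, mirroring an agent that asks rather than invents.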
"Before I build assets, four things will change every frame:
The Girl: What does she look like? What era?
The Entity: Closer to Bathsheba?
The Toy: Doll, ball, something else?
The Deliverable: The frames first, then video?
These four answers unlock everything."
— Hridaye, invideo's creative director