How many panels should I request per scene node when generating grids?

Request 3–9 panels per scene node depending on how defined your world is. Use 4 panels for a standard A/B/C/D asset-lock pass and 9 panels for broader exploration when the world isn't fully defined yet.

When should I use a single image instead of a grid in invideo AI?

Use single images for surgical fixes — such as a close-up crop of an existing wide shot or correcting one specific panel in a character sheet. Don't grid assets you've already locked.

Why is locking four assets before generating anything else so important?

Locking one option from a 4-panel pass on each character, prop, and environment plate prevents visual drift across the rest of your film. Locked panels become the seeds for every subsequent scene, replacing your original references.

How much does image generation cost compared to video generation in invideo AI?

Image generation is the cheapest part of the pipeline — one production used only 11 image generations to cover 4 characters and 1 prop, while another ran 30 image generations against roughly 400 video generations. The real credit cost lives in video re-rolls.

What image and video models does invideo AI route grids through?

invideo AI's agent routes grid requests through current image models including Recraft, Nano Banana, and GPT-Image-2, and video models including Veo, Kling, and Seedance 2.0, automatically selecting the right model for each task.

Batch vs Grid Image Generation for AI Film Pre-Vis

Generate grids for pre-vis, not one image at a time. Image generation is cheap inside invideo, and grids give you the option set a real director wants: multiple angles, lighting variants, and costume reads per scene node, with the panels you pick becoming the continuity anchors for everything downstream.

Ask the invideo agent for grids of 3–9 panels per scene node — wide, close, side, lighting variants, costume reads — instead of asking for one image and re-rolling. invideo is an agentic video tool with the current image models (Recraft, Nano Banana, GPT-Image-2) and video models (Veo, Kling, Seedance 2.0) on tap, so the agent routes each grid to the right model and you stay in selection mode rather than prompt-engineering mode.

Batch by theme, not by single reference. When no one image explains the look, split your references into thematic batches — spatial logic in one, screen-function in another, color palette in a third — and tell the agent explicitly what to take from each batch and what to ignore. One documented production ran 3 grid options per round to explore different parts of the world before locking any frame.
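One way to keep thematic batches straight before pasting them into the agent is to write the brief as structured data first. The sketch below is purely illustrative: the agent is driven by natural-language prompts, and the `ReferenceBatch` class, its fields, the `build_brief` helper, and the file names are all hypothetical, not part of any invideo API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReferenceBatch:
    """One thematic batch of references. Hypothetical structure, not an invideo API."""
    theme: str
    files: List[str]
    take: str    # what the agent should borrow from this batch
    ignore: str  # what the agent should disregard in this batch


def build_brief(batches: List[ReferenceBatch]) -> str:
    """Render the batches as plain-text instructions to paste into a prompt."""
    lines = []
    for number, batch in enumerate(batches, start=1):
        lines.append(f"Batch {number} ({batch.theme}): {', '.join(batch.files)}")
        lines.append(f"  Take: {batch.take}")
        lines.append(f"  Ignore: {batch.ignore}")
    return "\n".join(lines)


# File names below are placeholders for illustration only.
batches = [
    ReferenceBatch("spatial logic", ["corridor_01.png", "atrium_02.png"],
                   take="room layout and scale", ignore="colors and props"),
    ReferenceBatch("screen-function", ["console_01.png"],
                   take="how the screens are used in the scene",
                   ignore="the room around them"),
    ReferenceBatch("color palette", ["palette_01.png", "palette_02.png"],
                   take="hues and contrast", ignore="subject matter"),
]

brief = build_brief(batches)
print(brief)
```

Writing the "take" and "ignore" lines per batch forces the explicit instruction the paragraph above calls for, so no batch goes to the agent without a stated purpose.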

Lock four assets before you generate anything else. Run a 4-option pass on each character sheet, antagonist reference, key prop, and environment plate; pick one of the four; lock it. After that, the locked panels, not your original references, become the seeds for every subsequent scene. This is the single step that prevents drift across the rest of the film: one 70-second short with 2 characters held consistency across every scene with no LoRA, using only this 4-options-per-asset workflow.

Use grids for shot coverage, single images for surgical fixes. Grids are right for exploration: angle coverage of a new scene, costume options when you only have a mood, lighting variants of a key beat. A single targeted image is right for surgical work — a close-up crop of an existing wide, or a one-panel fix to a character sheet (ask the agent which panel has the error; it identifies the exact one and corrects only that). Don't grid what you've already locked.

Sizing and seeding the grid. Use 4 panels for an A/B/C/D selection pass (the standard asset-lock pattern) and 9 for broader exploration when the world isn't defined yet. Hand the agent your locked character sheet and environment plate inside the same prompt so every panel in the grid inherits the same identity; without those attached, grids drift. Hridaye, invideo's creative director, frames it this way: "Rather than generating one, one, one, one, one images to generate grids. Image generation doesn't cost much, especially in invideo. Use that to your advantage."
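The sizing rule can be written down as a small planning helper. This is a minimal sketch under stated assumptions: the panel counts (4 for asset lock, 9 for exploration, 3 to 9 overall) come from the guidance above, while the `GridRequest` class, its field names, the prompt wording, and the file names are hypothetical and do not reflect an invideo API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Panel counts taken from the guidance above.
PANELS_BY_PURPOSE = {"asset_lock": 4, "exploration": 9}
MIN_PANELS, MAX_PANELS = 3, 9


@dataclass
class GridRequest:
    """Hypothetical description of one grid request (not an invideo API)."""
    scene_node: str
    purpose: str = "asset_lock"
    panels: Optional[int] = None  # set to override the default for the purpose
    locked_refs: List[str] = field(default_factory=list)

    def panel_count(self) -> int:
        """Return the panel count, enforcing the 3 to 9 range."""
        count = self.panels if self.panels is not None else PANELS_BY_PURPOSE[self.purpose]
        if not MIN_PANELS <= count <= MAX_PANELS:
            raise ValueError(f"panel count {count} is outside {MIN_PANELS}-{MAX_PANELS}")
        return count

    def to_prompt(self) -> str:
        """Compose prompt text, naming the locked references when there are any."""
        prompt = f"Generate a {self.panel_count()}-panel grid for {self.scene_node}."
        if self.locked_refs:
            prompt += (" Keep identity consistent with the attached locked references: "
                       + ", ".join(self.locked_refs) + ".")
        return prompt


# Placeholder scene and file names, for illustration only.
request = GridRequest(
    "rooftop confrontation",
    purpose="exploration",
    locked_refs=["hero_sheet_B.png", "rooftop_plate_C.png"],
)
print(request.to_prompt())
```

A request with no `locked_refs` is still allowed here, since the first asset-lock pass runs before anything has been locked; for every later grid, an empty list is a signal that the panels are likely to drift.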

The cost case. Image generation is the cheapest part of the pipeline; the credits go to video. Documented productions ran $750–$5,000 all-in (a 70-second short at $750 / 3,000 credits; a 3-minute animated episode at $950; a 2-minute brand promo at $1,500 / 6,000–6,500 credits; a multi-day short at $5,000 / 20,000 credits), and the image-gen line inside those budgets is small: one production used 11 image generations to cover 4 characters and 1 prop, and another used 30 image generations against ~400 video generations. Spending more on grids upfront to lock assets pays back many times over in avoided video re-rolls, where the real cost lives (an average of 3 generations per usable video shot; ~25% of clips make the cut).
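The arithmetic behind those figures can be checked in a few lines. The sketch below uses only the numbers quoted above; the per-credit rate is an inference from the all-in budgets rather than a published price, the image share is a share of generation counts (not of credits), and the 20-shot film in the last step is a hypothetical example.

```python
# (dollars, credits) pairs quoted in the text; not taken from a price list.
budgets = {
    "70-second short": (750, 3_000),
    "multi-day short": (5_000, 20_000),
}

# Implied dollars per credit for each production.
rates = {name: dollars / credits for name, (dollars, credits) in budgets.items()}
for name, rate in rates.items():
    print(f"{name}: ${rate:.2f} per credit")

# Share of generation *count* that went to images in the 30 vs ~400 production.
image_gens, video_gens = 30, 400
image_share = image_gens / (image_gens + video_gens)
print(f"Image generations: {image_share:.1%} of all generations")


def expected_video_generations(usable_shots: int, gens_per_usable_shot: float = 3.0) -> float:
    """Estimate video generations from the quoted average of 3 per usable shot."""
    return usable_shots * gens_per_usable_shot


# Hypothetical film needing 20 usable shots.
print(expected_video_generations(20))
```

Both productions with a quoted credit count work out to the same implied rate, and images account for roughly 7% of generations in the larger example, which is why extra grids barely move the budget while each avoided video re-roll does.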

These are the patterns that work — what's optimal depends on how defined your world already is and how many characters and locations you're carrying.
