How many AI video generations should I plan per usable shot?

Plan for an average of 3 generations per usable shot. Some shots land on the first try, others take 8 or more attempts — 3 is the documented working average across a 3-minute episode.

What is a realistic clip-to-cut selection rate for AI-generated video?

Expect roughly a 25% selection rate. On one documented 3-minute episode, 164 clips were generated and only 41 made the final cut. Treat overgeneration as a deliberate budget line, not waste.

How many usable seconds can I expect from a 15-second AI-generated clip?

Each 15-second clip typically yields 4–7 usable candidate seconds, with an average of about 5 seconds used. You are selecting the strongest seconds inside each clip, not whole clips.

How much do composite or stitched shots factor into a finished AI film?

Plan for at least 40% of your final shots to be stitched from 2 or more generations. On the documented episode, 17 of 41 final shots were composited from the strongest seconds across multiple generations of the same prompt.

How much does AI video production cost per finished minute?

Documented productions range from $315 to $750 per finished minute, with total spend between $750 and $5,000 depending on length, complexity, and how many difficult physical-contact shots are involved.

AI Video Generations Per Usable Shot: Pro Film Ratios

Plan for about 3 generations per usable shot on average, with roughly a 25% selection rate from total clips generated. On a documented 3-minute episode, 164 Seedance 2.0 clips became 41 in the final cut, and 17 of those were stitched from 2+ generations. Each 15-second clip typically yields 4–7 usable candidate seconds.

Budget generations and credits against three documented ratios from real productions:

Average 3 generations per usable shot. That's the planning number for shots that clear the editorial bar. Some land on the first try (one complex top-down shot hit on attempt 1 once the invideo agent was directing the prompt), others take 8+ tries — 3 is the working average across a 3-minute episode.

~25% clip-to-cut selection rate. On that same episode, 164 Seedance 2.0 clips were generated and 41 made the final cut. Treat overgeneration as a deliberate budget line, not waste — the math of editorial yield assumes you throw most away. A horror short in a similar workflow ran ~400 video generations for 90 seconds of finished film, hitting comparable selection ratios.

4–7 usable candidate seconds per 15-second clip, average 5 seconds used. You're not picking whole clips; you're picking the strongest seconds inside them. That's how 41 clips fill a roughly 3-minute episode: each contributed about 5 of its 15 seconds (41 × 5 ≈ 205 seconds).

Plan for composites — 40%+ of final shots will be stitched. 17 of the 41 final shots on the documented episode were combined from 2 or more generations of the same prompt — strongest seconds from one generation, the rest from another. Build your edit assuming this is the default, not the exception. As Hridaye, invideo's creative director, put it: "MOST SHOTS AREN'T ONE SHOT. Prompt → 8 tries → Frankenstein the keepers."
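The ratios above can be turned into a rough generation budget. The sketch below is illustrative only: the function name `plan_generations` and its parameters are hypothetical, and the defaults are simply the documented averages. Because "3 generations per usable shot" and "25% selection rate" imply different totals, it reports both as a low and high estimate rather than picking one.

```python
import math

def plan_generations(final_shots, gens_per_shot=3.0,
                     selection_rate=0.25, composite_share=0.40):
    """Rough planning estimate from the documented ratios (not a guarantee)."""
    return {
        # Low estimate: average generations per usable shot
        "low_generations": math.ceil(final_shots * gens_per_shot),
        # High estimate: clips needed if only `selection_rate` of them survive
        "high_generations": math.ceil(final_shots / selection_rate),
        # Shots likely to be stitched from 2+ generations
        "composite_shots": math.ceil(final_shots * composite_share),
    }

# Figures from the documented 3-minute episode
clips_generated, clips_in_cut, stitched_shots = 164, 41, 17
selection_rate = clips_in_cut / clips_generated   # 0.25
composite_rate = stitched_shots / clips_in_cut    # about 0.41

plan = plan_generations(final_shots=41)
print(plan)
```

For a 41-shot cut this gives a range of 123 to 164 generations, with the upper bound matching the clip count on the documented episode.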

Character locking sits on top of shot generation. Expect ~5 generations to lock one character at roughly $9.78 per character using multi-angle reference turnarounds. Do this BEFORE shot generation begins — locked character sheets and environment references are the single biggest lever for raising your usable-shot rate downstream.
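As a quick worked example, the character-locking figures scale linearly with cast size. This is a minimal sketch, assuming the documented averages (about 5 generations and $9.78 per character) hold for every character; the function name and the three-character cast are hypothetical.

```python
def character_lock_budget(num_characters, gens_per_character=5,
                          cost_per_character_usd=9.78):
    """Estimate up-front generations and spend for locking characters."""
    return {
        "generations": num_characters * gens_per_character,
        "cost_usd": round(num_characters * cost_per_character_usd, 2),
    }

# Hypothetical cast of three characters
budget = character_lock_budget(3)
print(budget)
```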

Which model you route to changes the yield. Seedance 2.0 is strong for reference-to-video continuity across clips, Kling for multi-shot sequences, Veo for native audio realism, Runway for fine motion control. The invideo agent routes each shot to the right model from a single project context. You don't pick a platform per model: every roster model runs inside invideo, so the same locked character sheets and style block carry across all of them.

What to do when your ratio gets worse. POV shots and multi-character physical-contact shots (bodies, ropes, props touching) are the documented break points — they pull the per-shot generation count up fast. Two practical unlocks:

Shoot a mock of the shot on your phone and upload it as a reference video. Real-world footage anchors POV framing that pure prompting won't crack.
Hand-sketch complex physical arrangements and upload the drawing. When the image model can't visualize two characters in contact from text, the sketch gets the agent to a correct character sheet, which then drives the video gen.

For one-take continuous shots, use reference-to-video chaining. Clip the end of each generation, re-upload it to the invideo agent, and feed it into Seedance 2.0's reference-to-video alongside your character and location references — camera movement, framing, and atmosphere carry across the seam. This is more context-rich than start/end-frame or extend methods, which can't accept character or location references at the same time.
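The "clip the end of each generation" step can be done with any editor. One possible approach, sketched below, uses ffmpeg's `-sseof` input option to keep only the tail of a file. The use of ffmpeg, the 2-second tail length, and the file names are all assumptions for illustration; the source does not specify a tool or a clip length.

```python
def tail_clip_command(src, dst, tail_seconds=2.0):
    """Build an ffmpeg command that keeps only the last `tail_seconds` of `src`."""
    if tail_seconds <= 0:
        raise ValueError("tail_seconds must be positive")
    return [
        "ffmpeg", "-y",
        "-sseof", f"-{tail_seconds}",  # seek relative to the end of the input
        "-i", str(src),
        "-c", "copy",                  # stream copy: fast, but the cut lands on a keyframe
        str(dst),
    ]

# Hypothetical file names for one generation and its tail clip
cmd = tail_clip_command("gen_01.mp4", "gen_01_tail.mp4")
print(" ".join(cmd))
# To actually run it (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

The resulting tail clip is what gets re-uploaded as the reference video for the next generation, alongside the character and location references.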

Cost math across documented productions. Finished-minute costs run $315–$750 across four productions with known length and spend: $315/min on the Arcane-style episode, ~$580/min on the horror short, ~$643/min for a 70-second short, $750/min on a 2-minute brand promo. Total spend ranges from $750 to $5,000 depending on length, complexity, and how many physical-contact shots you're solving.
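The per-minute figures are total spend divided by finished runtime. A minimal sketch of that arithmetic follows; the helper names are hypothetical, the $870 horror-short total comes from the video listing on this page, and the implied promo total is derived from the stated rate and length rather than separately documented.

```python
def cost_per_finished_minute(total_spend_usd, finished_seconds):
    """Total spend divided by finished runtime in minutes."""
    return total_spend_usd / (finished_seconds / 60)

def implied_total_spend(rate_usd_per_min, finished_seconds):
    """Back out total spend from a per-minute rate and a runtime."""
    return rate_usd_per_min * finished_seconds / 60

# Horror short: $870 total for 90 seconds of finished film
horror_rate = cost_per_finished_minute(870, 90)   # 580.0

# 2-minute brand promo at $750 per finished minute (derived total)
promo_total = implied_total_spend(750, 120)       # 1500.0

print(horror_rate, promo_total)
```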

Close the loop with a maker-checker pass: hand the rough cut back to the invideo agent for a "what's working, what's not" review against your loaded treatment. It catches pacing, SFX, and emotional-stage register errors a human editor often misses — on one production it flagged that an entity reveal was running at the wrong stage register, something the director hadn't noticed.

These ratios are planning numbers — your usable-shot rate depends on how much you lock before generation (script, character sheets, environment references, style block) and how disciplined your model routing is.

Watch some of these to see what works for you:

Real numbers: 164 clips generated, 41 made the cut

Horror short: 400 gens, $870, end-to-end breakdown

When AI gets stuck: phone mock and hand-sketch fixes
