What is a realistic clip discard rate when making an AI film?
Last updated June 26, 2026
A realistic discard rate is 70–80% of generated clips. In one documented production, 41 of 164 generated clips made the final cut — a ~25% selection rate — with an average of 3 generations per usable shot. Even kept clips get trimmed: on average only 5 seconds of each 15-second generation was used.
Plan your generation budget around a 25% clip selection rate — roughly 3 of every 4 clips you generate will not appear in the finished film. The clearest documented benchmark: a 2-person team producing a 3-minute animated episode generated 164 clips and used 41 in the final cut, averaging 3 generations per usable shot. A second production points the same direction at a different scale: a ~90-second short required roughly 400 video generations to complete.
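As a rough planning aid, the selection rate above can be turned into a clip budget. The sketch below is illustrative only: the function names are hypothetical, and it assumes a new production will match the 41-of-164 ratio from the documented episode. It counts generated clips per kept clip, which is a different measure from the per-shot average quoted above.

```python
def selection_rate(kept_clips: int, generated_clips: int) -> float:
    """Fraction of generated clips that reach the final cut."""
    return kept_clips / generated_clips


def clips_to_generate(target_kept: int, kept_ref: int = 41, generated_ref: int = 164) -> int:
    """Estimate how many clips to generate to end up with `target_kept` clips.

    Assumption: the reference ratio (41 kept of 164 generated) carries over
    to the new production. Integer ceiling division avoids float rounding.
    """
    return -(-target_kept * generated_ref // kept_ref)


if __name__ == "__main__":
    print(selection_rate(41, 164))   # share of clips kept in the documented episode
    print(clips_to_generate(41))     # clip budget for a film of the same size
    print(clips_to_generate(10))     # clip budget for a shorter piece
```

Treat the result as a floor, not a forecast: contact-heavy or continuity-heavy films are likely to need more.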
The discard rate compounds inside the clips you keep. In the 3-minute episode, an average of only 5 seconds was used from each 15-second generation, so even "usable" footage yields about a third of its runtime. Combined with the 25% selection rate, and assuming every generation ran 15 seconds, roughly 1 in 12 generated seconds (about 8%) reached the final cut. Each 15-second generation often contains 4–7 shot candidates, and you select the best one rather than treating the whole clip as a single shot. Budget for total generated seconds, not clip count.
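To budget in seconds rather than clips, the two ratios can be combined into a single footage yield. This is a minimal sketch under stated assumptions: every generation is 15 seconds long, the 5-second figure is an average (so the totals are approximate), and the function names are hypothetical.

```python
def footage_yield(kept_clips: int, generated_clips: int,
                  used_seconds_per_clip: float, clip_seconds: float) -> float:
    """Fraction of all generated seconds that end up in the finished film."""
    used_seconds = kept_clips * used_seconds_per_clip
    generated_seconds = generated_clips * clip_seconds
    return used_seconds / generated_seconds


def seconds_to_generate(target_runtime_seconds: float, yield_fraction: float) -> float:
    """Total seconds of footage to generate for a target runtime.

    Assumption: the new production achieves the same yield as the reference one.
    """
    return target_runtime_seconds / yield_fraction


if __name__ == "__main__":
    # Figures from the documented 3-minute episode.
    y = footage_yield(kept_clips=41, generated_clips=164,
                      used_seconds_per_clip=5, clip_seconds=15)
    print(f"yield: {y:.1%}")
    print(f"seconds to generate for 180 s of film: {seconds_to_generate(180, y):.0f}")
```

With the documented figures, the yield works out to one twelfth, so each finished second costs about twelve generated seconds.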
The rate also varies by shot type. Multi-character physical contact shots — ropes, props, bodies touching — break models faster than almost any other scenario, and POV shots reliably need multiple iterations and multiple prompting approaches. Simple single-character or static shots sit near the 3-generations-per-shot average; contact-heavy and continuity-heavy shots sit well above it, so a film built around those should plan for a higher overall discard rate. One documented short had a two-character carry setup in 75% of its shots, which pushes the whole production toward the high end of the range.
Frankenstein shot assembly changes what "discard" means: instead of rejecting a clip outright, you stitch the strongest seconds from 2 or more generations of the same prompt into one composite shot. In the documented 3-minute episode, 17 of the final shots — over 40% — were composited this way, which is how a 25% selection rate still produces a coherent film.
The practical conclusion: treat overgeneration as a planned line item, not a failure — the documented $315-per-finished-minute figure already includes all of that iteration. Shot-by-shot approval modes — the invideo agent's Always Ask mode is one — let you vet each prompt and its references before credits are spent, which keeps the discard rate a creative choice rather than a surprise.
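For a first-pass budget, the per-minute figure can be scaled to a planned runtime. The sketch below assumes cost grows linearly with finished runtime and that the $315 figure applies to your tools and shot mix; neither is guaranteed, and the function name is hypothetical.

```python
def estimate_budget(finished_minutes: float, cost_per_finished_minute: float = 315.0) -> float:
    """Rough production budget in dollars, iteration included.

    Assumption: cost scales linearly with finished runtime at the
    documented $315-per-finished-minute rate.
    """
    return finished_minutes * cost_per_finished_minute


if __name__ == "__main__":
    for minutes in (1.5, 3):
        print(f"{minutes} min -> ${estimate_budget(minutes):,.2f}")
```

A film dominated by contact-heavy or POV shots would likely land above this estimate.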
Out of 164, 41 videos made the cut, and on average only 5 seconds of each 15-second clip was used. That's how 41 clips became a 3-minute episode.
— invideo's creative team