Why must hands be empty on a character reference sheet?

AI image and video models misread held objects as body parts, which causes character drift during motion generation. Remove all props from the neutral sheet and give each prop its own separate reference.

What is the difference between T-pose and A-pose for character sheets?

T-pose has arms straight out at shoulder height for maximum limb clarity and clean rigging reference. A-pose angles arms roughly 45 degrees down, which reads more naturally on realistic or organic human characters and avoids shoulder distortion.

When should you generate a new character reference sheet?

Generate a fresh neutral sheet for every beat where the character changes costume, picks up an object, or evolves across the film. Without per-beat sheets, the model averages variants and identity drifts.

Best Pose for AI Video Character Reference Sheets

How many angles should a character reference sheet include?

A standard sheet includes front, 3/4, side, and back views of the same neutral pose, plus a face close-up and a mid-angle close-up to preserve fine details like scars, accessories, and costume seams.

What pose works best for stylized or chibi characters?

A relaxed neutral stand with arms loosely at the sides and weight evenly distributed is acceptable for stylized or chibi characters where T-pose or A-pose looks unnatural, as long as the arms stay clear of the torso silhouette.

The best pose is a neutral, prop-free, front-facing full-body stance — T-pose for maximum limb clarity, A-pose for organic characters, or a relaxed neutral stand for stylized ones. Pair that with side and back views on the same sheet, and remove anything the character is holding before generating turnarounds.

Generate the sheet to the specs below. Each one exists because AI image and video models misread held objects as body parts and interpret crossed or dynamic limbs as different anatomy from one angle to the next, which is what causes character drift later in motion:

Pose: neutral T-pose, A-pose, or relaxed stand — never dynamic or action.
Framing: full body, character centered, shot at eye level.
Arms: away from the torso so no limb overlaps the body silhouette.
Hands: open and empty — no props, no weapons, no accessories held.
Costume: fully visible head to toe, nothing occluded.
Background: plain and neutral, no environment detail.
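
The checklist above can be captured as a small, reusable structure so every sheet request carries the same constraints. The following is a minimal sketch, not a documented invideo API: the `SheetSpec` class, the `build_prompt` function, and the exact prompt wording are all hypothetical names and phrasing chosen for illustration.

```python
from dataclasses import dataclass

# The three poses the checklist allows; anything else counts as dynamic.
ALLOWED_POSES = ("neutral T-pose", "neutral A-pose", "relaxed neutral stand")


@dataclass(frozen=True)
class SheetSpec:
    """Hypothetical container for the neutral-sheet checklist."""
    pose: str = "neutral A-pose"
    framing: str = "full body, character centered, shot at eye level"
    arms: str = "arms away from the torso, no limb overlapping the body silhouette"
    hands: str = "hands open and empty, no props, weapons, or held accessories"
    costume: str = "costume fully visible head to toe, nothing occluded"
    background: str = "plain neutral background, no environment detail"


def build_prompt(character: str, spec: SheetSpec = SheetSpec()) -> str:
    """Join the character description and every checklist item into one prompt."""
    if spec.pose not in ALLOWED_POSES:
        raise ValueError(f"pose must be one of {ALLOWED_POSES}, got {spec.pose!r}")
    parts = [character, spec.pose, spec.framing, spec.arms,
             spec.hands, spec.costume, spec.background]
    return ", ".join(parts)


prompt = build_prompt("a courier in a red cap")
```

Rejecting any pose outside the three allowed values is a deliberate choice in this sketch: it turns "never dynamic or action" into a hard check instead of a convention someone has to remember.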

Choose your pose by character type. T-pose (arms straight out at shoulder height) is the least ambiguous: every limb is fully visible, nothing occludes the torso, and it's the cleanest reference for downstream rigging or motion. A-pose (arms roughly 45° down) reads more naturally on organic, human characters and avoids the shoulder distortion T-pose can introduce on realistic anatomy. A relaxed neutral stand (arms loosely at the sides, weight even) is acceptable for stylized or chibi characters where T-pose or A-pose looks unnatural, but only if the arms stay clear of the torso silhouette.

Build the sheet as a multi-angle grid, not a single image. The standard layout is front, 3/4, side, and back views of the same neutral pose, plus a face close-up and a mid-angle close-up so small details (scars, accessories, costume seams) survive the model's compression. In documented productions, four angles per character were generated at 4K through the invideo agent using Nano Banana, with Nano Banana Pro preferred where character fidelity matters most. Recraft handled the photoreal face portrait, supplying the skin-level imperfections (pores, lines, stubble) that keep the face from looking plastic.
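
To make the grid layout concrete, here is a minimal sketch that lists the six panels of a standard sheet: four full-body angles of one pose plus the two close-ups. The `sheet_panels` function and the panel label wording are hypothetical; only the set of views comes from the layout described above.

```python
# Full-body turnaround angles, all drawn in the same neutral pose.
ANGLES = ("front", "3/4", "side", "back")
# Detail panels that keep scars, accessories, and seams legible.
CLOSEUPS = ("face close-up", "mid-angle close-up")


def sheet_panels(pose: str) -> list[str]:
    """Return one label per panel: four full-body angles, then two close-ups."""
    panels = [f"{angle} view, {pose}, full body" for angle in ANGLES]
    panels.extend(CLOSEUPS)
    return panels


panels = sheet_panels("neutral A-pose")
for index, label in enumerate(panels, start=1):
    print(f"panel {index}: {label}")
```

Passing a single `pose` argument to every angle mirrors the rule that the turnaround shows the same neutral pose from each side, not a different stance per panel.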

Remove props before the turnaround, then sheet them separately. Anything in the character's hands — a weapon, a phone, a toy — gets stripped out of the neutral sheet and given its own reference. One documented production hand-sketched a complex physical arrangement between two characters, uploaded the drawing to the invideo agent, and had it routed into the image model to produce a fused sheet that text prompting alone couldn't visualize. Same logic: solve identity first, then layer the prop interaction as a separate brief.

Add a per-beat sheet when the character changes. If your character picks up a trinket, swaps a costume, or evolves across the film, generate a distinct neutral sheet for each beat. In one production the character accumulated a new trinket in every sequence, which required a fresh sheet per beat — without that, the model averages the variants and identity drifts. Action poses live on a separate deliverable from the neutral turnaround, never on the same sheet.
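
One way to keep per-beat sheets and prop sheets from getting mixed up is a small lookup keyed by beat. This is a sketch under stated assumptions: the `ReferenceLibrary` class, the beat names, and the file paths are hypothetical, and nothing here reflects how the invideo agent stores references internally.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceLibrary:
    """Hypothetical index of neutral sheets per beat, with props kept separately."""
    character: str
    beat_sheets: dict[str, str] = field(default_factory=dict)  # beat name -> sheet file
    prop_sheets: dict[str, str] = field(default_factory=dict)  # prop name -> sheet file

    def add_beat_sheet(self, beat: str, path: str) -> None:
        if beat in self.beat_sheets:
            raise ValueError(f"beat {beat!r} already has a sheet")
        self.beat_sheets[beat] = path

    def add_prop_sheet(self, prop: str, path: str) -> None:
        self.prop_sheets[prop] = path

    def sheet_for(self, beat: str) -> str:
        """Return the sheet for this exact beat; never fall back to another beat."""
        if beat not in self.beat_sheets:
            raise LookupError(f"no neutral sheet for beat {beat!r}; generate one first")
        return self.beat_sheets[beat]


library = ReferenceLibrary("courier")
library.add_beat_sheet("seq01_no_trinket", "sheets/courier_seq01.png")
library.add_beat_sheet("seq02_brass_key", "sheets/courier_seq02.png")
library.add_prop_sheet("brass_key", "sheets/prop_brass_key.png")
```

The lookup raises instead of reusing an earlier beat's sheet. That is a design choice for this sketch: a silent fallback would hand the model a stale variant, which is the drift the per-beat rule is meant to prevent.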

Lock the sheet before any video generation. Generate four options per sheet, pick the best, and store it in the invideo agent's context so every downstream shot inherits the same identity. Across documented productions, locking one character this way took roughly five generation attempts at about $9.78 per character; eleven total reference images covered four characters and one prop on a 3-minute episode. Once locked, the invideo agent routes that sheet into the right video model per shot — Seedance 2.0 for reference-to-video continuity, Kling or Veo where their strengths fit — without you switching tools. If a continuity error shows up later, ask the invideo agent to inspect the sheet rather than re-rolling the shot: in one documented case it identified the exact panel containing the error, fixed it at the source, and every subsequent shot inherited the correction.

As Hridaye, invideo's creative director, puts it: "the AI always needs to see what the character is exactly, right? Or else it'll kind of hallucinate and imagine something that's under the cap. So, we don't want to do that. We always want the character to be seen as we see it on the character sheet." That's the whole reason the pose has to be neutral and the hands have to be empty — the sheet is the model's only ground truth for who this character is.

Watch some of these to see what works for you:

Building prop-free character sheets and solving multi-character poses for AI film

Full session: generating 360° character turnaround sheets with AI agents

Real Arcane-style episode workflow: character turnarounds, locking sheets, and cost breakdown

