
Should you include expression sheets and hand gesture panels in a character reference sheet for AI video?

Last updated June 26, 2026

Yes, include both. An AI video model can reliably reproduce only what your reference sheet has shown it; any face or hand state it has not seen gets reinvented in each generation, which is how identity drift happens. Add a 6-emotion expression strip (neutral, happy, angry, sad, surprised, determined) and a 4-pose gesture panel (open hand, pointing, gripping, and one expressive gesture).

Include both panels whenever your character emotes or uses their hands on camera, which covers nearly every narrative project. The mechanism is simple: if your sheet shows only a neutral standing pose and the scene calls for the character to be furious or gripping a prop, the model invents that face and those hands from scratch, and the invention changes from clip to clip. Richer sheets close that gap, and the approach has worked without fine-tuning: one documented 70-second short film kept 2 characters consistent across every scene using character sheets and agent context alone, with no LoRA required.

Expression panels: include a 6-emotion strip, or a 3-emotion minimum. Practitioner guides converge on six named states: neutral, happy, angry, sad, surprised, determined. If the character appears only briefly, three panels are enough: neutral plus the two emotional poles your script actually uses. Series-level or performance-heavy projects justify a full strip of 6 to 8 panels. Generate the expressions with the same locked face as the rest of the sheet (image models like Nano Banana and GPT-Image-2 hold identity while varying the emotion) and include face close-ups, because small details such as scars and accessories stay consistent across shots only when the model has seen them at close range.
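The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name, its parameters, and the choice to treat the first two distinct non-neutral emotions in the script as the "two emotional poles" are assumptions, not part of any tool's API.

```python
# Hypothetical helper: decide which expression panels to generate for one character.
# The six named states come from the guidance above; everything else is an assumption.
FULL_STRIP = ["neutral", "happy", "angry", "sad", "surprised", "determined"]


def choose_expression_panels(script_emotions, brief_appearance=False):
    """Return the list of expression panels to put on the sheet."""
    if not brief_appearance:
        # Main or recurring characters get the full six-emotion strip.
        return list(FULL_STRIP)

    # Brief appearance: neutral plus two emotions the script actually uses.
    # Assumption: take the first two distinct non-neutral emotions in script order.
    poles = []
    for emotion in script_emotions:
        if emotion != "neutral" and emotion not in poles:
            poles.append(emotion)
        if len(poles) == 2:
            break
    return ["neutral"] + poles


if __name__ == "__main__":
    # A side character who is angry, then sad, across a handful of shots.
    print(choose_expression_panels(["angry", "neutral", "sad", "angry"], brief_appearance=True))
```

For a character whose script calls for fewer than two non-neutral emotions, this sketch simply returns a shorter list; whether to pad it to three panels is a judgment call for the project.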

Gesture panels: include them whenever hands carry action. Hands are a known weak point of generation models, and in documented productions, shots involving physical contact (props, ropes, bodies touching) break character consistency faster than almost any other scenario; one short film had a multi-character contact setup in 75% of its shots. A standard gesture set covers open hand, pointing, gripping, and one expressive gesture. Keep these as separate panels rather than baking a held prop into the turnaround itself: objects in the character's hands introduce inconsistency across turnaround angles, so keep the turnaround clean and let the gesture panels carry the hand states.
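One way to act on this before generating video is to compare the shot list against the sheet and flag every face or hand state the model would otherwise have to invent. The sketch below is a minimal illustration; the `CharacterSheet` and `Shot` types and the `find_gaps` function are hypothetical names made up for this example, not features of any particular tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CharacterSheet:
    # Hypothetical record of which panels the sheet contains.
    expressions: set = field(default_factory=set)
    gestures: set = field(default_factory=set)


@dataclass
class Shot:
    # Hypothetical shot-list entry; gesture is None when hands carry no action.
    name: str
    expression: str
    gesture: Optional[str] = None


def find_gaps(sheet, shots):
    """List (shot, panel type, state) for every state the sheet does not show."""
    gaps = []
    for shot in shots:
        if shot.expression not in sheet.expressions:
            gaps.append((shot.name, "expression", shot.expression))
        if shot.gesture is not None and shot.gesture not in sheet.gestures:
            gaps.append((shot.name, "gesture", shot.gesture))
    return gaps


if __name__ == "__main__":
    sheet = CharacterSheet(expressions={"neutral"}, gestures={"open hand"})
    shots = [Shot("intro", "neutral"), Shot("rope pull", "angry", "gripping")]
    for gap in find_gaps(sheet, shots):
        print(gap)
```

Each gap reported is a panel worth adding to the sheet before the shot is generated.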

Two related notes. First, expression and gesture panels extend, and never replace, the multi-angle turnaround (front, 3/4, side, back) that is the non-negotiable baseline of any sheet. Second, lock the finished sheet before video generation begins so that every shot inherits it. invideo is an agentic video creation tool with current image and video models built in; the invideo agent stores your locked sheet in project context and attaches it to every downstream generation, so the panels you add keep working across the whole film without re-uploading.

Watch some of these to see what works for you:

Full unedited session: character sheets, props, and cinematic stills built live
Day 2: character sheets, POV shots, and hand-sketch references for AI film
7-minute animated short: iterative character sheets and consistency techniques shown step by step

the AI always needs to see what the character is exactly, right? Or else it'll kind of hallucinate and imagine something that's under the cap. So, we don't want to do that. We always want the character to be seen as we see it on the character sheet.

— invideo's creative team
