How do I turn a hand sketch into a consistent AI character for video production?
Last updated June 26, 2026
Scan your sketch, hand it to the invideo agent with a short character brief, and have it generate a multi-angle character sheet (front, 3/4, profile, back, plus a face close-up) in Nano Banana. Generate four variations, lock one, store it in context, and reference that sheet on every Seedance 2.0 video generation. No LoRA and no fine-tuning are needed.
Start by photographing or scanning your hand sketch at high resolution and uploading it to the invideo agent — an agentic video tool that holds project context and routes each step to the right image and video model, so the sketch stays the single source of truth across the whole production.
1. Brief the agent off the sketch. Attach the sketch and write a short character description next to it: age, build, wardrobe, distinguishing features, and the world the character lives in. Tell the agent explicitly what to take from the drawing (silhouette, costume shapes, face structure, color cues) and what to ignore (paper texture, stray linework). Exclusion is as load-bearing as inclusion — "I told it what to take and just as importantly, what to leave out."
2. Generate a portrait pass for the face. Route the sketch into Recraft for a photoreal headshot at 4K — it adds skin imperfections (pores, lines, stubble) that stop AI faces from looking plastic. If your film is stylized (animation, painterly, illustrative), skip this and go straight to step 3 in the target style. Generate four options, pick one, and lock it.
3. Build the multi-angle character sheet in Nano Banana. Have the agent feed the sketch plus the locked portrait into Nano Banana and request a 360-degree turnaround at 4K: front, 3/4, profile, back, face close-up, and a mid-angle. Remove any objects from the character's hands before this step, because props in hand cause inconsistency across angles. Generate four full-sheet variations and pick the strongest one. In a documented production, locking one character this way took roughly 5 generations and cost about $9.78 per character.
4. Add close-ups for small details. Wide turnarounds alone lose scars, accessories, jewelry, and tattoos. Ask for dedicated close-up panels of every detail you need to survive across shots: the video model has to see exactly what each detail is, or it hallucinates something in its place. If your character changes over the film (costume swap, new prop, injury), generate a separate sheet per beat rather than one master sheet trying to cover everything.
5. Lock the sheet into agent context. Tell the agent to save the chosen sheet as the canonical character reference for the project. From here on, every shot prompt the agent assembles attaches that sheet automatically — this is what replaces LoRA fine-tuning. A documented 70-second short film with two characters held the same person across every scene using sheets and agent context alone, no training required.
6. Generate video with the sheet as the anchor, not the prompt. For motion, the agent routes to Seedance 2.0 reference-to-video, which accepts your character sheet plus location and style references on each clip and carries identity, costume, and proportion across shots far more reliably than text-only prompts or start/end-frame methods. Kling 3.0 is the alternative when you need multi-shot sequences inside one generation. invideo hosts all of these models, so the agent picks the right one per shot and you don't switch platforms.
7. Fix continuity surgically, at the sheet. When a later shot drifts (wrong earring, missing scar, wrong jacket cut), don't re-roll the shot — ask the agent to inspect the character sheet for the error. It identifies the exact panel, corrects it, saves the new sheet to context, and every subsequent shot inherits the fix. In one production, the agent pinpointed which panel of a character grid contained an out-of-place item without being told where to look.
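If it helps to see the bookkeeping behind steps 1, 5, 6, and 7, here is a minimal Python sketch of the idea. It is not the invideo API, which is driven through the agent in conversation. Every class, method, and file name below (CharacterBrief, ProjectContext, lock_sheet, build_shot, the .png names, the example character) is hypothetical and exists only for illustration. The sketch shows a brief that states what to take from the drawing and what to ignore, a project context that stores one canonical sheet per character, and shot requests that always attach the current sheet, so a corrected sheet is inherited by every later shot.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CharacterBrief:
    """Step 1 (hypothetical model): the description written next to the sketch."""
    description: str
    take_from_sketch: tuple[str, ...]
    ignore_in_sketch: tuple[str, ...]

    def to_prompt(self) -> str:
        # Inclusions and exclusions are both spelled out explicitly.
        return (
            f"{self.description}\n"
            f"Take from the sketch: {', '.join(self.take_from_sketch)}.\n"
            f"Ignore in the sketch: {', '.join(self.ignore_in_sketch)}."
        )


@dataclass
class ProjectContext:
    """Steps 5 and 7 (hypothetical model): one canonical sheet per character."""
    sheets: dict[str, list[str]] = field(default_factory=dict)

    def lock_sheet(self, character: str, sheet_file: str) -> None:
        # Appending keeps the history; the last entry is the canonical sheet.
        self.sheets.setdefault(character, []).append(sheet_file)

    def canonical_sheet(self, character: str) -> str:
        return self.sheets[character][-1]

    def build_shot(self, character: str, action: str, extra_refs=()) -> dict:
        # Step 6: the sheet travels with every shot as a reference,
        # alongside any location or style references.
        return {
            "prompt": action,
            "references": [self.canonical_sheet(character), *extra_refs],
        }


# Illustrative usage with made-up values.
brief = CharacterBrief(
    description="Woman in her 60s, wiry build, patched wool coat, harbor town.",
    take_from_sketch=("silhouette", "costume shapes", "face structure", "color cues"),
    ignore_in_sketch=("paper texture", "stray linework"),
)
ctx = ProjectContext()
ctx.lock_sheet("mara", "mara_sheet_v1.png")
shot_a = ctx.build_shot("mara", "walks along the pier", ["harbor_location.png"])

# Step 7: fix the drift at the sheet, then keep generating.
ctx.lock_sheet("mara", "mara_sheet_v2.png")
shot_b = ctx.build_shot("mara", "turns toward the camera")
```

The design choice worth noticing is that shots never store their own copy of the character description. They look up the canonical sheet at build time, which is why a fix made once at the sheet reaches every subsequent shot without re-describing the character.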
As Hridaye, invideo's creative director, puts it: "Seventy seconds. Two characters. The same person across every scene. No LoRA needed." The hand sketch survives intact because the agent treats it as the genetic input — every downstream model decision is bound back to that sheet.