How do you use a reference image for color palette in AI video generation — and why do animated references fail?
Last updated June 26, 2026
Don't attach an animated or illustrated reference directly to your video prompt. Instead, have the AI read the colour palette and texture qualities of the reference and translate those into a photorealistic prompt. Animated references fail as direct inputs because the model reproduces the rendering style — cel shading, line work, brushstrokes — not just the colors.
Upload your reference to the invideo agent and instruct it to extract the colour palette and texture qualities, then write a photorealistic prompt from them — never attach the illustration to the generation itself. invideo is an agentic video creation tool with the current video and image models (Veo, Kling, Seedance 2.0 for video; Recraft, Nano Banana, GPT-Image-2 for images) available behind one agent, so extraction and generation happen in the same conversation. In one documented production this method returned generations that were, in the creator's words, "hyper-realistic with the exact colour temperature I was looking for" — the invideo agent didn't replicate the image, it understood what was wanted from it.
Why animated references fail as direct inputs. Dropping illustrated or animated reference images straight into prompts does not work because the model reads the entire frame as instruction: flat cel shading, ink outlines, and painterly texture register as the look to reproduce, not as a wrapper around the palette. The illustration's rendering style bleeds into footage you wanted photoreal, so you get a half-animated frame instead of a live-action shot with the right colors. Extraction sidesteps this — the reference becomes color and texture data, not a style template.
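To make "the reference becomes color and texture data" concrete, here is a minimal sketch of what a palette looks like once it is separated from rendering style. It is an illustration only, not how the invideo agent performs extraction: the function name dominant_colors, the bucket size, and the sample pixels are all assumptions made up for this example.

```python
from collections import Counter


def dominant_colors(pixels, top_n=3, step=32):
    """Return the top_n most frequent colours as hex strings.

    Hypothetical helper: pixels is any iterable of (r, g, b) tuples.
    Channels are snapped to coarse buckets so near-identical shades merge.
    """
    def quantize(channel):
        # Map the channel to the centre of its bucket (bucket width = step).
        return min(255, (channel // step) * step + step // 2)

    counts = Counter(tuple(quantize(c) for c in px) for px in pixels)
    return ["#{:02x}{:02x}{:02x}".format(*rgb) for rgb, _ in counts.most_common(top_n)]


# Made-up sample: mostly warm orange with some deep blue.
sample_pixels = [(250, 120, 30)] * 5 + [(20, 40, 90)] * 3 + [(245, 125, 28)] * 2
palette = dominant_colors(sample_pixels, top_n=2)
print(palette)
```

Nothing in the resulting list carries cel shading, outlines, or brushwork, which is the point: a palette expressed this way can steer a photorealistic prompt without dragging the illustration style along. With a real reference you would read the pixels through an image library rather than a hand-written list.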
Three habits make the extraction reliable. First, tell the invideo agent explicitly what to take and what to leave out — exclusion instructions matter as much as inclusion ("take the colour palette, ignore the composition and the illustration style"). Second, batch references by theme: keep a dedicated color-theory batch separate from spatial or composition references so palette direction doesn't contaminate framing; one production fed each batch to the invideo agent with explicit adopt/ignore instructions per batch. Third, pull sequence-specific color references mapped to individual sequences rather than one general mood board — palette precision improves when each sequence has its own reference. Once the extraction reads right, have the invideo agent save the palette to context so every subsequent prompt reuses it instead of you re-describing the colors each shot.
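The adopt/ignore habit can be sketched as a small template that writes one instruction per reference batch. This is a hypothetical helper for drafting the text you would type to the agent; the batch names and the build_instruction function are assumptions for illustration, and no invideo API is involved.

```python
def build_instruction(batch_name, adopt, ignore):
    """Compose an explicit take/ignore instruction for one reference batch."""
    return (
        f"For the {batch_name} references: take {', '.join(adopt)}; "
        f"ignore {', '.join(ignore)}. "
        "Write a photorealistic prompt from what you take."
    )


# Hypothetical batches, kept separate so palette direction does not leak into framing.
batches = {
    "colour-theory": {
        "adopt": ["the colour palette", "the texture qualities"],
        "ignore": ["the composition", "the illustration style"],
    },
    "composition": {
        "adopt": ["the framing", "the spatial layout"],
        "ignore": ["the colour palette", "the illustration style"],
    },
}

instructions = {
    name: build_instruction(name, spec["adopt"], spec["ignore"])
    for name, spec in batches.items()
}

for text in instructions.values():
    print(text)
```

Writing the exclusions out per batch makes it harder to forget them, and the same structure extends to per-sequence references by keying the dictionary on sequence names instead of themes.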
One distinction to keep clear: if you want the animated look itself, not just its colors, the workflow is the opposite. There you deliberately load the style into the invideo agent's persistent context, with explicit constraints against photorealism; one team locked a hand-painted style that way across a full 3-minute episode. For palette control alone, extraction is the method.
The better move was to have Agent 1 read the colours and textures of them and prompt for that instead.
— invideo's creative team, on using illustrated references in AI video generation