Q: How do I tell the AI what to take from a reference image and what to ignore?

Tell the AI that the reference is for color theory only and that it should ignore the reference's composition, scale, and subject. When using multiple references, batch them by theme and give each batch its own adopt and ignore instructions.

Q: How can I make an extracted color palette reproducible across multiple shots?

Ask the AI to name extracted tonal modes with exact hex values, such as split-toned amber and emerald with hex anchors. Store this color-and-texture block in the AI's context so subsequent shots inherit it automatically without re-prompting.

Extract Color & Texture from Reference Images for AI Prompts

Q: Why shouldn't I drop an illustrated or animated reference directly into an AI video prompt?

Dropping an illustrated or animated reference directly into a prompt causes the model to replicate the source aesthetic rather than just borrowing its color and texture. The better approach is to have the AI read and translate the palette and textures into prompt language for your target style.

Q: When should I skip color extraction and upload reference images directly?

If your reference style matches your intended output style, such as animated frames for an animated film, skip extraction and upload the frames directly with an instruction to save the art style to context. Extraction is specifically for when the reference style and output style differ.

Extract colour and texture from a reference by instructing the AI to read the image's palette and texture qualities and translate them into prompt language for your target style — never by dropping an illustrated or animated reference directly into the prompt, which makes the model copy the source aesthetic instead of borrowing its look.

Start with the failure mode this technique solves: dropping an illustrated or animated reference image straight into a video prompt does not work — the model replicates the cartoon aesthetic rather than borrowing its colour and texture. "The better move was to have Agent 1 read the colours and textures of them and prompt for that instead," as invideo's creative team documented after testing both approaches. invideo is an agentic video creation tool with all the current video and image models available, so the steps below run through the invideo agent rather than a raw prompt box.

1. Upload the reference and ask for a read, not a copy. Instruct the invideo agent to read the colour palette and texture qualities of the image and write them into a prompt for your target rendering style — for example: "read the warm amber-green split tones and the rough surface texture of this frame, and translate them into a photorealistic scene." In one documented production, the generations "came back hyper-realistic with the exact colour temperature I was looking for" — the invideo agent understood creative intent from the image rather than ripping it off.

2. Say what to take and, just as importantly, what to leave out. Tell the invideo agent the reference is there for colour theory only, and to ignore its composition, scale, and subject. When no single image explains the look, batch references by theme (one batch for colour, another for texture or spatial logic) and give each batch its own adopt/ignore instructions; one production requested 3 grids per generation round from references batched this way. Exclusion prompting is what keeps unwanted attributes from leaking into the output.

3. Quantify the extracted palette so it's reproducible. Have the invideo agent name what it read as tonal modes with exact hex values — e.g. "Mode A — split-toned amber and emerald" plus its hex anchors. One production encoded a director's entire colour philosophy as named modes this way; hex-anchored modes let you repeat the exact palette across every shot instead of re-describing colours from memory each time. Anchor lighting language to the reference too — "warm yellow from the lamps only, like all the refs" produces more accurate results than generic "warm lighting."
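Steps 2 and 3 can be kept consistent by writing the adopt/ignore instructions and the named tonal modes down as data rather than retyping them. The following is a minimal Python sketch of that idea, not part of invideo: the class names, the wording of the generated sentences, and the two hex values (commonly cited codes for amber and emerald) are illustrative assumptions, not values from the production described above.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Tuple

# Accepts only 6-digit hex colours such as "#FFBF00".
HEX_RE = re.compile(r"^#[0-9A-Fa-f]{6}$")


@dataclass(frozen=True)
class ReferenceBatch:
    """One themed batch of references with its own adopt/ignore rules (hypothetical helper)."""
    theme: str
    adopt: Tuple[str, ...]
    ignore: Tuple[str, ...]

    def instruction(self) -> str:
        # Spell out both what to take and what to leave out.
        return (
            f"Batch '{self.theme}': use these references for "
            f"{', '.join(self.adopt)} only; ignore their {', '.join(self.ignore)}."
        )


@dataclass(frozen=True)
class TonalMode:
    """A named tonal mode anchored to exact hex values (hypothetical helper)."""
    name: str
    description: str
    hex_anchors: Tuple[str, ...]

    def __post_init__(self) -> None:
        # Reject vague colour words so every anchor stays reproducible.
        for value in self.hex_anchors:
            if not HEX_RE.match(value):
                raise ValueError(f"not a 6-digit hex colour: {value!r}")

    def line(self) -> str:
        return f"{self.name}: {self.description} ({', '.join(self.hex_anchors)})"


def palette_block(modes: Iterable[TonalMode]) -> str:
    """Join the modes into one reusable colour block, one mode per line."""
    return "\n".join(mode.line() for mode in modes)


# Illustrative values only; replace with what the agent actually reads from your references.
colour_batch = ReferenceBatch(
    theme="colour",
    adopt=("colour theory",),
    ignore=("composition", "scale", "subject"),
)
mode_a = TonalMode(
    name="Mode A",
    description="split-toned amber and emerald",
    hex_anchors=("#FFBF00", "#50C878"),
)

print(colour_batch.instruction())
print(palette_block([mode_a]))
```

The validation in `TonalMode` is the point of the sketch: a mode that cannot be written as hex anchors is a sign the palette has not really been quantified yet.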

Once extracted, treat the colour-and-texture block as a fixed component: place it in the same position in every prompt (one production held a fixed 9-element assembly order with palette as its own slot across the whole film), and tell the invideo agent to store the block in context so subsequent shots inherit it without re-prompting.
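A fixed assembly order is easy to enforce with a small helper. The sketch below is an assumption-laden illustration in Python: the source does not list the nine elements that production used, so the five slot names here are hypothetical placeholders, and `assemble_prompt` is an invented name rather than an invideo feature.

```python
from typing import Dict

# Hypothetical slot order; the production's actual 9 elements are not given in the source.
SLOT_ORDER = ("subject", "action", "setting", "palette", "lighting")


def assemble_prompt(parts: Dict[str, str]) -> str:
    """Join prompt parts in the fixed slot order, skipping slots that are missing or empty."""
    unknown = set(parts) - set(SLOT_ORDER)
    if unknown:
        # Catch typos such as "pallete" instead of silently dropping the palette.
        raise KeyError(f"unknown slots: {sorted(unknown)}")
    return " ".join(parts[slot].strip() for slot in SLOT_ORDER if parts.get(slot))


# The palette text stays identical from shot to shot; only the other slots change.
PALETTE = "Mode A: split-toned amber and emerald (#FFBF00, #50C878)."

shot_1 = assemble_prompt({
    "palette": PALETTE,
    "subject": "A woman at a workbench.",
    "lighting": "Warm yellow from the lamps only.",
})
print(shot_1)
```

Because the order lives in `SLOT_ORDER` rather than in each prompt, the palette lands in the same position no matter what order the parts are supplied in.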

One boundary case: if your reference's style is already your target style (say, animated frames for an animated film), skip extraction and upload the frames directly with an instruction to save the art style to context; extraction is specifically for when reference style and output style differ. Any colour nuance you can't fully lock at generation can be finished in the grade afterwards.
