What is Google Veo 3.1?
Veo 3.1 is Google's state-of-the-art video generation model that creates high-fidelity, realistic videos. Google Veo 3.1, accessible globally via invideo, is changing the way AI is used to create videos for everyone. Veo 3.1 solves three critical issues in AI-generated videos: frame referencing, object referencing, and character consistency.
What’s new in AI video generation with Google Veo 3.1?
Google Veo 3.1 lets you create videos from text, image, and object references, with consistent characters throughout.
- Frame referencing – Create videos by just adding a first and a last frame.
- Object referencing – Replace objects in full videos.
- Character consistency – Accurate faces throughout clips.
Invideo’s partnership with Google Veo 3.1 could mean the end of AI slop.
Invideo, a trusted Google partner, offers full access to Veo 2, Veo 3, and Veo 3.1 to all its users. This partnership enables anyone to create cinematic-quality ads, brand promos, vlogs, explainers, and much more.
Google Veo 3.1 & Invideo make it easy to tell a complete story from start to finish by just adding a start frame and an end frame.
Together with Veo 3.1, Invideo enhances AI video creation with first-to-last frame control. Add a starting image and an ending image, add your prompt, and create a video clip with smooth cinematic motion that completes your story from just those two photos.
We recreated a popular old brand ad with just a few images
First frame
Last frame
Video type: Brand film
This video references the object, ensuring the model holds and uses it, and creates a full-length film worthy of being shown at the Super Bowl.
Output video
Create a brand promo film for your interior design studio
First frame
Last frame
Video type: Ads & Promos
This video is a short film of an interior decorator walking through several of her creations, with smooth transitions from one house to another. These transitions are what make the video feel incredibly real.
Output video
Create a vlog of a person going back in history and reporting live from important events
Video type: Vlog / Explainers / Social media videos
Imagine yourself as a reporter present during the construction of the Great Wall of China, the sinking of the Titanic, or the extinction of the dinosaurs.
First frame
Last frame
Output video
Create a Documentary on Rome
Video type: Vlog / Explainers / Social media videos
First frame
Last frame
Output video
Google Veo 3.1 cracks character consistency in AI videos.
The biggest pain point in AI video has been character consistency. Google Veo 3.1 has solved this problem. So far, we have only tested prompts with a single character in the clip, creating characters both from uploaded footage and from text prompts. We’ll be testing multiple characters soon.
In other AI video models, characters typically picked up minor artefacts or drifted from the first frame when scenes changed. We created two videos with multiple scenes featuring the same character, and the characters remained consistent.
So what are you going to use Google Veo 3.1 for?
- Product demos
- Social media ads
- Real estate tours
- Event promotions
- Storytelling, narratives driven by characters
- NGOs
How does Google Veo 3.1 compare with other AI video models like Sora 2?
| Parameter | Google Veo 3.1 | Other Models (Sora 2, Runway, etc.) |
| --- | --- | --- |
| Video length | 8 second video clips | 8–12 seconds |
| Voiceover | No option to add your own audio for voice cloning | Sora 2 is best-in-class here since it captures your voice sample during onboarding |
| Dialogue | Generates native audio & adds ambient noise at par with Sora 2 | Sora 2 & Veo 3.1 are comparable in generative audio & ambient noise |
| Video output quality | Supports up to 1080p natively | Sora 2 and others generate at 720p and upscale to 1080p |
| Realism | Understands transformations to create natural, cinematic motion | Basic start-to-end transitions that mostly morph or fade |
| Physics | Strong physics, realism, and prompt adherence | Sora 2 is comparable; other models like Runway are catching up |
| Temporal consistency | Allows up to three reference images to maintain visual consistency across frames | Models like Runway use one image and often lose detail |
| Lighting, faces & visual style | Understands subjects in 3D, keeping lighting and style stable | Sora 2 performs well; Runway & others need significant upgrades |
| Safety & ethics | Blocks harmful requests, marks videos with SynthID watermark | Sora 2 and Veo 3.1 lead benchmarks in safety & ethics |
Is this the end of AI Slop?
Whether you’re a creator, a filmmaker, a marketer, or a business owner, this changes everything. With Veo 3.1 & Invideo, most people, even those with a trained eye, will struggle to spot the AI fingerprint in a video. Ad films. Brand spots. Social content. Cinematic stories, created with the precision of a studio team, at the speed of thought. The future of video isn’t coming; it’s already here.


