First, some background context:
Veo is Google / DeepMind’s multimodal video generation model: it can create videos from text prompts and can also take image prompts. (blog.google; Google Cloud)
Veo 3, released in 2025, introduced native audio generation (dialogue, sound effects, ambient sound) along with the video. (Google Cloud; DataCamp; blog.google)
In Google’s Flow tool, Veo is integrated with film-style capabilities (scene composition, continuity across shots, etc.). (blog.google)
In the Gemini app, Veo 3 is used to turn photos into 8-second video clips. (Gemini; blog.google)
So what you meant by “Veo 3.1 is way better at generating videos from images” most likely refers to reported upgrades beyond Veo 3.
What is known about “Veo 3.1” and what comes next
There is a recent announcement about Veo 3.1 that describes improvements over Veo 3. A few key points from the news:
- Google is adding more practical editing capabilities: you can now adjust lighting and shadows in AI-generated videos. (The Verge)
- New features are introduced, such as “Ingredients to Video” (you feed in three reference images and generate a video with audio) and “Frames to Video” (transition from a start image to an end image, with audio). (The Verge)
- Scene Extension: you can take the final second of a clip and extend it with AI-generated visuals and audio (up to one minute). (The Verge)
- It is said to better preserve consistency across shots and to handle multi-shot videos (i.e., more than just single-scene generative output), e.g., supporting multi-prompting for multi-shot videos. (TechRadar)
- It supports longer video lengths (versus Veo 3’s typically short clips) and aims for higher complexity, better character consistency, and smoother transitions. (TechRadar)
So, in essence: yes, the “3.1” version is being described as an advance over 3, particularly in image-to-video capabilities (start image, end image, control over lighting/shadows, transitions, etc.).
Comparison: Veo 3 vs. (announced) Veo 3.1
Here’s a side-by-side rundown based on what’s publicly reported:
Feature: Image-to-video / photo-to-video
  Veo 3 (current): Yes: turn a photo into a short video clip (8 seconds in Gemini) with a prompt describing motion and sound. (blog.google; Gemini)
  Veo 3.1 (announced / expected): More flexible: support for “Frames to Video” (start-to-end-frame transitions) as well as “Ingredients to Video” (multiple input images) to generate video. (The Verge)

Feature: Audio integration
  Veo 3: Fully native audio: sound effects, ambient noise, and dialogue embedded in the generated video. (Google Cloud; DataCamp; blog.google)
  Veo 3.1: Continues, possibly with more control: the new features are described as generating audio in all new modes. (The Verge)

Feature: Editing control (lighting, shadows, etc.)
  Veo 3: Limited control (you can prompt for it, but with little fine control).
  Veo 3.1: More advanced editing: the ability to adjust shadows and lighting. (The Verge)

Feature: Video extension / continuity
  Veo 3: Usually discrete short clips, with limited extension.
  Veo 3.1: Scene Extension: extend a video beyond its original length by generating additional frames and audio. (The Verge)

Feature: Consistency and multi-shot
  Veo 3: Good within short scenes, but may struggle with longer, multi-shot continuity.
  Veo 3.1: Better character/scene consistency across shots, plus multi-prompting for multi-shot videos. (TechRadar)

Feature: Public availability
  Veo 3: Available via Gemini / Flow, though with length, region, and subscription restrictions. (blog.google)
  Veo 3.1: Reports indicate it is available (or will be) through the same platforms (Flow, Gemini API) under “paid preview” terms. (The Verge)
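As a back-of-envelope picture of Scene Extension (growing a clip toward the reported one-minute cap), here is a small planning helper. The per-call step size is an assumption for illustration only, since the announcements do not state how much footage each extension adds:

```python
def plan_extensions(base_seconds: float, step_seconds: float = 7.0,
                    target_seconds: float = 60.0) -> list[float]:
    """Illustrative only: running clip length after each Scene Extension
    call, assuming each call appends `step_seconds` of new footage and
    output is capped near one minute. Both numbers are assumptions."""
    lengths: list[float] = []
    total = base_seconds
    while total + step_seconds <= target_seconds:
        total += step_seconds
        lengths.append(total)
    return lengths
```

Starting from an 8-second clip, this schedules repeated extensions until the next one would exceed the one-minute cap.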
So, in brief: based on public reports, yes: Veo 3.1, as announced, is expected to be better than Veo 3 (more flexible, more controllable, more capable), especially in how it handles image inputs, transitions, audio, and extended sequences.
Caveats & What We Don’t (Yet) Know
The announcements around Veo 3.1 are relatively recent, and some features are marked as “coming soon” or in “preview.” It is possible that not all of the promised capabilities are fully stable or widely available yet. (The Verge)
The improvement claims are based on Google/DeepMind’s announcements and media coverage. Independent benchmarks or peer-reviewed evaluations are scarce at this point.
Region, subscription tier, or API access may limit use of the new features for many users.
The actual quality gains (e.g., how much “better” results look, how robust transitions are, how consistent characters stay) can depend heavily on the prompt, the source images, and the limits of the computation.
