Real-World Testing of Vidu Q1 Reference Generation: Seeing Zhuge Liang, Churchill, and Napoleon Take Photos at the Great Wall
Vidu Q1's reference generation feature simplifies video creation, enabling one person to produce professional content with just three steps, showcasing impressive results across diverse scenarios.

This time it is truly different: I have met the “God of Imagination”! I used to joke about having to “work like a whole team”; thanks to AI, that is now actually possible.
Shengshu Technology recently launched the reference generation feature for its AI video model Vidu Q1. It greatly simplifies the traditional content creation process and truly makes one person “a whole crew”!
These characters should be familiar to everyone.
Zhuge Liang, who waves his feather fan and says “Who would have thought there are such shameless people” in countless meme videos; Winston Churchill, Britain’s wartime Prime Minister; and Napoleon, whose military record is well documented, now cross time and space to sit in the same meeting room, deep in discussion. Call it a “summit of the centuries”!
With a traditional AI image-to-video workflow, this would require script writing, image generation, blending, and editing. Here, it took only three images and Vidu Q1’s reference generation feature!
Like the old joke about putting an elephant in a refrigerator, it takes just three steps: upload photos, write a prompt, and generate the video.

Seeing this, you can probably tell how extraordinary Vidu Q1’s reference generation feature is.
By uploading reference images of characters, props, and scenes, it can directly fuse multiple elements into a video, achieving “zero storyboard creation,” making it truly “ready to use.”
Many users have shared works created with Vidu Q1’s reference generation on social platforms. The styles vary widely, and the imagination on display is remarkable.
For example, user Scobleizer posted a video where a sculpture with only a head jumps and bounces out of a garage door, making exaggerated expressions.
According to Scobleizer, he used only two photos: one of the HP garage, known as the birthplace of Silicon Valley, and one of the sculpture. After uploading these to Vidu Q1’s reference generation, he used a simple prompt: “Open the garage door in Image1 revealing the dancing statue inside from Image2.”
The entire video is smooth, and the sculpture’s expressions are lively and interesting.
Another creative video features a cat taking selfies in the forest while a lion slowly approaches behind.
This is a masterpiece by user SohelBloom, and the prompt was just a simple sentence: “The cat (Image1) is taking a selfie with the phone (Image2), while the lion (Image3) is approaching from behind...”
Another impressive example comes from artist and programmer Alex, who put the 1989 Batman and the T-Rex from 1993’s Jurassic Park into a single scene for a fierce “battle” that looks like a Hollywood blockbuster.
After seeing such amazing videos, we decided to test Vidu Q1 ourselves (note: every result below comes from a single generation, with no cherry-picking from multiple attempts).
Visit the official website to try the new “reference generation” feature:

Experience link: https://www.vidu.cn/create
We love Studio Ghibli’s animations. As kids, we dreamed of living inside those worlds; as adults, we want to create such comics. What if your characters appeared in a Miyazaki film?
We tried with a rough sketch of a child, and two classic scenes from “My Neighbor Totoro.” The prompt was simple:

The generated video looks like this:
As you can see, the character, originally a simple sketch, enters Miyazaki’s world, maintaining its features while matching the scene’s style, as if truly stepping into a fairy tale.
Dreams come true—everyone can be a manga artist in the AI era!
Another highlight is the high video quality, thanks to Vidu Q1’s ability to output 1080p resolution videos. Whether it’s epic sci-fi stories, adorable cartoons, or detailed facial expressions, everything is vividly clear.
Let’s try one more!
This time, we invited the Palace Museum’s big orange cat to perform a show!
Prompt: “Big orange cat holding a red tassel spear, practicing Chinese kung fu in the woods.”

The result:
The video closely matches the prompt, and the size of the red-tasseled spear was automatically scaled to suit the big orange cat’s body.
Next, let’s push the limits! Vidu Q1 currently supports up to 7 input images, which can include characters, scenes, and props. We will use all seven slots to test its capabilities.
Let’s take the classic characters—Zhuge Liang, Churchill, and Napoleon—and move their scene to the Great Wall for more interaction.
First, prepare their iconic items: Zhuge Liang’s fan, Churchill’s black bowler hat, Napoleon’s sword, and a picture of the Great Wall.

Then, the prompt:
[@Image 1] Holding [@Image 5]’s fan, [@Image 2]’s man wearing [@Image 4]’s black hat, [@Image 3]’s holding [@Image 6]’s sword, taking a group photo at [@Image 7], interacting and making a “peace” pose.
(Tip: When writing prompts, you can use “@” to select from uploaded images!)
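With seven images in play, it is easy to lose track of which slot is which. The sketch below is only an illustration of that bookkeeping: it scans a prompt for placeholders written the way they appear in this article ("[@Image N]") and reports which uploaded slots are referenced, unused, or out of range. The function name check_prompt and its return format are hypothetical and are not part of Vidu's product or any official API; the limit of 7 images is the one stated above.

```python
import re

MAX_REFERENCE_IMAGES = 7  # limit stated in the article

# Matches placeholders written as in the article's prompts, e.g. "[@Image 5]".
PLACEHOLDER = re.compile(r"\[@Image (\d+)\]")


def check_prompt(prompt: str, uploaded: int) -> dict:
    """Hypothetical helper: compare placeholders in a prompt with the uploaded slots."""
    if not 1 <= uploaded <= MAX_REFERENCE_IMAGES:
        raise ValueError(f"expected 1 to {MAX_REFERENCE_IMAGES} images, got {uploaded}")
    referenced = {int(n) for n in PLACEHOLDER.findall(prompt)}
    valid = set(range(1, uploaded + 1))
    return {
        "referenced": sorted(referenced),
        "unused": sorted(valid - referenced),
        "out_of_range": sorted(referenced - valid),
    }


# The Great Wall prompt from the article (quotes simplified to ASCII).
great_wall_prompt = (
    "[@Image 1] Holding [@Image 5]'s fan, [@Image 2]'s man wearing "
    "[@Image 4]'s black hat, [@Image 3]'s holding [@Image 6]'s sword, "
    "taking a group photo at [@Image 7], interacting and making a 'peace' pose."
)

report = check_prompt(great_wall_prompt, uploaded=7)
print(report)
```

For the Great Wall prompt, all seven slots are referenced, so both the "unused" and "out_of_range" lists come back empty.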
The result:
Zhuge Liang gently shakes his feather fan, Churchill wears his iconic hat, Napoleon walks with his sword, and the three pose for a photo at the Great Wall, just like tourists, capturing a classic scene.
It looks quite accurate, with natural interaction among characters. The seamless transition shows Vidu Q1’s impressive capability!
There are small flaws, though: Zhuge Liang’s fan vanishes as if by magic when he waves it, and Napoleon simply tosses his sword away. A bit odd, but forgivable.
Let’s continue the adventure: moving them to the Iron Throne from “Game of Thrones.”
Prompt: [@Image 1] holding [@Image 5]’s fan, walking with [@Image 2] and [@Image 3] to the throne, then [@Image 1], [@Image 2], and [@Image 3] pose for a photo!
The performance remains excellent, but the fan almost covers Napoleon’s face — maybe it needs to be lowered? Churchill’s pose is better.
From these scenes, it’s clear that whether in the meeting room, at the Great Wall, or before the Iron Throne, the characters stay highly consistent, enough that the clips could be cut together seamlessly with transitions. Achieving the same result with traditional editing would take enormous time and effort.
However, some frames show a slight “cutout” look, which suggests room for improvement in layer blending. Still, overall, Vidu Q1’s reference generation is very powerful and simple to operate, and it can bring an idea to the screen in just three steps. It is almost a director’s dream!
Finally, about the cost: creating a 5-second, 1080p video with Vidu Q1’s reference generation costs only 20 points. The current standard plan costs 48 yuan/month and includes 800 points, so a single video works out to about 1.2 yuan, less than a bottle of mineral water. Very affordable!
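The per-video cost follows directly from the figures quoted above. The short calculation below is just that arithmetic written out; the variable names are illustrative, and the numbers are the plan price, point allowance, and per-video point cost stated in this article, which may change.

```python
# Figures quoted in the article (standard plan, 5-second 1080p video).
PLAN_PRICE_YUAN = 48
PLAN_POINTS = 800
POINTS_PER_VIDEO = 20

# Cost of one video: the plan price scaled by the share of points one video uses.
cost_per_video = PLAN_PRICE_YUAN * POINTS_PER_VIDEO / PLAN_POINTS

# How many such videos the monthly allowance covers.
videos_per_month = PLAN_POINTS // POINTS_PER_VIDEO

print(f"{cost_per_video:.2f} yuan per video")   # 1.20 yuan per video
print(f"{videos_per_month} videos per month")   # 40 videos per month
```

In other words, the standard plan covers 40 such videos a month at roughly 1.2 yuan each.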
Interested users can try it out and experience their “director’s dream”!