How to Use Google Gemini Omni: A Step-by-Step Guide
Google Gemini Omni generates video, character-consistent clips, and voiceovers from a browser: no software to install, no GPU required. This step-by-step guide covers creating your account, understanding the three model types, writing effective prompts, and getting the most out of your credit balance.
By the end of this guide you’ll have generated your first video and understand how the credit system works, so you don’t waste credits.
Step 1: Sign up and choose a plan
Create an account with your email or Google account. Note: there is no free tier. You can sign up and browse the dashboard for free, but generating requires a paid plan. Plans start at $49/mo for 1,200 credits.
A 1080p, 8-second video costs approximately 150 credits. At the Starter tier, that’s about 8 full videos per month. If you need more, Pro at $69/mo gives you 4,000 credits (~26 videos/month at 1080p) and includes API access.
See the full pricing breakdown including the yearly discount options.
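The arithmetic behind those per-plan estimates can be checked with a few lines of Python. This is a minimal sketch that uses only the figures quoted in this step; the dictionary layout and function names are illustrative and not part of any Gemini Omni tooling.

```python
# Figures quoted in this guide; the per-clip cost is approximate.
PLANS = {
    "Starter": {"price_usd": 49, "credits": 1200},
    "Pro": {"price_usd": 69, "credits": 4000},
}
CREDITS_PER_1080P_8S_CLIP = 150


def clips_per_month(credits: int, cost_per_clip: int = CREDITS_PER_1080P_8S_CLIP) -> int:
    """Whole clips that fit in a monthly credit balance."""
    return credits // cost_per_clip


def usd_per_clip(price_usd: float, credits: int,
                 cost_per_clip: int = CREDITS_PER_1080P_8S_CLIP) -> float:
    """Effective price per clip if the whole balance goes to clips of this cost."""
    return round(price_usd / clips_per_month(credits, cost_per_clip), 2)


for name, plan in PLANS.items():
    count = clips_per_month(plan["credits"])
    price = usd_per_clip(plan["price_usd"], plan["credits"])
    print(f"{name}: {count} clips/month at 1080p, about ${price} per clip")
```

Starter works out to 8 clips and Pro to 26, matching the estimates above. Any credits left over after the last whole clip are ignored here.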
Step 2: Understand the three model types
Gemini Omni has three distinct generation modes. Understanding which one to use is the most important decision before you click Generate.
Text to Video
The workhorse. You type a description and the model generates a clip. It supports:
- Text-only input (most common)
- Image + text input (image-to-video mode: your image becomes the first frame)
- Video + text input (video-to-video restyle)
Best for: B-roll, ad creative, atmospheric footage, product visualization, abstract visuals.
Character Video
Same as text-to-video, but you supply a reference photo and the model keeps that character consistent across every clip you generate from it. The reference photo acts as an anchor: the generated character is meant to look like the person or avatar in that photo every time.
Best for: brand presenters, e-learning instructors, UGC-style content, ad series with a consistent face.
AI Voice / Lip-Sync
Text-to-speech with natural prosody, available in dozens of languages. Can also be combined with video for lip-sync: you provide the video and the script, and the model generates a version where the character’s mouth matches the audio.
Best for: voiceovers, narration, multilingual content, character videos where the character needs to “speak.”
Step 3: Open the Playground
The Playground is on the homepage. You’ll see three tabs: Video, Character, Voice. Start with Video.
You don’t need to configure anything before your first generation: the defaults (1080p, 8 seconds) are sensible for most use cases.
Step 4: Write your first prompt
The single biggest factor in output quality is prompt quality. A weak prompt produces generic output. A specific prompt gets you much closer to what you need.
The anatomy of a strong video prompt
Every good video prompt has five elements:
1. Subject: who or what is in the video?
“A golden retriever puppy” (specific) vs “a dog” (generic)
2. Action: what is the subject doing?
“sprinting through a sunlit wheat field, ears flapping”
3. Setting: where, when, what conditions?
“late afternoon, golden hour, slight warm breeze, summer”
4. Style: what visual quality or aesthetic?
“cinematic 4K, shallow depth of field, warm color grade”
5. Camera movement: how does the frame move?
“slow dolly forward, low angle”
Assembled prompt:
“A golden retriever puppy sprinting through a sunlit wheat field, ears flapping, late afternoon golden hour, slight warm breeze, cinematic 4K shallow depth of field, slow dolly forward at a low angle”
Compare that to “a dog running”: the difference in output quality is significant.
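If you write many prompts, it can help to keep the five elements as separate fields and join them at the end, so no element gets forgotten. The sketch below is one way to do that; the `VideoPrompt` class and its field names are hypothetical helpers for illustration, not something the Playground provides.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """The five prompt elements described above, kept as separate fields."""
    subject: str
    action: str
    setting: str
    style: str
    camera: str

    def assemble(self) -> str:
        # Subject and action read as one phrase; the rest are comma-separated.
        parts = [f"{self.subject} {self.action}", self.setting, self.style, self.camera]
        # Empty elements are skipped so the result has no stray commas.
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = VideoPrompt(
    subject="A golden retriever puppy",
    action="sprinting through a sunlit wheat field, ears flapping",
    setting="late afternoon golden hour, slight warm breeze",
    style="cinematic 4K shallow depth of field",
    camera="slow dolly forward at a low angle",
)
print(prompt.assemble())
```

This reproduces the assembled prompt shown above. Swapping a single field, such as `camera`, gives a controlled variation for testing.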
Prompt examples by use case
Product shot (e-commerce):
“A premium sneaker rotating slowly on a minimalist white platform, dramatic studio lighting with subtle blue rim light, professional product photography, 4K”
Abstract / tech:
“Glowing data streams flowing through dark abstract networks, blue and white particles, deep space background, technology aesthetic, 4K, slow camera drift”
Real estate exterior:
“Modern home exterior at golden hour, subtle forward camera push, lush landscaping, professional real estate style, 4K”
Vertical social content:
“A barista pouring latte art in a cozy coffee shop, warm lighting, slow motion steam rising, vertical 9:16 format for Instagram Reels”
Step 5: Set resolution and duration
After writing your prompt, set:
Resolution:
- 720p: fast and cheap; use for drafts and iteration
- 1080p: standard for social media, YouTube, and most uses (default)
- 4K: for broadcast, large screens, and hero assets
Credit cost scales with resolution: roughly 75 credits (720p), 150 credits (1080p), 300 credits (4K) for an 8-second clip.
Duration:
- Range: 4–10 seconds per clip
- 8 seconds is the default and works for most use cases
- For longer sequences, generate multiple clips and edit them together
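For a longer sequence, you first need to decide how many clips to generate and how long each should be. The sketch below splits a target runtime into clip lengths that each fit the 4–10 second range. It assumes durations are whole seconds, which this guide does not confirm, and `plan_clips` is a hypothetical helper name.

```python
import math

MIN_CLIP_S, MAX_CLIP_S = 4, 10  # per-clip range quoted in this guide


def plan_clips(total_seconds: int) -> list[int]:
    """Split a target runtime into the fewest clips, each 4 to 10 seconds long."""
    if total_seconds < MIN_CLIP_S:
        raise ValueError("target is shorter than the minimum clip length")
    # Fewest clips that can cover the runtime at the maximum clip length.
    n = math.ceil(total_seconds / MAX_CLIP_S)
    # Spread the seconds as evenly as possible across those clips.
    base, extra = divmod(total_seconds, n)
    return [base + 1] * extra + [base] * (n - extra)


print(plan_clips(25))  # three clips of 9, 8 and 8 seconds
print(plan_clips(30))  # three clips of 10 seconds each
```

Even lengths keep the pacing consistent when the clips are edited together. If your scenes call for uneven cuts, adjust the lengths by hand.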
Step 6: Generate and review
Hit Generate. Most jobs complete in 30–90 seconds. You’ll see a progress indicator in the Playground; when it’s done, the video appears inline.
If the output isn’t quite right:
- Adjust the prompt: add more detail on the specific element that’s off
- Change the style modifier: try “documentary style” vs “cinematic” vs “photorealistic”
- Try again: diffusion models have built-in randomness, so the same prompt can produce meaningfully different results on each run
Step 7: Download and use
Click the download button to get your MP4. The file is yours: a commercial license is included on all paid plans. Use it in:
- Social media (TikTok, Reels, YouTube)
- Paid advertising (Meta, Google, YouTube ads)
- Client deliverables
- Embedded in products you sell
See the full commercial license terms for what’s included.
Step 8: Find your jobs in History
All your generations are saved in /history. You can:
- Filter by model type, status, or date
- Re-run any job with the original settings (useful when you find a prompt that works)
- Download outputs again if you forgot to save them
- Export to CSV for billing reconciliation
- Bulk delete jobs you no longer need
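Once you have a CSV export, billing reconciliation mostly comes down to summing credits per category. The sketch below shows the idea using Python's standard `csv` module. The column names (`job_id`, `model`, `resolution`, `credits`) and the sample rows are assumptions made for illustration; check the header row of your actual export and adjust them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export contents: the real column names may differ.
SAMPLE_EXPORT = """job_id,model,resolution,credits
j1,video,720p,75
j2,video,720p,75
j3,video,1080p,150
j4,video,4K,300
"""


def credits_by(csv_text: str, column: str) -> dict[str, int]:
    """Sum the credits column, grouped by the values of another column."""
    totals: dict[str, int] = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[column]] += int(row["credits"])
    return dict(totals)


by_resolution = credits_by(SAMPLE_EXPORT, "resolution")
print(by_resolution)
print("total:", sum(by_resolution.values()))
```

For a real file, read it with `open(path, newline="")` and pass the text in. Comparing the total against your plan's monthly credits shows how much of the balance went to drafts versus finals.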
Using the Character Video model
The character workflow has one extra step: uploading a reference photo.
- Go to the Playground and click the Character tab
- Click “Upload reference” and select a photo of the person or avatar you want to use
- The photo is saved to your account, so you can reuse it across as many generations as you want
- Write your prompt describing the scene: “The character presenting to camera in a modern boardroom, confident tone”
- Generate: the character in the output will match your reference photo
Tips for character consistency:
- Use a clear, well-lit frontal photo of the character for the reference
- The character works better in foreground scenes than extreme wide shots
- You can place the same character in completely different environments just by changing the prompt
- Generate 5–10 clips with different scenes to build a full video series
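For a series, one simple approach is to keep the framing fixed and vary only the scene, so the clips differ in environment but not in how the character is shot. The sketch below builds such a prompt list. The scene descriptions and the `series_prompts` helper are made up for illustration, and the default framing follows the tip above about foreground scenes.

```python
def series_prompts(scenes: list[str],
                   framing: str = "medium shot, character in the foreground") -> list[str]:
    """Build one prompt per scene, sharing the same framing across the series."""
    return [f"The character {scene}, {framing}" for scene in scenes]


# Example scenes; replace with the episodes of your own series.
SCENES = [
    "presenting to camera in a modern boardroom",
    "walking through a warehouse while explaining a product",
    "sitting at a desk in a home office, speaking to camera",
]

for prompt in series_prompts(SCENES):
    print(prompt)
```

Each prompt is then submitted on the Character tab with the same reference photo.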
Using the AI Voice model
- Click the Voice tab in the Playground
- Select your language and voice profile
- Type or paste your script (up to ~5,000 characters)
- Hit Generate: you’ll get an audio file and, if you attach a character clip, a lip-synced video
For lip-sync specifically:
- Generate your character video first (save the output URL from /history)
- Go to the Voice tab, paste your script
- Attach the character video as the input
- Generate: the output is a video with the character’s mouth synchronized to your script
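Scripts longer than the roughly 5,000-character limit mentioned above have to be split across several generations. The sketch below splits on sentence boundaries so no chunk ends mid-sentence. It treats 5,000 as a hard limit, which is an assumption since the guide only gives an approximate figure, and `split_script` is a hypothetical helper.

```python
import re

MAX_CHARS = 5000  # the guide quotes "up to ~5,000 characters"; treated as a hard cap here


def split_script(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters, at sentence ends."""
    # Break after ., ! or ? followed by whitespace; that whitespace becomes one space.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if len(sentence) > limit:
            raise ValueError("a single sentence exceeds the limit; shorten it by hand")
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_script("One. Two. Three.", limit=10))
```

With the small `limit` in the example, the first two sentences share a chunk and the third starts a new one. Generating each chunk with the same language and voice profile keeps the narration consistent.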
Optimizing your credit usage
Credits are precious, especially on Starter. Here’s how to use them efficiently:
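Draft at 720p, final at 4K. A 720p draft costs half the credits of a 1080p clip and a quarter of a 4K one (roughly 75 vs 150 vs 300 credits for 8 seconds). Iterate on the prompt at 720p, then render a single 4K final once the prompt is dialed in.
Short duration for A/B testing. Generate 4-second clips when testing variations of a prompt. Assuming cost scales with duration (check the credit estimate to confirm), that is about twice as many tests as the 8-second default for the same credits.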
Re-run winners, not losers. When you find a prompt that produces a good result, use the re-run feature in /history rather than typing the same prompt from scratch.
Use the credit estimate. The Playground shows you the credit cost before you generate. Check it before hitting Generate on anything.
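To see how much the draft-then-final workflow saves, you can compare it with iterating at full resolution. The sketch below uses the 8-second costs from Step 5 and assumes cost scales linearly with duration, rounded up. That scaling is an assumption, not documented behavior, so treat the Playground's credit estimate as the authoritative number.

```python
import math

# Approximate credits for an 8-second clip, as quoted in Step 5.
CREDITS_8S = {"720p": 75, "1080p": 150, "4K": 300}


def clip_cost(resolution: str, seconds: int = 8) -> int:
    """Estimated credits for one clip (assumes linear scaling with duration)."""
    return math.ceil(CREDITS_8S[resolution] * seconds / 8)


def workflow_cost(drafts: int, draft_res: str = "720p", draft_seconds: int = 8,
                  final_res: str = "4K", final_seconds: int = 8) -> int:
    """Credits for a number of draft iterations plus one final render."""
    return drafts * clip_cost(draft_res, draft_seconds) + clip_cost(final_res, final_seconds)


print("5 drafts at 720p + 4K final:", workflow_cost(5))
print("5 four-second drafts + 4K final:", workflow_cost(5, draft_seconds=4))
print("6 attempts, all at 4K:", 6 * clip_cost("4K"))
```

Under these assumptions, five 720p drafts plus one 4K final cost 675 credits, against 1,800 for six 4K attempts. On Starter's 1,200 credits, only the first workflow fits in a month.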
Getting started today
The fastest path from zero to your first video:
- Create your account
- Choose a plan that fits your volume: Starter for occasional use, Pro if you’re generating weekly
- Open the Playground
- Try this prompt to start: “A lone lighthouse on a rocky cliff, storm waves below, dark clouds, cinematic 4K wide shot, slow push forward”
- Download, use commercially, and build from there
Ready to generate your first video?
Try the Playground: no configuration required.