Google has unveiled Gemini Omni, a multimodal AI model that generates and edits video from conversational inputs spanning text, images, audio, and video. The technology is launching first with Omni Flash, and it extends what AI can do in understanding and creating complex multimedia content. The development could change content creation and video editing workflows.
Background
Multimodal AI models, which process and generate several types of media, have been a major focus of AI research; earlier models were typically limited to one or two modalities. Google's Gemini series has been at the forefront of this work, competing with other major model families such as OpenAI's GPT series.
- Source: TechCrunch
- Published: May 20, 2026 at 01:45 AM
- Score: 8.0 / 10