Search Results
3 results for “multimodal AI”
Models
Gemini
Gemini is a family of multimodal large language models developed by Google DeepMind, designed to natively process and generate text, code, images, audio, and video across a range of model sizes.
6 min readUpdated May 2026
Foundations
Multimodal AI
Artificial intelligence systems that can process, understand, and generate information across multiple data types simultaneously, including text, images, audio, video, and other modalities.
5 min readUpdated May 2026
Foundations
Transformer Architecture
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel, forming the foundation of modern large language models and multimodal AI systems.
7 min readUpdated May 2026