
Vision-Language Model (VLM)

A model that processes both images and text, enabling tasks like image captioning, visual question answering, and document understanding. VLMs extend language models with visual perception by encoding images into the same representation space as text.
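One common way to place images in the same representation space as text is to project the vision encoder's patch features into the language model's embedding dimension and then treat them as extra tokens. The sketch below illustrates only that step, in plain Python. The names `project` and `build_multimodal_sequence`, the toy dimensions, and the random weights are all hypothetical, and real VLMs use trained encoders and may combine the modalities differently.

```python
import random

def project(patch_features, weights):
    """Linearly map each patch feature vector into the text embedding dimension.

    weights has shape (vision_dim, text_dim), stored as a list of rows.
    """
    return [
        [sum(f * w for f, w in zip(patch, column)) for column in zip(*weights)]
        for patch in patch_features
    ]

def build_multimodal_sequence(patch_features, text_embeddings, weights):
    """Prepend projected image patches to the text token embeddings."""
    image_tokens = project(patch_features, weights)
    return image_tokens + text_embeddings

# Toy sizes and random values, chosen only for illustration (assumptions).
rng = random.Random(0)
VISION_DIM, TEXT_DIM = 4, 3
weights = [[rng.uniform(-1, 1) for _ in range(TEXT_DIM)] for _ in range(VISION_DIM)]
patches = [[rng.uniform(-1, 1) for _ in range(VISION_DIM)] for _ in range(2)]
text = [[rng.uniform(-1, 1) for _ in range(TEXT_DIM)] for _ in range(5)]

# Two image "tokens" followed by five text tokens, all of width TEXT_DIM.
sequence = build_multimodal_sequence(patches, text, weights)
```

After projection, every element of `sequence` has the same width, so a language model can attend over image and text positions in a single pass.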

Related terms

Multimodal AI
Large Language Model (LLM)
Computer Vision

Related tools

Freemium
ChatGPT

ChatGPT is an AI-powered chatbot tool that helps you automate customer support inquiries with a free tier to get started.

Chatbot
Partner, Freemium
WriteSonic

Track AI visibility across ChatGPT and 10+ AI platforms. Monitor mentions, fix citation gaps, create and refresh content, target Reddit & UGC forums.

Chatbot
Free
Gemini

Meet Gemini, Google’s AI assistant. Get help with writing, planning, brainstorming, and more. Experience the power of generative AI.

Chatbot