What if large language models (LLMs) had "vision", the ability to understand the meaning of images? Just as LLMs have driven innovation with chatbots and text data, this ability would make another huge impact on businesses by letting LLMs look at and organize the millions of images in enterprise IT systems. In this post, we will learn how a large vision language model (VLM) works and how it could change business in the next couple of years.