OpenAI Accelerating Efforts to Release a Multimodal LLM called GPT-Vision

ODSC - Open Data Science
Sep 29, 2023

According to a report from The Information, OpenAI is accelerating efforts to release GPT-Vision, codenamed Gobi, in a bid to beat rival Google to market with an advanced multimodal LLM. The push comes a week after Google released its own multimodal LLM, Gemini, to a small group of companies for testing.

But what exactly is a multimodal LLM? According to reports, these large language models will be able to process both text and images. This means they will be able to understand and generate content that combines the two, offering expanded capabilities.

As we saw with the release of GPT-4, such a release would help OpenAI maintain both its lead and its share of the general LLM market. But it’s not yet ready. According to the same report, GPT-Vision is stuck in safety reviews.

Even so, it appears that “OpenAI’s engineers seem close to satisfying legal concerns.” These concerns have been mounting over the last couple of months, as OpenAI has faced multiple threats of lawsuits from authors and The New York Times over its training data.

As mentioned earlier, if OpenAI can pull off a release of Gobi before Google releases Gemini, it would give the AI start-up a key edge over rivals that are investing heavily in generative AI in hopes of catching up. It’s a critical advantage OpenAI is pushing not to lose.

So the race is on. OpenAI is aiming to launch Gobi before Google has a chance to release Gemini. This, of course, is due to the massive success of ChatGPT. As the first to market, OpenAI got the first exposure to new users, and it’s clear the company wants to replicate that with its multimodal LLM.

With that said, Gobi could bring some interesting possibilities to the table. It may build on GPT-4 by adding the enhanced visual and multimodal features that OpenAI previewed earlier.

The multimodal arms race is heating up, and which company releases first will likely have a major impact on the market for years to come.

Originally posted on OpenDataScience.com

