OpenAI has finally enabled the native multimodal image generation capabilities of its GPT-4o model for ChatGPT users, and says it will soon extend them to its Enterprise, Edu and API customers.
GPT-4o was first released in May 2024, and its new image generation capabilities have been extensively tested and refined since then.
The new image generator is part of the same model that produces text and code, meaning the AI has been trained to understand all these forms of media at once.
The launch is seen as a response to Google’s recent release of native multimodal image generation in its Gemini 2.0 Flash Experimental model, and it offers higher-quality image generation with accurate text rendered directly in images.
However, concerns remain over copyright issues relating to the data used to train the model, which is likely to include many artworks scraped from the web.