Summary

  • OpenAI has finally enabled the native multimodal image generation capabilities of its GPT-4o model for ChatGPT users, and says it will soon extend them to its Enterprise, Edu and API customers.
  • GPT-4o was first released in May 2024, and its new image generation capabilities have been extensively tested and refined since then.
  • The new image generator is part of the same model that produces text and code, meaning the AI has been trained to understand all these forms of media at once.
  • The launch is seen as a response to Google’s recent release of native multimodal image generation in its Gemini 2.0 Flash Experimental model, and it offers higher-quality image generation with accurate text baked into images.
  • However, concerns remain over copyright issues relating to the data used to train the model, which is likely to include many artworks scraped from the web.

By Carl Franzen
