OpenAI Adds Image Generation Capability to GPT-4o, Can Render Text and Offers Prompt-Based Editing

OpenAI has integrated the 4o Image Generation model into GPT-4o, which can be accessed via ChatGPT.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 27 March 2025 13:10 IST
Photo Credit: OpenAI

Several ChatGPT users have posted Ghibli-style recreations of their photos on social media

Highlights
  • The feature is available to ChatGPT Plus, Team, and Pro subscribers
  • ChatGPT free users will get access to it in the coming weeks
  • OpenAI said all generated images come with C2PA metadata

OpenAI added image generation capability to its existing GPT-4o artificial intelligence (AI) model on Tuesday. The San Francisco-based AI firm released the 4o Image Generation model and integrated it into GPT-4o. The company said that the image generator prioritises usefulness over decorativeness. It offers accurate text rendering, high prompt adherence, character consistency, and image editing via text prompts. OpenAI has also taken several steps to mitigate the risk of deepfakes and the generation of harmful content.

ChatGPT Gets Enhanced Image Generation Capability

Even before this new addition, ChatGPT could generate images powered by one of the DALL-E models. However, this was a basic image-generation experience where character consistency and text generation were sub-par. In a blog post, the company explained that it now intends to add the image-generation function as a primary capability of language models.

Image generated using GPT-4o
Photo Credit: OpenAI

This means that the company's large language models (LLMs) will now be able to natively generate images and make edits to generated outputs. Owing to their large parameter counts and post-training, these models are well suited to understanding the context behind user prompts and producing what users are looking for. Also, since these are language models, they can process and render text more accurately.

The new image generator was trained on the joint distribution of online images and text. OpenAI claims that the model understands how images relate to language and how images relate to other images. As a result, it now comes with enhanced character consistency, and users can generate multiple images with the same character without much back-and-forth.

Images with text generated using GPT-4o
Photo Credit: OpenAI/Derya Unutmaz and Les Morgan

It can also generate images containing a large volume of accurate text, such as signboards, restaurant menus, and writing on a whiteboard. Users can also share an image as input, and the chatbot can recreate it in different styles and make edits to it.

ChatGPT will also offer multi-turn generation with the latest image generator. Users will be able to ask the AI chatbot to make changes and additions to a generated image with prompts, and it can refine the output without changing other elements. OpenAI claimed that the model can handle up to 10-20 different objects in a single image and add these elements accurately.

Photorealistic image generated using GPT-4o
Photo Credit: OpenAI

These features are currently available to ChatGPT Plus, Team, and Pro subscribers. While the feature was also meant to reach the free tier, OpenAI CEO Sam Altman stated in a post on X (formerly known as Twitter) that, due to high request volume, the rollout to the free tier is being delayed for a while.

Notably, several users have taken to social media platforms to share Ghibli-style recreations of their photos and popular memes generated using GPT-4o. Altman also changed his profile picture on X to a Ghibli-style rendition of his photo. Ghibli was also trending globally on the social platform.

Coming to safety, OpenAI is adding Coalition for Content Provenance and Authenticity (C2PA) information into the metadata of all the AI-generated images so that they can easily be distinguished from authentic images. The AI firm has also built an internal search tool that can verify if an image was generated by the company's model.

Apart from this, the company blocks requests for images that include harmful content such as child sexual abuse material and sexual deepfakes. Additionally, when users edit images of real people, the company places restrictions on the kind of imagery that can be created.

Akash Dutta
Akash Dutta is a Senior Sub Editor at Gadgets 360. He is particularly interested in the social impact of technological developments and loves reading about emerging fields such as AI, metaverse, and fediverse. In his free time, he can be seen supporting his favourite football club - Chelsea, watching movies and anime, and sharing passionate opinions on food.

