A little more than a year after introducing built-in image creation tools, OpenAI has released a major update to the technology. The company presented ChatGPT Images 2.0, positioning it as a fundamental breakthrough in how visual content is generated and edited with artificial intelligence. The developers set out to turn an ordinary generator into a full-fledged interactive creative tool. OpenAI describes the release as a “step change” in image models, emphasizing improved instruction following, high-quality text rendering, and more advanced composition. The model can now reason through a task, check its own results, and pull in data from external sources.

ChatGPT Images 2.0 offers two modes, Instant and Thinking, each of which addresses different creative needs. Instant mode focuses on speed: it delivers fast results while maintaining high visual quality. Before the official release, OpenAI quietly tested it on the LMArena platform under the code name “duct tape”.

Thinking mode, by contrast, is slower and more thorough: the system reasons about how the image should be constructed before creating it. This helps keep characters' appearance consistent across different shots and makes a coherent visual narrative possible, which opens up opportunities for manga, storyboarding, and multi-scene design. The distinction matters because previous models often struggled with visual consistency; the new approach turns image creation into a structured process rather than a random one-off result.

The most significant change is in how people interact with the system. OpenAI no longer treats image generation as a simple request-response exchange. During the presentation, the developers emphasized that users now carry on an interactive dialog with the AI, which responds in a meaningful way. Users can refine an image directly in the chat: change the scale, adjust individual details, or rework the composition without starting from scratch. The model preserves context during editing, which makes iterative design easier.

For example, in one of the demo scenarios, the system suggested eight different summer outfits based on a single uploaded photo. In another, the AI analyzed user reviews on social media, visually structured the data, and generated a QR code linking to ChatGPT.

OpenAI has also significantly improved its handling of non-Latin scripts. The model now renders text in Japanese, Korean, Chinese, Hindi, and Bengali much more accurately. The company also claims higher accuracy in reproducing various artistic styles and closer adherence to specific visual languages.

These updates make the tool more practical for the video game and visual storytelling industries. On the technical side, Images 2.0 supports aspect ratios from 3:1 to 1:3, produces images at resolutions up to 2K, and can create up to eight variants in a single generation cycle.

Source: Interesting Engineering

