Apple isn’t one of the leading players in the artificial intelligence market, but the tech giant’s new open-source AI model for image editing shows how much the company could contribute to the field.
The model, called MLLM-Guided Image Editing (MGIE), uses multimodal large language models (MLLMs) to interpret text commands for image editing. In other words, the tool lets users edit photos by describing the desired change in plain text. It is not the first tool to do this, but as the accompanying paper notes, “human instructions are sometimes too short for modern methods to capture and execute.”
Apple developed MGIE together with researchers from the University of California, Santa Barbara. MLLMs can transform simple or ambiguous text prompts into more detailed, actionable instructions that a photo editor can follow. For example, if a user wants to edit a photo of a pepperoni pizza to “make it healthier,” the MLLM can interpret that as “add vegetable toppings” and edit the photo accordingly.
Beyond broad image changes, MGIE can also crop, resize, and rotate photos, as well as adjust their brightness, contrast, and colour balance, all through text prompts. The tool can also edit specific regions of a photo: changing a person’s hair, eyes, or clothing, for instance, or removing elements from the background.
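The two-stage idea described above — an MLLM first rewrites a terse prompt into an expressive instruction, which then drives the actual editor — can be sketched in a few lines of Python. Everything here is illustrative: the function names and the hard-coded expansion table are assumptions for the sake of the sketch, not Apple’s actual API or model behaviour.

```python
# Toy sketch of MGIE's two-stage flow: a terse user prompt is first
# expanded into a more expressive editing instruction, which is then
# handed to the image editor. The expansion table and function names
# are illustrative assumptions, not Apple's real interface.

def expand_instruction(prompt: str) -> str:
    """Stand-in for the MLLM: turn a vague prompt into a concrete edit."""
    expansions = {
        "make it healthier": "add vegetable toppings such as peppers and mushrooms",
        "brighten it": "increase brightness and lift the shadows slightly",
    }
    # Fall back to the original prompt if no expansion is known.
    return expansions.get(prompt.strip().lower(), prompt)

def edit_image(image_path: str, instruction: str) -> str:
    """Stand-in for the diffusion editor: here we only describe the edit."""
    return f"{image_path} edited with instruction: {instruction!r}"

result = edit_image("pizza.jpg", expand_instruction("make it healthier"))
print(result)
```

In the real system the expansion step is learned rather than looked up, which is what lets MGIE handle prompts it has never seen before.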
Apple has released the model via GitHub, and those interested can also try a demo version currently hosted on Hugging Face Spaces. The company hasn’t yet said whether it plans to turn what it learns from the project into a tool or feature in any of its products.
Read also:
- Apple is working on foldable iPhones and iPads
- Apple extended the deal with Qualcomm to supply iPhone chips for three years