DALL·E 2: Hierarchical Text-Conditional Image Generation with CLIP Latents

Modern AI systems can create realistic images and art from a description in natural language.

Two lines of earlier work underpin text-conditional image generation: contrastive models like CLIP, which learn a shared representation of images and text, and diffusion models, which generate images. OpenAI has now proposed a new system for this task that builds on both: DALL·E 2.

Example of a generated image. Credit: DALL·E 2

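DALL·E 2 generates more realistic and accurate images than its predecessor, DALL·E, at four times the resolution. It combines the two earlier approaches: a diffusion decoder is trained to invert the CLIP image encoder, so that it produces an image from a CLIP image embedding. At generation time, a prior model first maps the text caption to a CLIP image embedding, and the decoder then turns that embedding into an image.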
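
The data flow of such a two-stage system (caption, then image embedding, then image) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in written for illustration: the function names, the 8-dimensional embedding, and the simple refinement loop are assumptions, not OpenAI's code, and no real CLIP or diffusion model is involved.

```python
import hashlib
import random

EMBED_DIM = 8  # toy size chosen for illustration; real embeddings are far larger


def toy_text_encoder(caption: str) -> list:
    """Hypothetical stand-in for a text encoder: caption -> fixed-length vector."""
    digest = hashlib.sha256(caption.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:EMBED_DIM]]


def toy_prior(text_embedding: list, seed: int = 0) -> list:
    """Hypothetical stand-in for the prior: text embedding -> image embedding.

    Here it only adds a little seeded noise; a real prior is a learned model.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, 0.05) for x in text_embedding]


def toy_decoder(image_embedding: list, size: int = 4, steps: int = 10, seed: int = 0) -> list:
    """Hypothetical stand-in for the decoder: image embedding -> size x size image.

    It starts from random noise and moves step by step toward a target derived
    from the embedding. This only mimics the shape of iterative generation and
    is not a real diffusion process.
    """
    rng = random.Random(seed)
    n_pixels = size * size
    target = [image_embedding[i % len(image_embedding)] for i in range(n_pixels)]
    pixels = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
    for _ in range(steps):
        pixels = [p + 0.5 * (t - p) for p, t in zip(pixels, target)]
    return [pixels[row * size:(row + 1) * size] for row in range(size)]


def generate(caption: str, seed: int = 0) -> list:
    """Run the two stages in order: caption -> image embedding -> image."""
    text_embedding = toy_text_encoder(caption)
    image_embedding = toy_prior(text_embedding, seed=seed)
    return toy_decoder(image_embedding, seed=seed)


if __name__ == "__main__":
    image = generate("a corgi playing a trumpet")
    print(len(image), "rows of", len(image[0]), "pixels")
```

The point of the sketch is the separation of stages: the decoder never sees the caption directly, only the image embedding that the prior produces from it.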

In addition to creating original, realistic images and art from a text description, DALL·E 2 can make realistic edits, like adding or removing elements, to existing images. It can even use an image as input and create different variations of it inspired by the original. Besides empowering people to express themselves creatively, the research also helps humans understand how advanced AI systems see and understand our world.

Link: https://openai.com/dall-e-2/