ChatGPT’s New AI Image Generator Looks Scarily Good



Two people smiling as they high-five in front of a whiteboard with handwritten notes. The whiteboard includes points under "Pros" and "Fixes." Both are wearing casual dark t-shirts.

A couple of years ago, few people would have suspected that this image is anything other than a real photo. While there are mistakes, especially in the hands, it’s a very real-looking image.

OpenAI has launched a new AI image generator that is a technological step forward, and some of the examples the company shared achieve a frightening degree of verisimilitude.

Called “Images in ChatGPT”, the feature differs from DALL-E (OpenAI’s previous image generator, which appears to be headed for retirement) because the images are generated directly by GPT-4o inside ChatGPT.

Describing the model as a “step change”, research lead Gabriel Goh tells The Verge that GPT-4o is “omnimodal”: a model that can generate any kind of data, such as text, images, audio, and video.

This new type of model is indicative of a wider change in the AI industry where systems combine all types of data. Yesterday, PetaPixel reported on Google’s “Project Astra” which can see the world around it via a smartphone camera and answer questions.

Image Generation Capabilities

In a blog post revealing Images in ChatGPT, OpenAI shared some impressive examples. The pictures of an “OpenAI researcher” working on a whiteboard in a room “overlooking the Bay Bridge” with the photographer’s reflection are scarily good.

A person wearing an OpenAI t-shirt is writing on a whiteboard. The board contains text about AI models and fixes, with a bridge visible through the window in the background.
Prompt: A wide image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a t-shirt with a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer’s reflection.

Two people smiling and giving a high five in front of a whiteboard filled with notes. The person on the left is wearing a black OpenAI t-shirt. The whiteboard contains text about transfer between modalities and fixes for models.
Prompt: selfie view of the photographer, as she turns around to high-five him

OpenAI also shared other examples which showcase the model’s ability to generate photorealistic images.

A girl in a white shirt and denim overalls drinks from a pink smoothie with a straw at an outdoor market. People mill about in the background under white tents. The image date reads "8.24.'06."
Prompt: Generate a photorealistic image of farmer’s market in toronto on a saturday in summer 2006, it’s a beautiful late june day, people are shopping and eating sandwiches. in focus should be a young asian girl wearing denim overalls and sipping on a strawberry banana smoothie – rest can be blurred. the photo should be reminiscent of that a digital camera from 2006 would take, with a timestamp like a printed photo would have. aspect ratio should be 3:2.

Four people standing close together indoors. One person is laughing, two are smiling, with one of them in a playful headlock, and another looks serious. They are all wearing dark-colored clothing.
Prompt: Generate a candid, Polaroid-style photograph of four diverse friends in their early 20s at a gritty dive bar. The lighting features a very harsh, direct flash, creating sharp shadows and giving the photo a very overexposed, vintage instant-camera feel. Colors should be slightly muted, evoking nostalgic, early-2000s party vibes. The aesthetic is casually emo. No border or logos or signs. There’s an interesting looking wall behind them with some light graffiti. Quality of the image should be very sharp and detailed (very little grain). The energy should be silly and chaotic. They’re either playfully grimacing, smiling, or pretending to look tough. One of them should have their friend in a silly, playful headlock. Their mouths are closed.

A brown horse runs through shallow water, creating ripples. The scene is calm with a light sky and an expansive, smooth water surface that reflects the horse.
Prompt: Realistic photograph of a horse galloping from right to left across a vast, calm ocean surface, accurately depicting splashes, reflections, and subtle ripple patterns beneath their hooves.

A blurry image of a parked sedan on a dimly lit street at night. A streetlight casts a warm glow, partially illuminating the car. Trees and buildings are faintly visible in the background.
Prompt: blurry old analog film photograph, picture of parked car on side street, quiet night. | Credit: Roope Rainisto

Images in ChatGPT doesn’t have a visual watermark the way DALL-E did. However, ChatGPT multimodal product lead Jackie Shannon tells The Verge that “all of our generated images will include standard C2PA metadata to mark the image as having been created by OpenAI.”

The new version of ChatGPT started rolling out yesterday (Tuesday) and will be available to people using the free and paid versions of the chatbot.

Image credits: OpenAI.


