Photobombers, bad lighting, bad angles. Lots of goofs and unforeseen circumstances can turn an attempt to capture a special moment into an unusable image. The smartphone industry is well aware and, for years, has been introducing cleanup tools, such as the ability to remove a person or object from a snap. Generative AI can supercharge this process, in both good and bad ways, due to its unpredictability. So it's big news when Google, which makes one of the two major phone operating systems and provides one of the top photo-sharing sites, introduces new tech.
In late August, Google announced a "major upgrade" to the photo-editing capabilities of its Gemini app, based on new technology, called Nano Banana, from its DeepMind AI lab. Clicking the new banana icon in the Gemini app now brings you into a mode where you can generate entirely artificial images or modify ones you've already shot.
With any other AI chatbot, the natural question is, "How does it compare to ChatGPT?" Based on my experience, the answer is "relatively well." Both apps are competent at what have long been standard AI editing tasks, such as removing a person from an image. And Gemini stands out for being fast. The "Just a sec" message it gives as it's working is a bit optimistic, but you will probably get results in under a minute. The performance is essentially the same whether you have a free or paid Gemini account.
ChatGPT, which got an image generation and editing capability upgrade in March, was a lot slower in my experience, especially on the free version, as paying customers get preference. "Just a min" would be a more appropriate estimate, were ChatGPT to provide one.
But good things come to those who wait. Both the free and paid versions of ChatGPT beat both versions of Gemini for complex tasks. For instance, each ChatGPT version could reconfigure an image that was shot from the side to look as if it had been taken head on. Gemini didn't fully correct the angle and continually lowered the color saturation.
Results were more dramatic when asking the apps to fill in missing details – in this case, an ancient mural of Ramses II riding on his chariot. The pharaoh still looked good in the poster of the original, but his poor horse's face had flaked off. ChatGPT understood the assignment immediately and filled in a convincing-looking equine mug. It also decided to make some tweaks to the original image that made it a tad cartoonish. Gemini struggled every time. Once, it added the face but erased the legs and other intact parts of the image for no apparent reason. When told of its error, it restored the deleted parts but also left the horse's face missing.
In another test, Gemini didn't even try. For each app, I uploaded two images of a singer (my friend Erica Rowell) performing in New York. One was in black-and-white, the other was in color. I asked each app to colorize the black-and-white image, using the color photo as a reference. ChatGPT not only got the colors right, but it also enhanced the quality of what had been a so-so cellphone photo shot in low light. It sharpened details and added lovely, realistic tones to the woman's face. ChatGPT did mess with the composition a bit, however. Gemini claimed it had also colorized the photo, but in fact, it just kept showing the original color photo I'd given it as a reference. I called this out several times. Gemini apologized, then gave me the exact same result.
In other cases, however, "exact same" is a virtue of Gemini. "Character consistency," or maintaining the original appearance of subjects, is the key selling point of Nano Banana. In one photo of a couple, I removed the man, brought him back, replaced the white wine in both of their glasses with red, and turned the woman's hair from blond to brown. That final operation was where I noticed some failing on ChatGPT's part, with subtle but noticeable changes in the face. And when I generated the colorized photo of the singer a second time, she did look a bit different. ChatGPT will have to limit its eagerness to "fix" things to achieve its full potential.
Bottom line: For a quick fix on a snapshot, such as removing a photobomber, Gemini will serve you well and take fewer liberties with your image. But ChatGPT, while not flawless, really shows the magic of generative AI image editing.
[Image credit: Ramses original photo via Ahmed88z, CC BY-SA 4.0, via Wikimedia Commons, all other images via Sean Captain/Techlicious]