“Too easy”—Midjourney tests dramatic new version of its AI image generator

November 9, 2022
Eight images we generated with the alpha version of Midjourney v4.
Ars Technica

On Saturday, AI image service Midjourney began alpha testing version 4 (“v4”) of its text-to-image synthesis model, which is available for subscribers on its Discord server. The new model provides more detail than previously available on the service, inspiring some AI artists to remark that v4 almost makes it “too easy” to get high-quality results from simple prompts.

Midjourney opened to the public in March as part of an early wave of AI image synthesis models. It quickly gained a large following thanks to its distinct style and its public availability ahead of DALL-E and Stable Diffusion. Before long, Midjourney-crafted artwork made the news by winning art contests, providing material for potentially historic copyright registrations, and showing up on stock illustration websites (where it was later banned).

Over time, Midjourney refined its model with more training, new features, and greater detail. The current default model, known as “v3,” debuted in August. Now, Midjourney v4 is being put to the test by thousands of members of the service’s Discord server who create images through the Midjourney bot. Users can currently try v4 by appending “--v 4” to their prompts.
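
For example, a complete request to the Midjourney bot combines the /imagine slash command, a text prompt, and the version flag; the scene description below is our own illustrative example, not one suggested by Midjourney:

    /imagine prompt: a lighthouse on a rocky coast at sunset, dramatic lighting --v 4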

“V4 is an entirely new codebase and totally new AI architecture,” wrote Midjourney founder David Holz in a Discord announcement. “It’s our first model trained on a new Midjourney AI supercluster and has been in the works for over 9 months.”

Comparison output between Midjourney v3 (left) and v4 (right) with the prompt “a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting.”
Ars Technica

In our tests of Midjourney’s v4 model, we found that it provides far more detail than v3, a better understanding of prompts, better scene compositions, and sometimes better proportionality in its subjects. Some photorealistic results we’ve seen can be difficult to distinguish from actual photos at lower resolutions.

According to Holz, other features of v4 include:

– Vastly more knowledge (of creatures, places, and more)
– Much better at getting small details right (in all situations)
– Handles more complex prompting (with multiple levels of detail)
– Better with multi-object / multi-character scenes
– Supports advanced functionality like image prompting and multi-prompts
– Supports --chaos arg (set it from 0 to 100) to control the variety of image grids
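
As a rough illustration of how these arguments combine with an ordinary prompt (the chaos value here is arbitrary within the documented 0 to 100 range, and the prompt is the one from our comparison above), a single request might look like:

    /imagine prompt: a muscular barbarian with weapons beside a CRT television set, cinematic, 8K, studio lighting --v 4 --chaos 50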

Reaction to Midjourney v4 has been positive on the service’s Discord, and fans of other image synthesis models—who regularly wrestle with complex prompts to get good results—are taking note.

One Redditor named Jon Bristow posted in the r/StableDiffusion community, “Does anyone else feel like Midjourney v4 is ‘too easy’? This was ‘Close-up photography of a face’ and it feels like you didn’t make it. Like it was premade.” In reply, someone joked, “Sad for Pro prompters who will lose their new job created one month ago.”

Midjourney says that v4 is still in alpha, so the company will continue to fix the new model’s quirks over time. It plans to increase the resolution and quality of v4’s upscaled images, add custom aspect ratios (as in v3), increase image sharpness, and reduce text artifacts. Midjourney is available through subscription plans that range from US $10 to $50 per month.

Considering the progress Midjourney has made in just eight months of work, we wonder what next year’s advances in image synthesis will bring.
