Midjourney V6.1 through Midjourney API

PiAPI

Hi developers!

As you probably already know, Midjourney's founder and CEO, DavidH, announced in Midjourney's Discord server on July 30th, 2024 that Midjourney V6.1 is here, perfect for all y'all developers to try out! :D

A screenshot of Midjourney CEO David posting about the new V6.1 model
Midjourney V6.1 Announcement

Midjourney V6.1 Update!

Alright, so let's check out the new improvements with the V6.1 model of Midjourney!

  1. 25% faster - if you've been complaining that fast tasks are too expensive but relax jobs are too slow, you might be excited about this one!
  2. Better quality - better images of arms, legs, body parts, plants, animals, etc.!
  3. Picture enhancement - eyes, faces, and hands all get improved detail
  4. Textual accuracy - when you put text in quotation marks in a prompt, the new model renders that text in the picture with higher accuracy!
  5. Lastly, users don't have to add --v 6.1 to try the new model. V6.1 is selected by default, and users can use the --v parameter to choose a previous model version if they wish.

Accessing the Midjourney V6.1 model through API!

As you might have already guessed, since Midjourney has made V6.1 the default model, developers using our Midjourney API don't need to change any existing code to try it out. And if you want to use a previous model version, you can just add the --v parameter to your prompt, as per the Midjourney documentation. Below are two sample cURL calls for your reference.

Midjourney API Imagine Endpoint call using the V6.1 model

curl --location 'https://api.piapi.ai/mj/v2/imagine' \
--header 'X-API-Key: your_api_key' \
--header 'Content-Type: application/json' \
--data '{
    "prompt": "a boy running in the park, arms in swaying back and forth, wearing a t-shirt saying '\''Love Midjourney'\'' ",
    "process_mode": "fast",
    "aspect_ratio": "",
    "webhook_endpoint": "",
    "webhook_secret": ""
}'

Midjourney API Imagine Endpoint call using the V6 model

curl --location 'https://api.piapi.ai/mj/v2/imagine' \
--header 'X-API-Key: your_api_key' \
--header 'Content-Type: application/json' \
--data '{
    "prompt": "a picture of pianist playing piano, wearing a golden ring on the index finger of the right hand, face focused with concentration --v 6 ",
    "process_mode": "fast",
    "aspect_ratio": "",
    "webhook_endpoint": "",
    "webhook_secret": ""
}'
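
If you call the API from Python rather than the shell, the two requests above could be sketched roughly as follows. The URL, the X-API-Key header, and the body fields are copied from the cURL samples. The helper names build_imagine_payload and build_imagine_request and the version argument are hypothetical, made up for this illustration and not part of the API. The sketch only prepares the request and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the cURL samples above.
IMAGINE_URL = "https://api.piapi.ai/mj/v2/imagine"


def build_imagine_payload(prompt, version=None, process_mode="fast"):
    """Return a JSON-serializable body shaped like the cURL samples.

    `version` is a hypothetical convenience argument: when given, it is
    appended to the prompt as a `--v` flag; when omitted, the prompt is
    left unchanged so the default model applies.
    """
    prompt = prompt.strip()
    if version is not None:
        prompt = f"{prompt} --v {version}"
    return {
        "prompt": prompt,
        "process_mode": process_mode,
        "aspect_ratio": "",
        "webhook_endpoint": "",
        "webhook_secret": "",
    }


def build_imagine_request(api_key, payload):
    """Prepare (but do not send) a POST request mirroring the cURL call."""
    return urllib.request.Request(
        IMAGINE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # No version given: the prompt is sent as-is (default model).
    default_body = build_imagine_payload(
        "a boy running in the park, wearing a t-shirt saying 'Love Midjourney'"
    )
    # Version given: "--v 6" is appended to the prompt.
    v6_body = build_imagine_payload("a pianist playing piano", version="6")

    request = build_imagine_request("your_api_key", v6_body)
    print(request.get_method(), request.full_url)
    print(json.dumps(default_body, indent=2))
    print(json.dumps(v6_body, indent=2))
    # To actually submit the task, pass `request` to urllib.request.urlopen.
```

Letting json.dumps serialize the body also avoids the '\'' shell escaping that the cURL version needs for the single quotes inside the prompt.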

V6.1 vs V6 Model Comparison

And you know we can't wrap this blog up without an actual side-by-side comparison of the two models. Since Midjourney mentioned what V6.1 excels at, we tried the following prompts to test quality of detail, quality of body parts, text accuracy, generation speed, etc. And of course, all the tests below were run using the Midjourney API from PiAPI!

A picture of a boy running in the park generated by Midjourney API using model V6.1
V6.1 | "a boy running in the park, arms in swaying back and forth, wearing a t-shirt saying 'Love Midjourney'"

A picture of a boy running in the park generated by Midjourney API using model V6
V6 | "a boy running in the park, arms in swaying back and forth, wearing a t-shirt saying 'Love Midjourney'"

As you can see above, although there isn't much difference in the accuracy of the arms, the lettering "Love Midjourney" is rendered much more accurately in V6.1!

A person playing piano - generated by Midjourney V6.1
V6.1 | "a picture of pianist playing piano, wearing a golden ring on the index finger of the right hand, face focused with concentration"

A person playing piano - generated by Midjourney V6
V6 | "a picture of pianist playing piano, wearing a golden ring on the index finger of the right hand, face focused with concentration"

As shown above, there seems to be a slight improvement in how the hands are rendered in V6.1.

A picture of a basketball player dunking - generated by the V6.1 model
V6.1 | "a basketball star dunking midair, 'Best Dad in Town' shown on his jersey, tongues out, face with concentration"

A picture of a basketball player dunking - generated by the V6 model
V6 | "a basketball star dunking midair, 'Best Dad in Town' shown on his jersey, tongues out, face with concentration"

As you can see, text adherence is again definitely better for V6.1. The motion of the arms, legs, and hands is also a bit more natural in V6.1. How the tongue interacts with the face, however, is quite off for both models, presumably because of the abnormally large tongue, although this is probably understandable given the lack of realistic training data.

Note that the two pictures were processed in relax mode. Since our Midjourney API tracks the number of seconds each task takes to generate, below are the times for the two jobs:

V6.1: 80 seconds

V6: 88 seconds
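
To put those two numbers in perspective, here is a small sketch of the arithmetic. It uses only the two timings reported above. The function names are hypothetical, and a single pair of relax-mode jobs is a rough data point rather than a benchmark.

```python
def time_reduction_percent(baseline_seconds, new_seconds):
    """Percentage of wall-clock time saved relative to the baseline job."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100


def speed_ratio(baseline_seconds, new_seconds):
    """How many times faster the new job ran than the baseline job."""
    return baseline_seconds / new_seconds


# Timings reported above for the basketball prompt in relax mode.
V6_SECONDS = 88
V61_SECONDS = 80

print(f"time saved: {time_reduction_percent(V6_SECONDS, V61_SECONDS):.1f}%")  # time saved: 9.1%
print(f"speed ratio: {speed_ratio(V6_SECONDS, V61_SECONDS):.2f}x")  # speed ratio: 1.10x
```

In this one test the gain is about 9%, smaller than the announced 25%, which is not surprising for a single sample.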

Conclusion

And that is it! In our tests, V6.1 was a bit faster than V6 (80 vs. 88 seconds on one pair of relax-mode jobs, about 9% less time), has significantly better text adherence (although not perfect), and shows a bit of improvement in how it renders arms, legs, hands, etc.

Like true fans, we are looking forward to future improvements from Midjourney, and we hope you will have fun playing around with the V6.1 model as well!

