Last week, Black Forest Labs (a company founded by the original creators of Stable Diffusion) released Flux, a new state-of-the-art text-to-image model with open weights and quality comparable to Midjourney. Curious how it stacks up against other models, I ran a quick one-shot generation test on the following models (prices are estimates based on official pricing pages and replicate.com):
| Model Name | Company | Type | Price per Image |
|---|---|---|---|
| Flux Schnell | Black Forest Labs | Open Source | $0.003 / image |
| Flux Pro | Black Forest Labs | Closed Source | $0.055 / image |
| Stable Diffusion 3 | Stability AI | Open Source | $0.035 / image |
| DALL·E 3 | OpenAI | Closed Source | $0.040 / image |
| Kling | Kuaishou | Closed Source | $0.002 / image |
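To put the per-image prices in perspective, here is a small sketch that projects the cost of a larger batch at the rates quoted above. The helper function and model names are mine, not from any official SDK, and the prices are the same estimates as in the table:

```python
# Estimated per-image prices from the comparison table (USD).
PRICE_PER_IMAGE = {
    "Flux Schnell": 0.003,
    "Flux Pro": 0.055,
    "Stable Diffusion 3": 0.035,
    "DALL·E 3": 0.040,
    "Kling": 0.002,
}

def batch_cost(model: str, n_images: int) -> float:
    """Return the estimated cost in USD for n_images generations."""
    return round(PRICE_PER_IMAGE[model] * n_images, 2)

for model, price in PRICE_PER_IMAGE.items():
    print(f"{model}: ${batch_cost(model, 1000):.2f} per 1,000 images")
```

At 1,000 images, the spread is noticeable: Flux Schnell comes to about $3 while Flux Pro comes to about $55.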
I used the following prompt for a general image in an artist's style:
> a surreal landscape with floating islands and a giant glowing moon in the style of Hayao Miyazaki
and another prompt to test text rendering:
> gateau cake spelling out the words “Takin.AI”, tasty, food photography, dynamic shot
The test results are listed below.
You can access text-to-image models such as Flux, SD3, DALL·E 3, and ControlNet with a single Takin.ai account; start with a free account to try the examples in this post.
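If you prefer to call the models directly, the Replicate Python client is one option (the prices above were partly estimated from replicate.com). The sketch below builds the request payload for the Miyazaki prompt; the `aspect_ratio` parameter and the model slug `black-forest-labs/flux-schnell` are assumptions based on Replicate's listing, and the actual API call (commented out) requires a `REPLICATE_API_TOKEN` and network access:

```python
# Hypothetical payload builder for a Flux Schnell run via Replicate.
# Parameter names are assumptions, not taken from this post.
def build_input(prompt: str, aspect_ratio: str = "1:1") -> dict:
    return {"prompt": prompt, "aspect_ratio": aspect_ratio}

payload = build_input(
    "a surreal landscape with floating islands and a giant glowing moon "
    "in the style of Hayao Miyazaki"
)

# The live call would look roughly like this:
# import replicate
# urls = replicate.run("black-forest-labs/flux-schnell", input=payload)
print(payload["aspect_ratio"])
```

Each provider has its own API shape, so the payload keys will differ if you call Stability AI or OpenAI directly.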
Flux Schnell (fastest: only took 1.3 seconds):
Flux Pro (took about 8.1 seconds):
DALL·E 3:
SD 3:
Kling:
P.S. The featured image for this post was generated using the HiddenArt tool from Takin.ai.