Where each tool fits in a real workflow, what the outputs actually look like after iteration — not cherry-picked demos — and which pricing model catches people off guard. No affiliate links.
Generative image and video tools are among the most heavily marketed products in AI, and among the most misrepresented when it comes to what it actually takes to produce usable output. This review is about that gap: the distance between the demo reel and the working session.
I tested Midjourney, Kling and Runway over several weeks using real creative briefs: product visualisation for a small business client, social content for a service brand, and short video clips for a presentation. The tests were not designed to produce the best possible output — they were designed to reflect what a non-specialist user with moderate effort would actually get.
Every tool produces better results with more iteration — more refined prompts, more regenerations, more post-processing. This review notes approximately how many iterations were needed to reach usable output for each task, because time is the real cost these tools impose beyond the subscription fee.
Midjourney remains the strongest image generation tool for aesthetic quality from ordinary, unrefined prompts. Its default output, without extensive prompt engineering, is more visually coherent than that of the alternatives. For images where visual quality is the primary requirement (concept art, mood boards, marketing imagery), Midjourney produces results that require the least cleanup.
The Discord-based interface continues to be the main friction point. Working inside Discord is unintuitive for users who are not regular Discord users, and the lack of a traditional file management system means keeping track of outputs requires deliberate organisation. Midjourney has begun rolling out a web interface that addresses some of this, but at the time of testing it was still limited in capability compared to the Discord workflow.
Midjourney's main practical limitation for commercial work is character and object consistency across images. If you need multiple images featuring the same person, product or environment, achieving visual consistency between generations requires significant effort — reference images, style references and careful prompt construction. This is not a Midjourney-specific problem; it applies to all diffusion-based tools. But it is often not mentioned in the marketing, and it matters for most practical commercial uses.
For one-off high-quality images with no consistency requirement, Midjourney is the default choice. For anything that needs to look like a coherent visual system, the iteration overhead is substantial.
Kling is a video generation tool developed by Kuaishou, the Chinese short-video platform. It entered the market as a strong challenger to Runway and Sora and, in my tests, delivered competitive results, particularly for motion quality. The apparent physics in Kling (the way objects move, fabric behaves, and camera motion is handled) is noticeably above average for the category.
The practical workflow involves uploading a reference image (or generating one) and prompting for motion. Kling's image-to-video capability is its strongest feature. The results at five seconds of video are significantly more reliable than at longer durations, where artefacts and coherence issues increase.
Kling is best suited to short, controlled video clips where the subject is relatively simple — a product rotating, a scene with limited camera movement, a stylised environment. For complex action sequences or multiple subjects interacting, the results are less predictable and require more generation attempts to find a usable clip.
Access is through a web application, which is a significantly better user experience than Midjourney's Discord interface. The pricing model (credits per generation) is transparent, though at higher quality settings the credit consumption is substantial.
Runway Gen-3 Alpha positions itself as a professional video tool — and the platform's broader feature set reflects this. Beyond text-to-video and image-to-video, Runway includes video-to-video transformation, inpainting, motion brush controls and a timeline editor. For users who need more than raw generation — who need to edit, combine and post-process outputs — Runway's ecosystem is the most complete of the three.
The raw video quality from Runway Gen-3 is competitive with Kling. Neither is consistently superior; the results depend significantly on prompt type and subject matter. Runway handles photorealistic human subjects somewhat better than Kling in my tests; Kling handles environmental and product footage more reliably.
Runway's positioning as a professional tool comes with professional pricing. The free tier is extremely limited. The Standard plan ($15/month) provides a reasonable amount of credits for moderate use, but professional workflows will quickly exhaust credits and move to higher tiers. For a team producing regular video content, the budget implications are significant and should be calculated against the specific output volume needed.
Runway also integrates with After Effects and Premiere, which matters for users already in a video production workflow. This is a differentiator that neither Midjourney nor Kling offers.
The gap between what generative AI tools show in demos and what they produce in working sessions is largely an iteration gap. Demo outputs are the result of extensive prompt refinement, cherry-picking from dozens of generations, and often post-processing in traditional tools. The demo is not the first result; it is the best result from a much larger set.
Across the three tools, here is a realistic iteration estimate for reaching publishable output on a moderately complex brief:
Midjourney (image): 4–12 generations to get to a result worth using, more if consistency with other assets is required. Each generation takes 30–60 seconds.
Kling (video, 5 seconds): 3–8 attempts for a usable clip. Each generation takes 2–5 minutes. A longer clip (10 seconds) roughly doubles the failure rate.
Runway (video, 5–10 seconds): Similar to Kling — 4–10 attempts for reliable output. Post-processing time in Runway's editor adds to this but also gives more control over the final output.
These estimates are for users who have some familiarity with prompt construction. First-time users will need more attempts. Pricing models that charge per generation make this iteration cost real and material.
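To make the time cost concrete, the figures above can be multiplied out. The following is a minimal sketch, not a benchmark: it uses only the attempt counts and per-generation times quoted in this review, and the function name `waiting_minutes` is an illustrative choice. Runway is left out because the review gives no per-generation time for it.

```python
def waiting_minutes(attempts, seconds_each):
    """Return (low, high) minutes spent waiting on generations.

    attempts:     (low, high) number of generations to reach a usable result
    seconds_each: (low, high) seconds a single generation takes
    """
    low = attempts[0] * seconds_each[0] / 60
    high = attempts[1] * seconds_each[1] / 60
    return (low, high)


# Figures as quoted in this review; they are estimates, not measurements.
ESTIMATES = {
    "Midjourney (image)": ((4, 12), (30, 60)),    # 30-60 seconds per generation
    "Kling (5 s video)": ((3, 8), (120, 300)),    # 2-5 minutes per generation
}

for tool, (attempts, seconds_each) in ESTIMATES.items():
    low, high = waiting_minutes(attempts, seconds_each)
    print(f"{tool}: {low:.0f} to {high:.0f} minutes per usable asset")
```

On these numbers, a usable Midjourney image costs roughly 2 to 12 minutes of generation time and a usable five-second Kling clip roughly 6 to 40 minutes, before any time spent writing prompts or reviewing results.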
Midjourney suits creative professionals, marketers and small business owners who need high-quality still images — concept visuals, mood boards, social imagery, product mockups — and are willing to invest time in prompt iteration. It is not the right choice if consistency across a series of images is critical or if a traditional file management UI is important.
Kling suits content creators and marketers who need short video clips (product animations, environmental shots, stylised short-form social content) and want a straightforward web-based workflow. It is not suited to complex narrative video or to anything that depends on reliably producing longer clips.
Runway suits video producers and creative agencies who need the full production toolkit: generation, editing, compositing and integration with existing video workflows. The higher price point is justified if the platform replaces meaningful stock footage spend or post-production time. For casual use, the free and low-tier plans are too restrictive to evaluate the tool's potential.
Midjourney starts at $10/month (Basic, ~200 image generations) and scales to $60/month (Pro, ~1,900 generations). The key pricing consideration is that every iteration draws on that allocation: a session to find one usable image may consume 8–15 generations. Calculate your monthly output need against the allocation before committing to a tier.
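The calculation suggested above can be sketched in a few lines. This is an illustration under stated assumptions: it models only the two tiers and approximate allocations quoted in this review, it budgets for the pessimistic end of the 8–15 range by default, and the names `TIERS` and `smallest_covering_tier` are hypothetical.

```python
# (tier name, price in USD per month, approximate generations per month),
# as quoted in this review. Any other tiers are not modelled here.
TIERS = [
    ("Basic", 10, 200),
    ("Pro", 60, 1900),
]


def generations_needed(usable_images, attempts_per_image):
    """Total generations consumed to end up with the required usable images."""
    return usable_images * attempts_per_image


def smallest_covering_tier(usable_images, attempts_per_image=15):
    """Return the cheapest listed tier whose allocation covers the need.

    Returns None when no listed tier is large enough.
    """
    needed = generations_needed(usable_images, attempts_per_image)
    for name, _price, allocation in TIERS:  # TIERS is ordered cheapest first
        if allocation >= needed:
            return name
    return None


print(smallest_covering_tier(10))  # 10 images x 15 attempts = 150 generations
print(smallest_covering_tier(20))  # 20 images x 15 attempts = 300 generations
```

With these inputs, ten usable images a month fits within Basic, while twenty already exceeds its allocation if each image takes fifteen attempts.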
Kling uses a credit model. At launch, it offered generous free credits; at the time of testing, the paid tiers were competitive (roughly $8–$35/month depending on credit volume). High-quality generation at longer durations is significantly more credit-intensive, so the effective cost per usable clip can be higher than the headline price suggests.
Runway starts free (limited credits) and goes to $15/month (Standard), $35/month (Pro) and higher for teams. Professional use cases — particularly regular video content production — should budget for the $35+ tier, which provides enough credits for a realistic monthly workflow.
All three tools charge for generations, not for output quality. A generation that produces an unusable result costs the same as one that produces the final asset. This is the structural pricing reality of the category and should be factored into any budget calculation.
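One way to factor this in is to price the usable asset rather than the generation. The sketch below is a rough illustration, assuming the whole monthly allocation is used; the example plugs in the Midjourney Basic figures quoted in this review ($10/month, ~200 generations, 8–15 generations per usable image), and the function name is hypothetical.

```python
def cost_per_usable_asset(monthly_price, monthly_generations, attempts_per_asset):
    """Effective cost of one usable asset when failed generations are paid for.

    Every attempt costs the same share of the subscription, so the cost of a
    usable asset is the per-generation cost multiplied by the attempts it took.
    """
    if monthly_generations <= 0 or attempts_per_asset <= 0:
        raise ValueError("generations and attempts must be positive")
    return monthly_price * attempts_per_asset / monthly_generations


# Midjourney Basic as quoted in this review: $10/month, ~200 generations.
best_case = cost_per_usable_asset(10, 200, attempts_per_asset=8)
worst_case = cost_per_usable_asset(10, 200, attempts_per_asset=15)

print(f"Per generation:   ${10 / 200:.2f}")
print(f"Per usable image: ${best_case:.2f} to ${worst_case:.2f}")
```

On those figures a generation costs about $0.05, but a usable image costs about $0.40 to $0.75. The same function applies to credit-based video tools once the credits consumed per generation at a given quality setting are known.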
These are three tools with genuinely different profiles, and the question of which to use depends almost entirely on what you need to produce.
For still images, Midjourney leads on quality from ordinary, unrefined prompts. Nothing else consistently matches it for visual coherence without extensive prompt engineering.
For short video clips, Kling and Runway are competitive. Kling is more accessible and has a simpler pricing structure; Runway is the right choice if you need the broader production toolkit or integration with existing video software.
For most small business and freelance use cases, one tool is enough. Start with Midjourney if the primary need is imagery; start with Kling if the primary need is short video. Add Runway if the workflow grows to require editing and post-production features.
None of these tools eliminates the need for creative direction, prompt skill or post-processing judgement. They reduce the cost of production; they do not replace the decisions that make output worth producing.
Not sure which tool fits your specific use case — or whether any of them are worth the investment for what you're trying to produce? Happy to talk it through in a free call.
All platform reviews on this site are independent: no affiliate commissions, no sponsored placements.