🎬 Qtum.ai text-to-video is open to everyone
Four generative video models behind one prompt box. No credit card, no subscription, no token packs — sign in with Google or MetaMask Snap, get 500 free tokens, and pay as you go in QTUM.
Good video models are everywhere. Good results aren't.
Text-to-video crossed the threshold from party trick to production tool sometime in the last eighteen months. The clips are sharper, the physics mostly behave, and a ten-second shot that would have needed a camera crew now needs a sentence.
But anyone who has actually used these tools knows the two places projects go sideways. First: model choice. Every model has a personality — one nails fluid camera moves but melts faces in second twelve, another is cheap and fast but tops out at 720p. Render the same prompt on the wrong model and you pay flagship prices for draft-quality output, or wonder why your "quick test" looks like a feature film trailer you didn't need. Second: the prompt itself. "A dog running on a beach" is not a shot. A shot has a lens, a light source, a camera move, and a mood — and video models respond to that vocabulary far more reliably than most people expect.
This guide covers both. We'll walk through the four models on Qtum.ai and what each one is actually for, then get into prompt mechanics — structure, lens language, lighting terms — with real prompts and the clips they rendered, inline, so you can judge for yourself.
What is Qtum.ai?
Qtum.ai is the video generation arm of the Qtum network — the same ecosystem behind the Qtum AI Router. Where the router unifies text and vision LLMs behind one API key, Qtum.ai puts four generative video models behind one prompt box, aimed squarely at creators and small teams. The pitch:
- No credit card, no subscription. Sign up with Google or MetaMask Snap. There's no monthly plan to forget about and no "Pro tier" gate in front of the good models.
- Pay as you go, in QTUM. Tokens are the unit of work, and QTUM is the primary way to buy them. Top up your balance, render, done. No token packs, no prepaid bundles, no expiry.
- 500 free tokens to start. Every new account starts with enough tokens to render real clips on real models before spending anything.
- Your content stays yours. No data harvesting, no training on your prompts or renders, no tracking pixels, no ad retargeting. You render it, you own it.
- Four models, one interface. Seedance 1.0, 1.5 and 2.0 plus HappyHorse: switch per render, pay per render, compare side by side.
- Built on running infrastructure. Backed by Qtum's GPU compute layer and a blockchain that has run without downtime since 2017.
That last point about payment is worth repeating because it shapes how you use the service: QTUM is the primary method to pay for tokens. Your render budget lives in the same asset as the rest of your Qtum activity — there's no card-on-file, no fiat checkout in the loop, and topping up from a wallet via MetaMask Snap takes seconds. More on the mechanics below.
Meet the models
All four models take a text prompt and return a clip. That's where the similarity ends — each has a distinct sweet spot, and the trick to getting good value out of Qtum.ai is matching the job to the model rather than defaulting to the most expensive one.
Seedance 1.0
Fast drafts
The original Seedance: quick, cheap, and still remarkably capable for short clips. Coherent motion, decent prompt adherence, fast queue times. It won't deliver flagship detail, but it renders in a fraction of the time and cost, which makes it the natural place to iterate on an idea.
Best for: prompt iteration, storyboarding, short social clips, testing ten variations before committing tokens to a final render.
Seedance 1.5
Everyday default
The mid-generation refresh, and for most people the sensible default. Noticeably better prompt adherence than 1.0 (it actually respects your lens and lighting direction), with crisper 1080p output and more stable subjects across the clip. Strong balance of cost, speed and quality.
Best for: day-to-day production work — product clips, social content, B-roll — where quality matters but every render doesn't need to be a hero shot.
Seedance 2.0
Flagship
The current state of the art on the platform. Cinematic detail, convincing physics, complex camera moves (orbits, dolly-ins, crane shots) that hold together, and multi-shot storytelling with scene continuity. Independent reviews consistently place it at the top of its class. It costs the most per second, and earns it.
Best for: final renders, hero shots, anything with faces or water or fast motion, multi-shot sequences where continuity matters.
HappyHorse
Long clips, less spend
Alibaba's video model, and the value pick for longer work. It generates multi-shot compositions up to 15 seconds with solid physics and notably stable camera work, supports reference image uploads, and undercuts Seedance 2.0 on price. Its known trade-off: continuity softens in long clips, and faces and fine textures can drift across shots.
Best for: longer multi-shot clips on a budget, animating a reference image, scenes where motion matters more than facial close-ups.
Which model for what — at a glance
| Model | Best for | Max clip | Output | Tokens / clip |
|---|---|---|---|---|
| Seedance 1.0 | Iteration, drafts, short social clips | 5 s | 720p | ~40 |
| Seedance 1.5 | Everyday production, product clips, B-roll | 10 s | 1080p | ~90 |
| Seedance 2.0 | Final renders, hero shots, multi-shot continuity | 15 s | 1080p | ~220 |
| HappyHorse | Long multi-shot clips, image-to-video, budget renders | 15 s | 1080p | ~150 |
Token costs scale with clip length and resolution; the figures above are typical for a default render. The live per-model pricing is always visible in the Qtum.ai console next to the generate button — what you see is what gets deducted.
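As a rough illustration of matching the job to the model, the table's figures can be turned into a small lookup. This is a minimal sketch, not a Qtum.ai API: the `Model` class and the `cheapest_model` and `clips_within` helpers are hypothetical names, the token costs are the approximate per-clip figures from the table (live prices in the console may differ), and the picker ranks on price alone, ignoring the quality differences described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    max_seconds: int      # longest clip the table lists for this model
    resolution: int       # vertical resolution, e.g. 720 or 1080
    typical_tokens: int   # approximate cost of a default render (from the table)


# Figures copied from the comparison table; treat them as estimates.
MODELS = [
    Model("Seedance 1.0", 5, 720, 40),
    Model("Seedance 1.5", 10, 1080, 90),
    Model("Seedance 2.0", 15, 1080, 220),
    Model("HappyHorse", 15, 1080, 150),
]


def cheapest_model(seconds: int, min_resolution: int = 720) -> Model:
    """Return the lowest-cost model that supports the clip length and resolution."""
    candidates = [
        m for m in MODELS
        if m.max_seconds >= seconds and m.resolution >= min_resolution
    ]
    if not candidates:
        raise ValueError("no model in the table supports that clip")
    return min(candidates, key=lambda m: m.typical_tokens)


def clips_within(budget_tokens: int, model: Model) -> int:
    """Number of default renders a token budget covers on one model."""
    return budget_tokens // model.typical_tokens


if __name__ == "__main__":
    for m in MODELS:
        print(f"{m.name}: about {clips_within(500, m)} default clips from 500 tokens")
```

With the table's typical figures, 500 free tokens cover roughly 12 default clips on Seedance 1.0, 5 on Seedance 1.5, 3 on HappyHorse, or 2 on Seedance 2.0.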
How to write a video prompt that actually directs
Video models are trained on footage that came with descriptions written by people who think in shots — so the closer your prompt reads to a shot description from a treatment or a stock-footage caption, the better the model performs. The single biggest upgrade you can make is to stop describing a thing and start describing a shot.
The anatomy of a shot prompt
A reliable structure, in order: subject and action, setting, camera and lens, lighting, then style. For example:
A weathered fisherman in a yellow oilskin hauls a net over the gunwale, spray flying, on a small boat in rough grey-green seas. Handheld 35mm, tracking close on his hands, overcast diffuse light, cold and even. Gritty documentary realism, muted color grade.
Front-load what matters most — models weight the start of the prompt heaviest. Subject and action first, garnish later. And keep it to one scene per prompt: if your prompt contains "and then," you're describing an edit, not a shot. (The exception is multi-shot models — Seedance 2.0 and HappyHorse understand explicit "cut to:" transitions; see the examples below.)
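One way to keep that front-loaded ordering consistent is to assemble prompts from named parts. The sketch below is a hypothetical helper, not part of Qtum.ai; the five-part split (subject and action, setting, camera, lighting, style) is one reading of the fisherman example above, and `build_shot_prompt` is an illustrative name.

```python
def build_shot_prompt(subject_action: str, setting: str, camera: str,
                      lighting: str, style: str) -> str:
    """Assemble a single-shot prompt with subject and action first, style last."""
    def clean(part: str) -> str:
        # Trim whitespace and any trailing period so joins stay tidy.
        return part.strip().rstrip(".")

    opening = f"{clean(subject_action)}, {clean(setting)}."
    direction = f"{clean(camera)}, {clean(lighting)}."
    look = f"{clean(style)}."
    return " ".join([opening, direction, look])


if __name__ == "__main__":
    print(build_shot_prompt(
        subject_action="A weathered fisherman in a yellow oilskin hauls a net "
                       "over the gunwale, spray flying",
        setting="on a small boat in rough grey-green seas",
        camera="Handheld 35mm, tracking close on his hands",
        lighting="overcast diffuse light, cold and even",
        style="Gritty documentary realism, muted color grade",
    ))
```

Called with the parts shown, the helper reproduces the fisherman prompt word for word, which makes it easy to swap a single part (say, the lighting) while holding the rest of the shot constant.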
Speak lens. Speak light.
These two vocabularies do more work per word than anything else you can type. You don't need film school — you need about a dozen terms:
📷 Camera & lens terms
- wide-angle / 16mm: Big environments, dramatic perspective, landscapes and interiors.
- 35mm: The documentary look, with natural perspective, what your eye expects.
- 85mm, shallow depth of field: Portrait compression, creamy blurred background, subject isolation.
- macro: Extreme close-up for texture, droplets, mechanical detail.
- aerial drone shot: High and moving. Pair with "slowly orbiting" or "flying over."
- dolly-in / dolly-out: Smooth push toward or pull away from the subject.
- tracking shot: Camera travels alongside a moving subject.
- handheld: Subtle shake, documentary energy. Omit it and you get tripod-smooth.
- slow pan / static shot: Calm, controlled. "Static shot" is underrated because it stops the model inventing camera moves you didn't ask for.
💡 Lighting terms
- golden hour: Warm, low-angle sun, long shadows. The most flattering light there is.
- blue hour: Just after sunset, when the light is cool and moody and city lights start to glow.
- overcast / diffuse: Soft, even, shadowless. Great for faces and product shots.
- rim lighting / backlit: Bright edge around the subject, separates them from the background.
- volumetric light / god rays: Visible beams through mist, dust or windows. Instant atmosphere.
- neon / practical lights: Light from sources inside the scene, such as signs, lamps and screens. Pairs beautifully with rain.
- candlelight / tungsten: Warm, orange, intimate, flickering.
- high-key / low-key: Bright and airy versus dark and dramatic. Sets the whole emotional register in one word.
Description tips that consistently pay off
- Concrete nouns beat adjectives. "A rusted blue 1970s pickup" renders better than "an old cool truck." Models can't paint "cool"; they can paint rust.
- Give every subject a motion verb. Video models need to know what moves. "Steam rising," "hair blowing," "waves crashing" — static prompts produce static, lifeless clips.
- Describe what you want, never what you don't. Negations backfire: "no people in the street" often renders people. Say "an empty street" instead.
- Name one style anchor, not five. "Shot on 35mm film" or "anamorphic lens flare, cinematic" sets a coherent look. Stacking "cinematic, anime, hyperrealistic, vaporwave" gives the model a contradiction, not a style.
- Specify pace. "Slow motion," "time-lapse," "real-time" — otherwise the model picks for you.
- Keep it 40–90 words. Long enough to direct, short enough that nothing gets ignored.
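Several of these tips are mechanical enough to check before spending tokens. The following is a minimal sketch of a prompt linter under stated assumptions: `lint_prompt` and its keyword lists are hypothetical and deliberately small, the checks are crude string matches rather than real language understanding, and the "more than two style keywords" threshold is an arbitrary stand-in for "one style anchor, not five."

```python
import re

# Hypothetical keyword lists for illustration; extend them to suit your own prompts.
NEGATION = re.compile(r"\b(?:no|not|never|without)\b|n't\b", re.IGNORECASE)
STYLE_KEYWORDS = ("cinematic", "anime", "hyperrealistic", "vaporwave",
                  "35mm film", "anamorphic")
PACE_TERMS = ("slow motion", "time-lapse", "real-time")


def lint_prompt(prompt: str, multi_shot: bool = False) -> list[str]:
    """Return warnings based on the tips above; an empty list means nothing was flagged."""
    warnings = []
    lowered = prompt.lower()

    word_count = len(prompt.split())
    if not 40 <= word_count <= 90:
        warnings.append(f"length: {word_count} words, outside the suggested 40-90")

    if NEGATION.search(prompt):
        warnings.append("negation: describe what should be in frame instead")

    if "and then" in lowered and not multi_shot:
        warnings.append('edit: "and then" describes an edit, not a single shot')

    styles = [s for s in STYLE_KEYWORDS if s in lowered]
    if len(styles) > 2:
        warnings.append(f"style: {len(styles)} style keywords may contradict each other")

    if not any(term in lowered for term in PACE_TERMS):
        warnings.append("pace: no pace term, so the model will pick one")

    return warnings


if __name__ == "__main__":
    draft = "A dog running on a beach, no people, cinematic, anime, vaporwave"
    for warning in lint_prompt(draft):
        print(warning)
```

A linter like this only catches surface problems. It cannot tell whether the nouns are concrete or whether every subject has a motion verb, so those two tips still need a human read.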
A 15-second makeover
Same dog, same beach. The second prompt tells the model where the camera is, what the light is doing, and how time flows. Quality keywords like "4K" and "epic" do almost nothing — lens and light do almost everything.
Example prompts — rendered on Qtum.ai
Four prompts, every clip below rendered as-is on Qtum.ai with default settings — no cherry-picking. The first shows a single model; the next three run the same prompt through two models side by side, so you can see exactly what you're paying for as you move up the lineup. The exact prompt sits above each set — copy them, remix them, re-render them yourself.
1 · The fast draft — Seedance 1.0
A single subject, tight framing, a few seconds. Exactly the kind of shot you iterate on cheaply before spending flagship tokens.
2 · The everyday upgrade — Seedance 1.0 vs 1.5
More demanding: a moving subject, a moving camera, practical lighting, and a specific color grade. The same prompt on both models shows what the mid-generation jump to 1.5 actually buys you — tighter prompt adherence and steadier reflections.
Same prompt, both models. Watch the puddle reflections and the rain holding together under the tracking move on 1.5 — the kind of direction 1.0 starts to lose. For a few more tokens, 1.5 is the sensible everyday default.
3 · The money shot — Seedance 1.0 vs 2.0
Everything that breaks lesser models in one prompt: water physics, an orbiting aerial camera, volumetric light, and a film look. Run on the entry model and the flagship back to back, the gap is the clearest argument for when 2.0's tokens are worth it.
The crashing water and the continuous orbit are exactly what separates a flagship from the rest. 1.0 gets you the idea; 2.0 holds the physics and the camera move together — this is what the flagship tokens are for.
4 · The long multi-shot — HappyHorse vs Seedance 2.0
A 15-second, three-shot micro-story using explicit "cut to:" transitions. Both models handle multi-shot continuity, so the comparison is about price and personality: HappyHorse delivers the length for less, while Seedance 2.0 spends more for tighter detail.
Three shots, fifteen seconds, a single render on each. HappyHorse holds the motion and the cuts at a meaningfully lower token cost; Seedance 2.0 keeps faces and fine textures crisper across the shots. Pick by what the clip needs — length and budget, or detail.
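For multi-shot prompts, the transitions can be assembled the same way each time. This is a sketch under assumptions: the guide only says these models understand explicit "cut to:" transitions, so the exact capitalisation and punctuation used here are a guess, `build_multi_shot_prompt` is a hypothetical helper, and the three shots in the usage example are made-up placeholders rather than the prompt rendered for this comparison.

```python
# Models the guide describes as understanding explicit "cut to:" transitions.
MULTI_SHOT_MODELS = {"Seedance 2.0", "HappyHorse"}


def build_multi_shot_prompt(shots: list[str], model: str) -> str:
    """Join per-shot descriptions with explicit "Cut to:" transitions."""
    if model not in MULTI_SHOT_MODELS:
        raise ValueError(f"{model} is not listed as a multi-shot model")
    # Drop empty entries and trailing periods so each join reads as one sentence.
    cleaned = [shot.strip().rstrip(".") for shot in shots if shot.strip()]
    if not cleaned:
        raise ValueError("at least one shot description is required")
    return ". Cut to: ".join(cleaned) + "."


if __name__ == "__main__":
    # Placeholder shots for illustration only.
    print(build_multi_shot_prompt(
        [
            "Wide static shot of a lighthouse at blue hour, waves crashing below",
            "Macro shot of the lamp turning, volumetric light sweeping through mist",
            "Aerial drone shot slowly orbiting the tower, real-time",
        ],
        model="HappyHorse",
    ))
```

Keeping each shot as its own list entry also makes it easy to re-render one shot on a draft model before committing to the full sequence.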
Paying with QTUM — how billing works
Qtum.ai deliberately skips the subscription playbook. There's no monthly plan, no seat license, no token packs that expire at the end of the quarter. The model is simpler:
- Tokens are the unit of work. Every render quotes its token cost up front — based on model, clip length and resolution — and deducts exactly that.
- QTUM is the primary way to buy tokens. Top up your balance straight from your wallet — MetaMask Snap makes this a few clicks — and the tokens land in your account. No card on file, no fiat checkout, no billing relationship to manage.
- You start with 500 free tokens. Enough to render real clips on real models, including the flagship, before you've spent anything at all.
Pay-as-you-go does something subtle to how you work: because there's no monthly quota to "use up," there's no pressure to over-render — and because flagship renders visibly cost more than drafts, the draft-cheap / finish-expensive workflow from this guide isn't just good practice, it's the economically obvious one.
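The arithmetic behind that workflow is easy to check. The sketch below assumes the typical per-clip figures quoted earlier in this guide (about 40 tokens for a Seedance 1.0 draft and about 220 for a Seedance 2.0 final); actual costs vary with clip length and resolution, and the function names are illustrative only.

```python
DRAFT_TOKENS = 40    # Seedance 1.0, typical default render (approximate)
FINAL_TOKENS = 220   # Seedance 2.0, typical default render (approximate)


def draft_then_final_cost(iterations: int, draft_tokens: int = DRAFT_TOKENS,
                          final_tokens: int = FINAL_TOKENS) -> int:
    """Tokens spent iterating on the draft model, then rendering one final clip."""
    return iterations * draft_tokens + final_tokens


def flagship_only_cost(iterations: int, final_tokens: int = FINAL_TOKENS) -> int:
    """Tokens spent if every iteration and the final all run on the flagship."""
    return (iterations + 1) * final_tokens


if __name__ == "__main__":
    n = 5
    print(f"{n} drafts + 1 final: {draft_then_final_cost(n)} tokens")
    print(f"{n + 1} flagship renders: {flagship_only_cost(n)} tokens")
```

Under those assumed figures, five drafts plus one flagship final come to 420 tokens, which fits inside the 500 free tokens, while six flagship renders would come to 1,320.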
Qtum: the network behind the prompt box
Qtum.ai isn't a standalone startup renting GPUs — it's the media-generation layer of a network that's been running production traffic since 2017.
A blockchain running without downtime since 2017
Qtum's underlying blockchain launched in 2017 and hasn't had downtime since. The network combines Bitcoin's UTXO security model with EVM-compatible smart contracts, runs on Proof-of-Stake consensus, and has shipped 50 core updates since launch without breaking the chain. That operational record is what backs the token billing your renders run on.
The Qtum ecosystem
- Qtum blockchain: a Bitcoin-style UTXO model with Ethereum-compatible smart contracts, running since 2017.
- Qtum.ai: the AI video generation covered here, with four models and pay-as-you-go billing in QTUM.
- Qtum AI Router: a unified, OpenAI- and Anthropic-compatible inference layer across multiple LLM providers, billed in QTUM credits.
- Qtum Ally: a desktop AI agent that integrates multiple LLMs out of the box.
- GPU infrastructure: the compute layer powering inference and video generation workloads across the network.
The takeaway: your QTUM balance is one account across the whole stack — render video today, route LLM traffic tomorrow, same asset, same wallet.
Render your first clip
Pick a prompt off this page — they're written to be stolen — choose a model, and watch it come back as footage. 500 free tokens on signup, no credit card, no subscription, pay as you go in QTUM.