IntelliVerse AI Gateway · prices Firecrawl-verified 2026-07-04

Every AI model. One API key. Prices you can audit.

Claude Sonnet & Opus. GPT-5.4. Gemini 3 Flash. DeepSeek V4. Kimi K2.6. Qwen3. Plus Veo 3.1, Sora 2, Seedance 2.0 and Kling 3.0 video, FLUX and Nano Banana images, 120-second long video, AI music, Whisper, TTS and embeddings — behind one platform that never goes down. Chat from $0.24 per million tokens, video clips from $0.10 per second, and every single price traced to the provider's own pricing page.

5-minute integration · no credit card to talk to us · keys usually issued same day

35+ model routes, one key
120s single-shot long video, self-hosted
$0.24 per 1M tokens, cheapest chat tier
100% of prices cited to official sources

You're probably overpaying for LLM calls. Here's the math.

Chat that costs 25x less than frontier

Qwen3-30B handles the bulk of chat, summaries and extraction at $0.24/$1.00 per 1M tokens — versus $6/$30 for a frontier model doing the same job. Route smart, keep frontier for the 5% that needs it.

model: "selfhosted-chat"
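
The routing math can be checked by hand. What follows is a minimal sketch, assuming a hypothetical monthly workload of 100M input and 20M output tokens and a 95/5 split between selfhosted-chat and a frontier model at the platform prices quoted on this page. The type and function names are illustrative only and are not part of the gateway API.

```typescript
// Platform prices in USD per 1M tokens, as quoted on this page.
interface Price { input: number; output: number }

const SELFHOSTED_CHAT: Price = { input: 0.24, output: 1.0 };
const FRONTIER: Price = { input: 6.0, output: 30.0 };

// Cost in USD for a token volume given in millions of tokens.
function cost(p: Price, inputM: number, outputM: number): number {
  return p.input * inputM + p.output * outputM;
}

// Blended cost when `frontierShare` (0..1) of the traffic goes to the frontier model.
function blendedCost(frontierShare: number, inputM: number, outputM: number): number {
  const cheap = cost(SELFHOSTED_CHAT, inputM, outputM) * (1 - frontierShare);
  const frontier = cost(FRONTIER, inputM, outputM) * frontierShare;
  return cheap + frontier;
}

// Hypothetical workload: 100M input + 20M output tokens per month.
const allFrontier = cost(FRONTIER, 100, 20); // 600 + 600 = 1200 USD
const routed = blendedCost(0.05, 100, 20);   // 0.95 * 44 + 0.05 * 1200, roughly 101.8 USD

console.log(allFrontier.toFixed(2), routed.toFixed(2));
```

With these assumed volumes, sending 5% of traffic to the frontier model costs roughly a twelfth of sending all of it there; your own split and token mix will move that ratio.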

1M-context coding for $2.40/M out

MiniMax-M3 (428B MoE) passed our 17-prompt coding eval at 94% and reads your whole repo in one 1M-token window — at a fraction of frontier output pricing.

model: "minimax-coder-pro"

Transcribe 1 hour of audio for $0.22

Whisper Large V3 at 217x realtime, self-hosted first with Groq fallback. The same hour through OpenAI's gpt-4o-transcribe list price runs ~$0.36 upstream — before you build any failover.

model: "whisper-1"

For app & game builders

The full media stack — images, video, long video and music behind the same account.

The same engine that renders our own game cinematics and marketing videos is open to you: self-hosted GPUs answer first at near-zero marginal cost, and cloud frontier models fill in automatically by duration, budget and reference needs.

Character-consistent cinematics

Seedance 2.0 takes up to 9 reference images, so your hero looks identical across every cut-scene. Veo 3.1 and Sora 2 bake in synced dialogue and SFX.

model: "seedance-2"

Store creatives & UGC ads at $0.20/s

Veo 3.1 Fast renders app-store and TikTok-format clips with native audio at a fraction of standard Veo — and falls back to Seedance Fast, Kling and Wan on load.

model: "veo-3.1-fast"

120-second shots no hosted API sells

Self-hosted SkyReels-V2 and FramePack generate continuous 60–120s takes; assembly pipelines stitch them into narrated 10–60 minute videos with captions.

model: "skyreels-v2"

Sprites, textures & loopable BGM

Self-hosted FLUX for game art and Nano Banana 2 for text-in-image thumbnails, plus ACE-Step music that loops cleanly — all with scale-to-zero economics.

model: "flux-dev"

The only pricing model you can actually verify

1 · Verified upstream list price

Every price on this page was scraped from the provider's official pricing page via Firecrawl on 2026-07-04 and is cited in the Sources section. Click any [source] link and check us.

2 · You pay exactly 2x

Proxied models bill at 2x provider list. Self-hosted GPU models bill at 2x the cheapest comparable hosted rate — which still lands far below frontier pricing. No tiers, no seats, no minimums.

3 · The margin works for you

Failover chains that always answer, warm-on-traffic GPU scaling, Bedrock prompt caching (cache reads at 0.1x upstream), Langfuse tracing on every call, per-key daily budgets and spend alerts.
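
The 2x rule in step 2 is easy to verify yourself. Here is a minimal sketch, assuming prices are expressed in USD per 1M tokens. `platformPrice` and `requestCost` are hypothetical helper names used for illustration, and cache-read discounts are left out.

```typescript
// USD per 1M tokens.
interface TokenPrice { input: number; output: number }

// "You pay exactly 2x" the upstream list price.
const MARKUP = 2;

function platformPrice(upstream: TokenPrice): TokenPrice {
  return { input: upstream.input * MARKUP, output: upstream.output * MARKUP };
}

// Cost in USD of one request, given raw token counts.
function requestCost(price: TokenPrice, inputTokens: number, outputTokens: number): number {
  return (price.input * inputTokens + price.output * outputTokens) / 1_000_000;
}

// Claude Sonnet 4.6 upstream list price as cited in the Sources section.
const sonnetUpstream: TokenPrice = { input: 3, output: 15 };
const sonnetPlatform = platformPrice(sonnetUpstream); // { input: 6, output: 30 }

// A request with 2,000 input tokens and 500 output tokens: (12000 + 15000) / 1e6 USD.
const exampleCost = requestCost(sonnetPlatform, 2_000, 500);
console.log(sonnetPlatform, exampleCost.toFixed(4));
```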

The full menu — 35+ routes, 11 tiers

Prices shown as input / output. Platform is what you pay; Upstream is the provider list price it is derived from. Send the alias as the model field — nothing else changes.

Self-Hosted GPU Tier

Qwen-family models served by vLLM on in-cluster GPUs with scale-to-zero. Priced at 2x the cheapest comparable hosted rate — external fallbacks answer instantly while GPUs warm.

Alias | Served by | Modality | Context | Platform price | Upstream list
selfhosted-chat

Primary chat workhorse. Also answers as qwen3-30b, qwen3-chat, selfhosted-primary.

Qwen3-30B-A3B-AWQ · in-cluster vLLM

Fallbacks: OpenAI bridge → Claude Haiku → Kimi K2

text | 32K | $0.24 / $1.00 per 1M tokens | $0.12 / $0.50 [source]
selfhosted-voice

Low-latency tier for chatboxes, games and voice. Aliases: selfhosted-fast, qwen3-8b, Qwen3-30B-A3B.

Qwen3-30B-A3B · in-cluster vLLM (always-fast tier)

Fallbacks: DeepInfra → OpenRouter → SiliconFlow → Fireworks → Haiku

text | 8K | $0.24 / $1.00 per 1M tokens | $0.12 / $0.50 [source]
selfhosted-reasoner

Deliberate chain-of-thought reasoning. Alias: qwq-32b.

QwQ-32B · in-cluster vLLM

Fallbacks: OpenAI pro bridge → Claude → Kimi K2

text (reasoning) | 32K | $0.24 / $1.00 per 1M tokens | $0.12 / $0.50 [source]
qwen3-omni

Multimodal omni model, warm-on-demand.

Qwen3-Omni-30B · in-cluster vLLM

Fallbacks: OpenAI pro bridge → Claude → Kimi K2

text + audio + vision | 32K | $0.24 / $1.00 per 1M tokens | $0.12 / $0.50 [source]
qwen3-coder

Qwen3-Coder · in-cluster vLLM

Fallbacks: OpenAI bridge → Kimi K2

text (code) | 32K | $0.24 / $1.00 per 1M tokens | $0.12 / $0.50 [source]

Pro Tier

Large mixture-of-experts models for frontier-class quality at open-model prices.

Alias | Served by | Modality | Context | Platform price | Upstream list
selfhosted-chat-pro

Frontier-class open model. Alias: qwen3-122b.

Qwen3.5-122B-A10B (122B MoE)

Fallbacks: OpenAI pro bridge → Claude Fable → Haiku → Kimi K2

text (thinking + tools, 201 languages) | 262K | $0.58 / $4.80 per 1M tokens | $0.29 / $2.40 [source]
minimax-coder-pro

Passed our 17-prompt coding eval at 94%. 1M-token context.

MiniMax-M3 (428B MoE, 23B active)

Fallbacks: OpenAI bridge → Kimi K2

text (code + agents) | 1M | $0.60 / $2.40 per 1M tokens | $0.30 / $1.20 [source]

Frontier Tier (Claude via Bedrock)

Anthropic Claude with automatic prompt caching (cache reads billed at 0.1x input upstream).

Alias | Served by | Modality | Context | Platform price | Upstream list
claude-sonnet

Daily-driver frontier model with prompt caching (cache reads at 0.1x). Aliases: anthropic/claude-sonnet-4.6, sonnet5, claude-fable, fable5, o3, gpt-4o.

Claude Sonnet 4.6 · AWS Bedrock

Fallbacks: Opus → Haiku → Kimi K2 → OpenAI

text + vision + tools | 200K | $6 / $30 per 1M tokens | $3 / $15 [source]
claude-opus

Top-tier reasoning. Aliases: anthropic/claude-opus-4.6, opus6, o1.

Claude Opus 4.6 · AWS Bedrock

Fallbacks: Haiku → Kimi K2 → OpenAI

text + vision + tools | 200K | $10 / $50 per 1M tokens | $5 / $25 [source]
claude-haiku

Fast frontier tier. anthropic/claude-haiku-4.5 and gpt-4o-mini serve from the self-hosted fast tier first, with Haiku as fallback.

Claude Haiku 4.5 · AWS Bedrock

Fallbacks: Kimi K2 → OpenAI

text + vision | 200K | $2 / $10 per 1M tokens | $1 / $5 [source]

Fast External Tier

Independent providers used both directly and as cold-start bridges for the GPU tiers.

Alias | Served by | Modality | Context | Platform price | Upstream list
kimi-k2

Native multimodal, thinking + non-thinking. Aliases: kimi-k2.6, moonshot-v1-auto, gpt-4-turbo.

Kimi K2.6 · Moonshot AI

Fallbacks: terminal tier (always answers)

text + image + video input | 256K | $1.90 / $8 per 1M tokens | $0.95 / $4 [source]
deepseek/deepseek-chat

1M context at commodity pricing. deepseek-v4-pro also available ($0.87/1M out upstream).

DeepSeek V4-Flash

Fallbacks: wildcard route — any deepseek/* model id

text (thinking optional) | 1M | $0.28 / $0.56 per 1M tokens | $0.14 / $0.28 [source]
gemini/gemini-3-flash-preview

Grounding with Google Search supported upstream.

Gemini 3 Flash · Google AI

Fallbacks: wildcard route — any gemini/* model id

text + image + video + audio | 1M | $1 / $6 per 1M tokens | $0.50 / $3 [source]
openrouter/*

Escape hatch to virtually every hosted open model (e.g. Qwen3-30B from $0.048/$0.193 upstream).

400+ models · OpenRouter

Fallbacks: wildcard route — any openrouter/* model id

varies | varies | priced per routed model, 2x upstream | varies [source]

OpenAI Tier

Direct OpenAI routes, including the guaranteed terminal fallback for every chain.

Alias | Served by | Modality | Context | Platform price | Upstream list
gpt-5.4

gpt-4.1 / gpt-4.1-mini / gpt-4.1-nano also routed.

OpenAI GPT-5.4

Fallbacks: direct route

text + vision + tools | 400K | $5 / $30 per 1M tokens | $2.50 / $15 [source]

Image Generation

Self-hosted FLUX plus Gemini Nano Banana 2 and gpt-image-1.5, smart-routed with automatic fallbacks. Sprites, textures, storyboards, thumbnails and store creatives.

Alias | Served by | Modality | Context | Platform price | Upstream list
gpt-image-1.5

OpenAI gpt-image-1.5

Fallbacks: gpt-image-1 also routed

image generation + editing | - | $16 / $64 per 1M image tokens | $8 / $32 [source]
nano-banana-2

Primary storyboard & sprite engine in our own pipelines. 2K/4K output supported upstream.

Gemini 3.1 Flash Image · Google AI

Fallbacks: FLUX self-hosted → nano-banana-pro

image generation + editing (text-in-image, multi-turn) | - | $0.134 per 1K-res image | $0.067 [source]
flux-dev

Self-hosted with scale-to-zero — priced vs the cheapest hosted FLUX.2 [klein] rate. Game sprites, textures and marketing stills.

FLUX.1 dev/schnell · in-cluster ComfyUI GPUs

Fallbacks: Gemini Nano Banana 2 → PiAPI FLUX

image generation (txt2img, img2img, LoRA) | - | $0.03 per image (1MP) | $0.015 [source]

Video Generation (Clips)

Veo 3.1, Sora 2, Seedance 2.0, Kling 3.0, Wan 2.2 and LTX-2 — routed by duration, budget and reference needs through the Media Engine. Native-audio options for trailers, UGC ads and in-game cinematics.

Alias | Served by | Modality | Context | Platform price | Upstream list
veo-3.1

Cinematic quality with synced dialogue & SFX baked in.

Google Veo 3.1 (native audio)

Fallbacks: veo-3.1-fast → Seedance → Kling

text-to-video + image-to-video, 720p–4K | ≤8s/clip | $0.80 per sec of video | $0.40 [source]
veo-3.1-fast

Our default shorts engine — 4x cheaper than Veo standard. veo-3.1-lite ($0.05/s upstream) also routed.

Google Veo 3.1 Fast

Fallbacks: Seedance fast → Kling → Wan

text-to-video + image-to-video, 720p–4K | ≤8s/clip | $0.20 per sec of video | $0.10 [source]
sora-2

sora-2-pro ($0.30/s upstream at 720p, $0.50/s at 1024p) also routed.

OpenAI Sora 2

Fallbacks: Veo 3.1 → Seedance

text-to-video with audio, up to ~12s | ≤12s/clip | $0.20 per sec of video | $0.10 [source]
seedance-2

Up to 9 reference images for character-consistent shots — the workhorse for game cinematics. Fast tier at $0.24/s upstream.

ByteDance Seedance 2.0

Fallbacks: seedance-2-fast → Kling → Wan

text/image/reference-to-video, up to 15s | ≤15s/clip | $0.607 per sec (720p + audio) | $0.3034 [source]
kling-3.0

Motion brush + element consistency (1–4 refs). $0.112/s upstream with audio. Kling turbo also routed for 5s budget clips.

Kuaishou Kling 3.0

Fallbacks: Wan → Hailuo

text/image-to-video with camera control | ≤10s/clip | $0.168 per sec (audio off) | $0.084 [source]
wan-2.2

Self-hosted with scale-to-zero — priced vs the cheapest hosted Wan rate. Cheapest clip tier on the platform; Wan 2.6, Hailuo and Hunyuan cloud routes behind it.

Wan2.2 TI2V-5B · in-cluster GPUs

Fallbacks: Wan 2.6 cloud → Hailuo → Hunyuan

text/image-to-video, 720p | ≤8s/clip | $0.10 per sec of video | $0.05 [source]
ltx-2

Self-hosted, open-weights — priced vs the hosted LTX-2 rate.

Lightricks LTX-2 19B · in-cluster GPUs

Fallbacks: Wan 2.2 → cloud clip tier

text/image-to-video with native audio, up to 4K | ≤10s/clip | $0.12 per sec (1080p) | $0.06 [source]
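
To make "routed by duration, budget and reference needs" concrete, here is a minimal sketch that picks the cheapest clip route for a given length and budget. It uses the platform prices and clip-length limits from the table above. The selection logic is an assumption for illustration, it ignores reference-image needs, and it is not the Media Engine's actual routing code.

```typescript
interface ClipRoute {
  alias: string;
  maxSeconds: number;  // longest clip the route accepts, from the Context column
  pricePerSec: number; // platform price in USD per second of video
}

// Prices are the listed headline rates; they vary with resolution and audio settings.
const ROUTES: ClipRoute[] = [
  { alias: "veo-3.1", maxSeconds: 8, pricePerSec: 0.8 },
  { alias: "veo-3.1-fast", maxSeconds: 8, pricePerSec: 0.2 },
  { alias: "sora-2", maxSeconds: 12, pricePerSec: 0.2 },
  { alias: "seedance-2", maxSeconds: 15, pricePerSec: 0.607 },
  { alias: "kling-3.0", maxSeconds: 10, pricePerSec: 0.168 },
  { alias: "wan-2.2", maxSeconds: 8, pricePerSec: 0.1 },
  { alias: "ltx-2", maxSeconds: 10, pricePerSec: 0.12 },
];

// Hypothetical selector: cheapest route that can render `seconds` within `budgetUsd`.
function cheapestRoute(seconds: number, budgetUsd: number): ClipRoute | undefined {
  return ROUTES
    .filter((r) => seconds <= r.maxSeconds && r.pricePerSec * seconds <= budgetUsd)
    .sort((a, b) => a.pricePerSec - b.pricePerSec)[0];
}

console.log(cheapestRoute(6, 5)?.alias);  // wan-2.2
console.log(cheapestRoute(12, 5)?.alias); // sora-2
```

In practice you send the alias you want as the model field and let the gateway handle fallbacks; a selector like this is only useful for estimating cost before you commit to a route.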

Long Video (30s–120s shots, minutes-long assembly)

Self-hosted FramePack and SkyReels-V2 generate continuous 60–120 second shots that hosted APIs can't — and assembly pipelines stitch them into narrated 10–60 minute videos with TTS, music and captions.

Alias | Served by | Modality | Context | Platform price | Upstream list
framepack

Anti-drift long takes — no hosted API offers this length; priced vs the cheapest hosted per-second clip rate.

FramePack (HunyuanVideo) · in-cluster GPUs

Fallbacks: FramePack F1 → cloud FramePack (10–30s)

image-to-video, continuous 60s+ shots | 60s+/shot | $0.10 per sec of video | $0.05 [source]
skyreels-v2

Diffusion-forcing for infinite-length generation. Long-form pipelines stitch these into narrated 10–60 minute videos with TTS, music and captions.

SkyReels-V2-DF-14B 720p · in-cluster GPUs

Fallbacks: FramePack → clip tier + stitching

text/image-to-video, up to 120s | ≤120s/shot | $0.10 per sec of video | $0.05 [source]
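
As a rough worked example of the assembly arithmetic, the sketch below assumes a long-form video is built entirely from maximum-length skyreels-v2 shots at the $0.10/s platform rate listed above. It ignores TTS, music and caption costs, and `planLongVideo` is a hypothetical helper name.

```typescript
const MAX_SHOT_SECONDS = 120; // skyreels-v2 per-shot limit from the table
const PRICE_PER_SECOND = 0.1; // platform price in USD per second of video

// Number of shots to stitch and the video-generation cost for a target runtime.
function planLongVideo(totalMinutes: number): { shots: number; videoCostUsd: number } {
  const totalSeconds = totalMinutes * 60;
  return {
    shots: Math.ceil(totalSeconds / MAX_SHOT_SECONDS),
    videoCostUsd: totalSeconds * PRICE_PER_SECOND,
  };
}

// A 10-minute video: 600 seconds, so 5 shots and about 60 USD of video generation.
const tenMinutes = planLongVideo(10);
console.log(tenMinutes.shots, tenMinutes.videoCostUsd.toFixed(2));
```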

Music Generation

Self-hosted ACE-Step for zero-marginal-cost background music, with Google Lyria 3 as the hosted route. Loopable BGM for games, scored scenes for video.

Alias | Served by | Modality | Context | Platform price | Upstream list
ace-step

Self-hosted, open-weights — priced vs Lyria 3 Clip. Loopable game BGM and per-scene scoring.

ACE-Step 1.5 · in-cluster GPUs

Fallbacks: Lyria 3 → hosted music tier

text-to-music (BGM, stems, vocals) | - | $0.08 per 30s track | $0.04 [source]
lyria-3

Google Lyria 3

Fallbacks: ACE-Step self-hosted

text-to-music | - | $0.08 per 30s clip / $0.16 per full song | $0.04 / $0.08 [source]

Speech & Audio

Speech-to-text and text-to-speech — same endpoint, same key.

Alias | Served by | Modality | Context | Platform price | Upstream list
gpt-4o-transcribe

OpenAI transcription

Fallbacks: gpt-4o-mini-transcribe ($0.003/min upstream) also routed

speech-to-text | - | $0.012 per min of audio | $0.006 [source]
whisper-1

Powers the Meeting Transcripts hub and voice pipeline.

Self-hosted Whisper Large V3 → Groq → OpenAI

Fallbacks: in-cluster STT first (zero marginal cost), Groq Whisper V3 fallback

speech-to-text | - | $0.222 per hour of audio | $0.111 [source]
tts-1

OpenAI TTS (+ gpt-4o-mini-tts)

Fallbacks: direct route

text-to-speech | - | $30 per 1M chars | $15 [source]
hexgrad/Kokoro-82M

24x cheaper than tts-1 upstream — used for audiobooks.

Kokoro-82M TTS · DeepInfra

Fallbacks: direct route

text-to-speech (82M, natural voices) | - | $1.24 per 1M chars | $0.62 [source]

Embeddings

Vector embeddings for search and RAG.

Alias | Served by | Modality | Context | Platform price | Upstream list
text-embedding-3-small

OpenAI embeddings

Fallbacks: direct route

embeddings (1536-dim) | 8K | $0.04 per 1M tokens | $0.02 [source]
text-embedding-3-large

OpenAI embeddings

Fallbacks: direct route

embeddings (3072-dim) | 8K | $0.26 per 1M tokens | $0.13 [source]

Don't see your model? The openrouter/* wildcard reaches 400+ more.

Battle-tested by our own AI products

We are the gateway's biggest customer. Everything below runs in production today and routes its AI traffic through the same endpoint you'd use — so it inherits fallbacks, budgets and tracing for free.

AI Gateway (LiteLLM)

One OpenAI-compatible endpoint for every model above. Automatic multi-provider fallbacks, warm-on-traffic GPU scaling, per-key budgets, and full Langfuse tracing on every call — successes and failures.

OpenAI-compatible · Anthropic-compatible · fallbacks · budgets

Meeting Transcripts Hub

Multi-provider meeting intelligence: webhooks for Wave, Fireflies, Krisp and Nylas, email ingest for Marblism/Eva, plus self-hosted Whisper transcription and LLM summarization with encrypted token storage.

webhooks · STT · summaries · AES-256-GCM

Voice Pipeline

Self-hosted speech-to-text (Whisper Large V3) and fast LLM tier with KEDA scale-to-zero. Wakes on demand via Redis triggers; requests are served instantly by external fallbacks while GPUs warm.

Whisper · scale-to-zero · KEDA

Media Engine (Content Factory)

Unified image, video, long-video and music generation with smart routing: self-hosted GPUs (FLUX, Wan 2.2, LTX-2, FramePack, SkyReels-V2, ACE-Step) answer first, cloud models (Veo, Sora, Seedance, Kling, Nano Banana) fill in by duration, budget and reference needs. Full pipelines assemble narrated 10–60 minute videos.

image · video · long video · music · smart routing

Ads Campaign Engine

AI-driven campaign creation and optimization for the kiosk and ads networks, with LLM-generated creative and targeting.

ads · creative generation

AI Microservice (ai-svc)

Shared AI backend for apps and games: chat, image generation, transcription and audiobook TTS with provider fallback chains — every call traced and budgeted through the gateway.

chat · images · TTS · STT

Automation (n8n)

Workflow automations — email ingestion, format agents, Discord alerting — with all LLM and TTS steps routed through the gateway.

workflows · email ingest

Observability (Langfuse + spend logs)

Every request logged with model, latency, tokens and cost. Daily cost rollups with Discord alerts. Per-key daily budgets enforced at the gateway.

tracing · cost alerts

Ship in 5 minutes — no new SDK to learn

The gateway speaks the OpenAI API (chat, embeddings, audio, images) and the Anthropic Messages API. Point your existing SDK at the base URL and swap the model name — apps, games, agents, LangChain, n8n workflows and IDEs all work unchanged. Full API documentation →

curl
curl https://litellm.intelli-verse-x.ai/v1/chat/completions \
  -H "Authorization: Bearer $INTELLIVERSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "selfhosted-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI SDK (TypeScript)
import OpenAI from "openai";

// Point the official OpenAI SDK at the gateway instead of api.openai.com.
const client = new OpenAI({
  baseURL: "https://litellm.intelli-verse-x.ai/v1",
  apiKey: process.env.INTELLIVERSE_API_KEY,
});

const res = await client.chat.completions.create({
  model: "claude-sonnet",        // or any alias on this page
  messages: [{ role: "user", content: "Ship it." }],
});

// Print the assistant's reply.
console.log(res.choices[0].message.content);

Frequently asked questions

Can I use my existing OpenAI SDK / LangChain / Vercel AI SDK code?

Yes. The gateway is a drop-in OpenAI-compatible endpoint (chat completions, streaming, embeddings, audio, images) and also speaks the Anthropic Messages API. Change the base URL and API key — nothing else. LangChain, LlamaIndex, the Vercel AI SDK, Cursor, Cline and n8n all work unchanged.

Which models can I call?

Claude Sonnet 4.6, Claude Opus 4.6, Claude Haiku 4.5, GPT-5.4 and the GPT-4.1 family, Gemini 3 Flash (any gemini/* id), DeepSeek V4 Flash and Pro (any deepseek/* id), Kimi K2.6, Qwen3-30B, Qwen3.5-122B, QwQ-32B reasoner, Qwen3-Omni, MiniMax-M3 for coding, plus 400+ open models through the openrouter/* wildcard. Whisper speech-to-text, OpenAI and Kokoro text-to-speech, and OpenAI embeddings are on the same key. Media generation adds gpt-image-1.5, Gemini Nano Banana 2 and self-hosted FLUX for images; Veo 3.1, Sora 2, Seedance 2.0, Kling 3.0, Wan 2.2 and LTX-2 for video clips; FramePack and SkyReels-V2 for 60–120 second long video; and ACE-Step plus Lyria 3 for music.

Can I generate video and long-form video for my app or game?

Yes. Short clips (3–15s) route across Veo 3.1, Sora 2, Seedance 2.0, Kling 3.0, Wan 2.2 and LTX-2 — picked by duration, budget and how many character-reference images you need. For long video, self-hosted FramePack and SkyReels-V2 generate continuous 60–120 second shots, and assembly pipelines stitch clips, narration, music and captions into 10–60 minute videos. Clip pricing starts at $0.10 per second of output ($0.05/s upstream, 2x rule).

What happens when a provider has an outage?

Nothing, from your side. Every alias has a multi-provider failover chain that terminates in a provider that always answers. If a self-hosted GPU tier is cold, an external provider answers instantly while the GPU warms in parallel — then traffic flips back automatically.
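
Conceptually, a failover chain tries each provider in order and returns the first answer it gets. The gateway does this server-side, so you never write this code yourself. The sketch below only illustrates the idea with hypothetical stand-in providers and is not the gateway's implementation.

```typescript
// A provider is anything that can answer a prompt or throw.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

// Try each provider in order; return the first success, or fail with all collected errors.
async function callWithFallback(
  chain: Provider[],
  prompt: string,
): Promise<{ servedBy: string; text: string }> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      const text = await provider.call(prompt);
      return { servedBy: provider.name, text };
    } catch (err) {
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// Hypothetical stand-ins: a cold self-hosted tier followed by an external bridge.
const demoChain: Provider[] = [
  { name: "self-hosted", call: async () => { throw new Error("GPU is cold"); } },
  { name: "external-bridge", call: async (prompt) => `answer to: ${prompt}` },
];

callWithFallback(demoChain, "Hello!").then((r) => console.log(r.servedBy)); // external-bridge
```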

How is pricing calculated?

You pay exactly 2x the provider's public list price (for self-hosted models, 2x the cheapest comparable hosted rate). Every upstream price on this page was scraped from the provider's official pricing page and is cited with a link and a verification date. No hidden markups, no per-seat fees, no minimums.

Why pay 2x instead of going direct?

One key instead of nine provider accounts, automatic failover so you never ship an outage, Bedrock prompt caching passed through (cache reads at 0.1x input), per-key daily budgets so a bug can't burn your wallet, and full request tracing. Most teams spend more than the margin on the first provider outage they didn't handle.

Do you support streaming?

Yes — server-sent events on chat completions and the Anthropic Messages endpoint, identical to the upstream format your SDK already parses.
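
If you parse the stream by hand rather than through an SDK, the sketch below shows the shape of the work for OpenAI-style chat completion chunks, where each event is a `data: {json}` line and the stream ends with `data: [DONE]`. It assumes the whole SSE payload has already been decoded into a string; a real client would parse incrementally and handle partial lines. The Anthropic Messages stream uses a different event format and is not covered here.

```typescript
// Minimal shape of an OpenAI-style streaming chunk; only the fields used here.
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

// Concatenate the text deltas from a complete SSE payload.
function collectStreamText(ssePayload: string): string {
  let text = "";
  for (const line of ssePayload.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const data = trimmed.slice("data:".length).trim();
    if (data === "[DONE]") break;               // end-of-stream sentinel
    const chunk = JSON.parse(data) as ChatChunk;
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}

// Hand-written sample payload for illustration.
const sample = [
  'data: {"choices":[{"delta":{"role":"assistant"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(collectStreamText(sample)); // Hello
```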

Can I set spending limits?

Every API key carries a daily budget enforced at the gateway. When it's exhausted you get a clean 429 with a budget error — not a surprise invoice. Budgets and usage are visible per key.
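
On the client side, the useful distinction is between an exhausted budget and other failures. Here is a minimal sketch, assuming the gateway answers an exhausted budget with HTTP 429 as described above. The error-body shape is not documented on this page, so the code keys off the status only and surfaces every 429 as a budget error without retrying. `postChat` and `BudgetExhaustedError` are hypothetical names, and `fetchImpl` is injected so the function can be exercised without network access.

```typescript
interface HttpResponse { status: number; text(): Promise<string> }
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponse>;

// Raised when the gateway answers 429; the caller should stop, not retry in a loop.
class BudgetExhaustedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "BudgetExhaustedError";
  }
}

// Send one chat completion request and return the raw JSON response text.
async function postChat(
  fetchImpl: FetchLike,
  apiKey: string,
  model: string,
  content: string,
): Promise<string> {
  const res = await fetchImpl("https://litellm.intelli-verse-x.ai/v1/chat/completions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages: [{ role: "user", content }] }),
  });
  const body = await res.text();
  if (res.status === 429) throw new BudgetExhaustedError(body);
  if (res.status >= 400) throw new Error(`gateway error ${res.status}: ${body}`);
  return body;
}
```

In an application you would pass a thin wrapper around your platform's fetch as `fetchImpl` and catch `BudgetExhaustedError` to pause the feature or alert an operator.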

How fast can I get a key?

Email support@intelli-verse-x.ai with your use case and expected volume — keys are usually issued the same day with a starter budget.

Stop juggling nine provider dashboards.

One key. Every model. Automatic failover. Auditable pricing. Your first integration is a base-URL change away.

Sources

Upstream list prices were captured from the following official provider pages via Firecrawl on 2026-07-04. Platform prices are exactly 2x these figures at time of verification; upstream providers may change their pricing at any time.

  [1] Anthropic: Claude Opus 4.6 $5/$25, Sonnet 4.6 $3/$15, Haiku 4.5 $1/$5 per 1M tokens. https://www.anthropic.com/pricing
  [2] OpenAI: gpt-5.5 $5/$30, gpt-5.4 $2.50/$15 per 1M tokens; gpt-image-1.5 $8/$32 image tokens; gpt-4o-transcribe ~$0.006/min. https://platform.openai.com/docs/pricing
  [3] OpenAI: tts-1 $15.00 per 1M characters. https://platform.openai.com/docs/models/tts-1
  [4] OpenAI: text-embedding-3-small $0.02, text-embedding-3-large $0.13 per 1M tokens. https://platform.openai.com/docs/models/text-embedding-3-small
  [5] DeepInfra: Qwen3-30B-A3B $0.12/$0.50 per 1M tokens. https://deepinfra.com/pricing
  [6] DeepInfra: Qwen3.5-122B-A10B $0.29/$2.40 per 1M tokens. https://deepinfra.com/Qwen/Qwen3.5-122B-A10B
  [7] DeepInfra: MiniMax-M3 $0.30/$1.20 per 1M tokens (1M context). https://deepinfra.com/MiniMaxAI/MiniMax-M3
  [8] DeepInfra: Kokoro-82M TTS $0.62 per 1M characters. https://deepinfra.com/hexgrad/Kokoro-82M
  [9] DeepSeek: deepseek-v4-flash $0.14/$0.28, deepseek-v4-pro $0.435/$0.87 per 1M tokens (1M context). https://api-docs.deepseek.com/quick_start/pricing
  [10] Google Gemini: Gemini 3 Flash Preview $0.50/$3.00 per 1M tokens. https://ai.google.dev/gemini-api/docs/pricing
  [11] Moonshot AI: Kimi K2.6 $0.95/$4.00 per 1M tokens ($0.16 cache hit, 256K context). https://platform.moonshot.ai/docs/pricing/chat
  [12] Groq: Whisper Large V3 $0.111 per hour of audio transcribed (217x realtime). https://groq.com/pricing
  [13] OpenRouter: Qwen3-30B-A3B-Instruct-2507 from $0.048/$0.193 per 1M tokens. https://openrouter.ai/qwen/qwen3-30b-a3b-instruct-2507
  [14] SiliconFlow: Qwen3.5-122B-A10B $0.26/$2.08 per 1M tokens (262K context). https://www.siliconflow.com/pricing
  [15] Google Gemini: Gemini 3.1 Flash Image (Nano Banana 2) $60/1M image tokens ≈ $0.067 per 1K (1024px) image. https://ai.google.dev/gemini-api/docs/pricing
  [16] Black Forest Labs: FLUX.2 [klein] 9B $0.015 first MP, [pro] $0.03, [max] $0.07 per image (cheapest hosted FLUX comparable). https://bfl.ai/pricing
  [17] Google Gemini (Veo): Veo 3.1 with audio $0.40/s (720p/1080p); Veo 3.1 Fast $0.10/s (720p); Veo 3.1 Lite $0.05/s. https://ai.google.dev/gemini-api/docs/pricing
  [18] OpenAI (Sora): Sora 2 $0.10/s (720p); Sora 2 Pro $0.30/s (720p), $0.50/s (1024p). https://platform.openai.com/docs/pricing
  [19] fal.ai (ByteDance Seedance): Seedance 2.0 $0.3034/s 720p with audio ($0.2419/s fast, $0.682/s 1080p). https://fal.ai/models/bytedance/seedance-2.0/text-to-video
  [20] fal.ai (Kling): Kling 3.0 standard $0.084/s (audio off), $0.112/s with audio. https://fal.ai/models/fal-ai/kling-video/o3/standard/image-to-video
  [21] fal.ai (Wan): Wan 2.5 $0.05/s (480p), $0.10/s (720p), $0.15/s (1080p) (cheapest hosted Wan comparable). https://fal.ai/models/fal-ai/wan-25-preview/text-to-video
  [22] fal.ai (Lightricks LTX-2): LTX-2 $0.06/s (1080p), $0.12/s (1440p) (cheapest hosted LTX comparable). https://fal.ai/models/fal-ai/ltxv-2/text-to-video
  [23] Google Gemini (Lyria): Lyria 3 Clip (30s) $0.04 per song; Lyria 3 Pro (full song) $0.08. https://ai.google.dev/gemini-api/docs/pricing

Model availability and fallback chains reflect the live gateway configuration. For API keys and volume pricing, email support@intelli-verse-x.ai.