Ship, sell, and scale AI models on a single marketplace.
Buyers reach thousands of models and datasets through one gateway and pay only for what they use. Vendors publish in minutes and earn on every request, with metering, billing, and payouts handled for them.
OPEN-WEIGHTS & PROPRIETARY MODELS, READY TO CALL
One endpoint. Every model. Billed per token.
Swap models without rewriting code. The Rust inference gateway authenticates, meters, and routes every call — so you only think about prompts.
Python

    from modelmarket import Client

    mm = Client(api_key="mk_live_•••")

    # call any listing by id
    resp = mm.infer(
        "llama-3.1-70b",
        prompt="Summarise this contract…",
        stream=True,
    )
    for chunk in resp:
        print(chunk.text, end="")

TypeScript

    import { ModelMarket } from "@modelmarket/sdk"

    const mm = new ModelMarket({
      apiKey: process.env.MM_KEY,
    })

    const res = await mm.infer("llama-3.1-70b", {
      prompt: "Summarise this contract…",
    })
    console.log(res.text, res.usage)

curl

    curl https://api.modelmarket.dev/v1/infer \
      -H "Authorization: Bearer mk_live_•••" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama-3.1-70b",
        "prompt": "Summarise this contract…"
      }'
    # → streamed tokens, billed on completion
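If you would rather not install an SDK, the curl call can be mirrored with the Python standard library. The sketch below only builds the request and does not send it. The endpoint and body fields come from the curl example; the helper name `build_infer_request`, the second listing id, and the dummy key fallback are assumptions made for illustration.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example.
INFER_URL = "https://api.modelmarket.dev/v1/infer"


def build_infer_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one listing."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        INFER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# MM_KEY mirrors the variable name used in the TypeScript example;
# the fallback is a dummy value, not a working key.
api_key = os.environ.get("MM_KEY", "dummy-key")

# Swapping models changes only the listing id; the rest of the call is identical.
# "qwen2-7b-instruct" is a hypothetical id used for illustration.
for model_id in ("llama-3.1-70b", "qwen2-7b-instruct"):
    req = build_infer_request(model_id, "Summarise this contract…", api_key)
    print(req.get_method(), req.full_url, json.loads(req.data)["model"])
```

Passing the request to `urllib.request.urlopen` would send it. The response format is not described here, so parsing is left out.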
Thousands of models. Already running.
Browse open-weights releases the moment they drop, alongside specialised models and curated datasets from independent vendors — every one callable through the same key.
Llama 3.1 70B
Open instruction-tuned model. Strong reasoning, 128K context, fully self-hostable.
Mistral Large
Top-tier proprietary model for complex reasoning, coding, and multilingual tasks.
Whisper Large v3
Robust speech-to-text across 99 languages with timestamping.
Stable Diffusion XL
High-resolution text-to-image generation with fine-grained style control.
FinSentiment v2
Specialised financial-news classifier tuned on 12M annotated market headlines.
Embed v3 Multilingual
1024-dim embeddings for retrieval and semantic search across 100+ languages.
Common Crawl Curated
Deduplicated, licence-filtered web text in Parquet — provenance docs included.
Qwen2 7B Instruct
Compact, fast open model with strong coding and tool-use performance.
MedImage-Seg
Radiology segmentation model with audited provenance and licence agreement.
Two sides, one platform.
Whether you're consuming models or monetising them, you're live in three steps.
Discover
Search the catalogue by modality, task, licence, or price. Compare model cards, benchmarks, and sample outputs before you spend a cent.
Get a key
Generate a scoped API key with built-in spend caps and quotas. Top up prepaid credits or get invoiced monthly.
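To give a feel for how a spend cap might behave, here is a minimal sketch. The `ScopedKey` class, its fields, and the amounts are hypothetical and do not describe the platform's actual key schema or enforcement logic.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ScopedKey:
    """Hypothetical model of an API key with a spend cap."""
    key_id: str
    spend_cap: Decimal               # maximum total spend allowed for this key
    spent: Decimal = Decimal("0")    # running total of authorised charges

    def authorise(self, estimated_cost: Decimal) -> bool:
        """Record the charge and return True if it fits under the cap."""
        if self.spent + estimated_cost > self.spend_cap:
            return False             # the request would exceed the cap, so reject it
        self.spent += estimated_cost
        return True


key = ScopedKey(key_id="example-key", spend_cap=Decimal("5.00"))
print(key.authorise(Decimal("4.50")))   # True: 4.50 fits under the 5.00 cap
print(key.authorise(Decimal("0.75")))   # False: 4.50 + 0.75 exceeds 5.00
print(key.spent)                        # 4.50
```

`Decimal` is used instead of `float` so that currency amounts add up exactly.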
Call & pay per use
Point the SDK at any listing and ship. You're billed only for the tokens you actually consume — itemised to the request.
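Per-request itemisation could look something like the sketch below. It assumes a single price per 1,000 tokens applied to prompt and completion tokens alike. That pricing rule, the `LineItem` shape, the request ids, and the rate are illustrative assumptions, since each vendor sets its own pricing.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """One invoice row per request (hypothetical shape)."""
    request_id: str
    listing_id: str
    tokens: int
    cost: Decimal


def bill_request(request_id: str, listing_id: str, prompt_tokens: int,
                 completion_tokens: int, price_per_1k: Decimal) -> LineItem:
    """Price one request from the tokens it actually consumed."""
    tokens = prompt_tokens + completion_tokens
    cost = Decimal(tokens) * price_per_1k / Decimal(1000)
    return LineItem(request_id, listing_id, tokens, cost)


# Hypothetical rate: 0.002 per 1,000 tokens.
rate = Decimal("0.002")
items = [
    bill_request("req-1", "llama-3.1-70b", 1200, 300, rate),
    bill_request("req-2", "llama-3.1-70b", 400, 100, rate),
]
total = sum((item.cost for item in items), Decimal("0"))

for item in items:
    print(item.request_id, item.tokens, item.cost)
print("total", total)
```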
You set the price. We meter the tokens and take a small fee. You keep the rest.
List for free and pay nothing until your models earn. Our platform fee is a small, transparent charge on the tokens your buyers actually consume — never a flat subscription, never a surprise.
Pay for what flows through.
No seats to buy, no platform to license. Both sides pay on real usage — measured token by token.
Pay per use
Bring a card, get a key, and pay only for the tokens and downloads you consume — at each vendor's listed rate.
- Prepaid credits or monthly invoicing
- Per-listing spend caps against runaway costs
- Itemised, per-request invoices
- Multi-seat team accounts & shared billing
Token-based platform fee
Publish at no cost. We take a small percentage of what your buyers spend on usage, then pay out the rest automatically.
- No fee until you earn — usage-based only
- Set any pricing model per listing
- Automated Stripe payouts & invoicing
- Lower rates at volume via vendor plans
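As a rough illustration of how a usage-based fee splits revenue, consider the sketch below. The 10% rate is a made-up figure for the example, not the platform's actual fee, and `split_revenue` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_revenue(gross: Decimal, fee_rate: Decimal) -> tuple[Decimal, Decimal]:
    """Return (platform_fee, vendor_payout) for a period's gross usage revenue."""
    fee = (gross * fee_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return fee, gross - fee


# Hypothetical figures: 200.00 of buyer usage at an assumed 10% fee.
fee, payout = split_revenue(Decimal("200.00"), Decimal("0.10"))
print(fee, payout)   # 20.00 180.00

# With no usage there is no fee, matching "no fee until you earn".
print(split_revenue(Decimal("0.00"), Decimal("0.10")))
```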
Dataset downloads billed per file or per GB · Training compute billed per GPU-hour (T4 · A100 · H100)
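The two non-token billing units can be sketched the same way. Every rate below is invented for illustration; only the units (per GB, per GPU-hour) and the GPU names come from the line above.

```python
from decimal import Decimal

# Hypothetical hourly rates for illustration only.
GPU_HOURLY_RATE = {
    "T4": Decimal("0.50"),
    "A100": Decimal("3.00"),
    "H100": Decimal("5.00"),
}


def dataset_cost(size_gb: Decimal, price_per_gb: Decimal) -> Decimal:
    """Per-GB billing: download size times the listing's per-GB price."""
    return size_gb * price_per_gb


def training_cost(gpu_type: str, gpu_count: int, hours: Decimal) -> Decimal:
    """Per-GPU-hour billing: GPUs times wall-clock hours times the hourly rate."""
    gpu_hours = gpu_count * hours
    return gpu_hours * GPU_HOURLY_RATE[gpu_type]


# A 40 GB download at an assumed 0.05 per GB.
print(dataset_cost(Decimal("40"), Decimal("0.05")))

# Four A100s for 2.5 hours is 10 GPU-hours.
print(training_cost("A100", 4, Decimal("2.5")))
```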
Build with every model.
Or sell to everyone.
Join the marketplace where AI models, datasets, and compute meet usage-based billing — live in minutes.