One interface for a market of models · open-weights ready

Ship, sell, and scale AI models on a single marketplace.

Drop into a ready-made chat interface and a growing ecosystem of prebuilt tools — search, code execution, retrieval, and more. Connect models, build workflows, and ship faster without starting from scratch.

Browse the catalogue · Start selling

$ Users — tiered subscription plans, cancel anytime

$ Developers — list free, earn per token of users' usage.

infer.sh 38 ms

# one key. every model.

curl https://api.modelmarket.shop/v1/infer \
  -H "Authorization: Bearer mk_live_•••" \
  -H "Content-Type: application/json" \
  -d '{ "model": "llama-3.1-70b", "prompt": "Explain RLHF" }'

▸ 200 OK · 142 tokens · $0.00006

"RLHF aligns a model to human preference by training a reward…"

OPEN-WEIGHTS & PROPRIETARY MODELS, READY TO CALL

Llama 3.1 · Mistral · Whisper · SDXL · Qwen2 · Embed-v3 · Phi-3

For everyone, not just developers

Powerful AI tools, ready the moment you sign in.

Open a tool, describe what you want in plain words, and it will be done. Here's what's waiting for you inside.

Available now

Chat

Your everyday assistant

Ask questions, draft emails, summarise long documents, plan a trip, or learn something new. It's like texting a knowledgeable friend who's available around the clock and never runs out of patience.

Available now

Prism

Design without a designer

Describe the image, logo, or layout you have in mind and watch it appear. Create social posts, mockups, and artwork in seconds — no complicated design software to learn.

Available now

Lumen

Builds and fixes things for you

Describe a tool, app, or website and Lumen takes it from there — planning the steps, writing the code, running it to check it works, and fixing its own mistakes until everything clicks. It works across your whole project and keeps going on its own, so you just watch the finished result come together.

On the way

More tools

The toolkit keeps growing

New tools land regularly, each built for a specific job. Everything you use shares the same conversation history and files, so you can move between them without ever repeating yourself.

No jargon, no setup, no code

You'll never touch a server, an API key, or a settings file. Every tool works the way texting does — you say what you want, and it happens. If you can describe it, you can do it.

Integrate in minutes

Your models, running everywhere.

Publish once and your model runs on serverless GPU infrastructure distributed globally — cold starts in seconds, scaling to zero when idle, no cluster management required.

Drop-in SDKs for Python & TypeScript, with streaming support.

Itemised usage: every token users spend on your model is revenue for you.

Read the API reference

from modelmarket import Client

mm = Client(api_key="mk_live_•••")

# call any listing by id
resp = mm.infer(
    "llama-3.1-70b",
    prompt="Summarise this contract…",
    stream=True,
)

for chunk in resp:
    print(chunk.text, end="")

import { ModelMarket } from "@modelmarket/sdk"

const mm = new ModelMarket({
  apiKey: process.env.MM_KEY,
})

const res = await mm.infer("llama-3.1-70b", {
  prompt: "Summarise this contract…",
})

console.log(res.text, res.usage)

curl https://api.modelmarket.dev/v1/infer \
  -H "Authorization: Bearer mk_live_•••" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-70b",
    "prompt": "Summarise this contract…"
  }'

# → streamed tokens, billed on completion

gateway · rust/axum ▸ p50 latency 41ms

Live catalogue

Thousands of models. Already running.

Browse open-weights releases the moment they drop, alongside specialised models and curated datasets from independent developers — every one callable through the same key.

Lm Open

Llama 3.1 70B

Meta · open weights

Flagship open instruction model. Strong reasoning, 128K context, fully self-hostable.

text-gen · 128K ctx · chat

$0.40 / 1M tokens ★ 4.9

Mi Developer

Mistral Large

Mistral AI

Top-tier proprietary model for complex reasoning, coding, and multilingual tasks.

text-gen · code · multilingual

$2.00 / 1M tokens ★ 4.8

Wh Open

Whisper Large v3

OpenAI · open weights

Robust speech-to-text across 99 languages with timestamping and diarisation.

speech→text · 99 langs

$0.006 / min ★ 4.9

SD Open

Stable Diffusion XL

Stability · open weights

High-resolution text-to-image generation with fine-grained style control.

text→image · 1024px

$0.002 / image ★ 4.7

Fs Developer

FinSentiment v2

Northgate Labs

Specialised financial-news classifier tuned on 12M annotated market headlines.

classify · finance

$0.10 / 1K reqs ★ 4.6

Em Open

Embed v3 Multilingual

Model Market · open

1024-dim embeddings for retrieval and semantic search across 100+ languages.

embeddings · RAG

$0.013 / 1M tokens ★ 4.8

Da Developer

Common Crawl Curated

Dataset · 2.4 TB

Deduplicated, licence-filtered web text in Parquet — provenance docs included.

parquet · pretrain

$0.15 / GB ★ 4.7

Qw Open

Qwen2 7B Instruct

Alibaba · open weights

Compact, fast open model with strong coding and tool-use performance.

text-gen · tools

$0.08 / 1M tokens ★ 4.6

Mx Developer

MedImage-Seg

Helix Bio

Radiology segmentation model with audited provenance and licence agreement.

segmentation · medical

$0.05 / image ★ 4.5

Explore all 4,200+ listings
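The listings above are priced in different units (per 1M tokens, per minute, per image, per 1K requests, per GB). As a rough sketch of how a caller might estimate spend before making a request, the snippet below normalises those prices into one lookup. The prices are copied from the cards above; apart from `llama-3.1-70b`, the listing ids, the `Price` class, and `estimate_cost` are hypothetical names used for illustration and are not part of any documented SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    amount_usd: float  # price charged for `per` units
    per: float         # number of units that `amount_usd` covers
    unit: str          # human-readable billing unit


# Prices taken from the catalogue cards; ids other than "llama-3.1-70b"
# are made up for this sketch.
CATALOGUE = {
    "llama-3.1-70b": Price(0.40, 1_000_000, "token"),
    "mistral-large": Price(2.00, 1_000_000, "token"),
    "whisper-large-v3": Price(0.006, 1, "minute"),
    "sdxl": Price(0.002, 1, "image"),
    "finsentiment-v2": Price(0.10, 1_000, "request"),
    "embed-v3-multilingual": Price(0.013, 1_000_000, "token"),
    "common-crawl-curated": Price(0.15, 1, "GB"),
    "qwen2-7b-instruct": Price(0.08, 1_000_000, "token"),
    "medimage-seg": Price(0.05, 1, "image"),
}


def estimate_cost(listing_id: str, quantity: float) -> float:
    """Return the estimated USD cost of `quantity` billing units."""
    price = CATALOGUE[listing_id]
    return quantity * price.amount_usd / price.per


# A 142-token Llama 3.1 70B call, as in the example at the top of the page.
print(f"{estimate_cost('llama-3.1-70b', 142):.5f}")  # → 0.00006
```

Keeping the unit next to the price makes it harder to mix up, say, a per-minute speech model with a per-token text model when summing a bill.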

How it works

Built for users and developers.

Whether you're here to chat with a wider variety of models or publish models for others, you're up and running in three steps.

Create your account

Sign up in seconds — no credit card needed to start. Your account unlocks the full platform: the chat interface, all available tools, and your personal workspace in one place.

Pick a tool

Use the chat interface for everyday AI assistance, Prism for AI-powered visual design, or Lumen — our agentic coding assistant that writes, runs, and iterates code for you. More tools on the way.

Prism · Lumen · Scout (planned) · Quill (planned)

Get things done

AI handles the heavy lifting while you stay in control. Every tool shares your conversation history and files — switch between them without losing your flow.

For developers

You earn every token. You pay only for the GPU.

Publish for free and keep 100% of what users spend. Your only cost is the GPU compute your models consume — billed transparently at 2× RunPod serverless rates, nothing more.

Zero revenue cut

You keep 100% of the tokens users spend on your models. Model Market takes nothing.

Pay only for compute

Billed for GPU seconds at 2× RunPod serverless rates, only when your model actually runs.

Real-time analytics

Revenue per listing, latency, error rates, and user cohorts, all live.

TOKENS SERVED / MONTH 500M

YOUR PRICE PER 1M TOKENS

$0.40 · adjustable per listing

Token revenue $200

Platform fee $0

Est. GPU compute cost −$1.51

Your monthly payout $198.49

GPU cost estimate assumes ~2s A100 inference per 1M tokens. Actual cost varies by model and hardware tier.
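The arithmetic behind this estimate can be sketched as follows. It uses the page's own inputs (500M tokens per month, $0.40 per 1M tokens, about 2 s of A100 time per 1M tokens, and the A100 rate of $5.44/hr from the pricing section). The function and parameter names are illustrative assumptions, not a documented API.

```python
A100_USD_PER_HOUR = 5.44  # A100 · 80 GB rate listed under "GPU compute billing"


def monthly_payout(
    tokens_served: float,
    price_per_million_usd: float,
    gpu_seconds_per_million_tokens: float = 2.0,  # the page's stated assumption
    gpu_usd_per_hour: float = A100_USD_PER_HOUR,
    platform_fee_usd: float = 0.0,                # the page states a 0% cut
) -> dict:
    """Estimate token revenue, GPU cost, and payout for one month."""
    millions = tokens_served / 1_000_000
    revenue = millions * price_per_million_usd
    gpu_seconds = millions * gpu_seconds_per_million_tokens
    gpu_cost = gpu_seconds * gpu_usd_per_hour / 3600  # hourly rate, billed per second
    return {
        "revenue": revenue,
        "gpu_cost": gpu_cost,
        "payout": revenue - platform_fee_usd - gpu_cost,
    }


result = monthly_payout(tokens_served=500_000_000, price_per_million_usd=0.40)
print({k: round(v, 2) for k, v in result.items()})
# → {'revenue': 200.0, 'gpu_cost': 1.51, 'payout': 198.49}
```

Because GPU time per token differs widely between models and hardware tiers, `gpu_seconds_per_million_tokens` is the input most worth replacing with a measured value.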

Publish from Hugging Face

Bring your Hugging Face model. Start earning.

Connect a repo and we host it on serverless GPUs — nothing to upload, no servers to run. List it on the marketplace and keep the revenue from every token your model serves.

Keep your token revenue

List on the marketplace and earn from every call. The platform takes no cut of what users spend on your model.

No infrastructure to run

We pull the weights and host them on serverless GPUs. You pay only for the compute your model actually uses.

Live in minutes

Paste a repo path, connect your Hugging Face token once, and publish. No weight uploads, no servers.

Publish a model

Endpoint · Hugging Face Live

Model path

zai-org/GLM-5.2

Token connected

hf_aBcd…

16 GB

24 GB

48 GB

80 GB

Deploy endpoint
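When choosing among the memory tiers shown above (16, 24, 48, or 80 GB), a common rule of thumb is that model weights need roughly parameter count × bytes per parameter, plus some headroom for the KV cache and activations. The sketch below applies that heuristic. The 20% headroom factor and the `pick_tier` helper are assumptions made for illustration; the platform's actual sizing logic is not described on this page.

```python
from typing import Optional

TIERS_GB = (16, 24, 48, 80)  # tiers shown in the publish widget


def pick_tier(
    params_billions: float,
    bytes_per_param: float = 2.0,  # fp16/bf16 weights; 1.0 for 8-bit, 0.5 for 4-bit
    headroom: float = 1.2,         # assumed 20% extra for KV cache and activations
) -> Optional[int]:
    """Return the smallest tier (in GB) that fits the estimate, or None."""
    # 1e9 parameters × bytes per parameter ≈ that many GB of weights.
    needed_gb = params_billions * bytes_per_param * headroom
    for tier in TIERS_GB:
        if needed_gb <= tier:
            return tier
    return None  # larger than any tier listed in the widget


print(pick_tier(7))                       # → 24
print(pick_tier(7, bytes_per_param=1.0))  # → 16
print(pick_tier(70))                      # → None
```

Long context windows and large batch sizes can need considerably more headroom than this estimate allows, so treat the result as a starting point.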

Pricing

Simple on both sides.

Users subscribe for a monthly token budget. Developers earn every token spent — and pay only for the GPU time they consume.

For users

Monthly subscription

$? / month

One flat subscription unlocks your monthly token budget — use it across any model on the marketplace. Need more? Top up instantly.

Monthly token allowance included with subscription
Top up any time if you need more tokens
Access every model on the marketplace
Tokens roll over within the billing period

Get started

For developers · 0% cut

Keep everything you earn

100% of user tokens to you

Model Market takes no share of your revenue. You pay only for the GPU seconds your models consume — billed at 2× the RunPod serverless rate for your chosen hardware.

GPU compute billing

B200 · 180 GB · $17.28/hr
H200 · 141 GB · $11.16/hr
RTX 6000 Pro · 96 GB · $8.00/hr
H100 · 80 GB · $8.36/hr
A100 · 80 GB · $5.44/hr
L40 / L40S / 6000 Ada · 48 GB · $3.80/hr
A6000 / A40 · 48 GB · $2.44/hr
RTX 5090 · 32 GB · $3.16/hr
RTX 4090 · 24 GB · $2.20/hr
L4 / A5000 / 3090 · 24 GB · $1.38/hr
A4000 / A4500 / RTX 4000 / RTX 2000 · 16 GB · $1.16/hr
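Since rates are quoted per hour but billing is per GPU second, a small sketch of the conversion may help. The hourly figures are copied from the list above. Where a row groups several GPUs, the first name is used as the key. The base rate is simply half the listed rate, as implied by the page's "2×" statement, and is not taken from RunPod's own price list. The function names are illustrative.

```python
# Hourly rates (USD) from the "GPU compute billing" list.
HOURLY_USD = {
    "B200": 17.28,
    "H200": 11.16,
    "RTX 6000 Pro": 8.00,
    "H100": 8.36,
    "A100": 5.44,
    "L40": 3.80,     # also L40S / 6000 Ada
    "A6000": 2.44,   # also A40
    "RTX 5090": 3.16,
    "RTX 4090": 2.20,
    "L4": 1.38,      # also A5000 / 3090
    "A4000": 1.16,   # also A4500 / RTX 4000 / RTX 2000
}


def gpu_cost(gpu: str, busy_seconds: float) -> float:
    """USD cost for the seconds a model actually ran on `gpu`."""
    return HOURLY_USD[gpu] * busy_seconds / 3600


def implied_base_rate(gpu: str) -> float:
    """Hourly rate implied by the page's '2x RunPod serverless' statement."""
    return HOURLY_USD[gpu] / 2


print(round(gpu_cost("A100", 90), 3))         # 90 busy seconds → 0.136
print(round(implied_base_rate("A100"), 2))    # → 2.72
```

Idle time does not enter the calculation: with scale-to-zero, only `busy_seconds` is billed.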

Earn 100% of the tokens users spend on your models
No platform cut — Model Market takes nothing
Billed only for actual GPU seconds used
Transparent rates at 2× RunPod serverless pricing

Start publishing

Get started free

Build with every model.
Or sell to everyone.

Join the marketplace where AI models, datasets, and compute meet usage-based billing — live in minutes.

Get an API key · Publish a model
