Gemma 4 Guides
GLM 5.2 Pricing: API Cost, Subscription Plans & Free Tier (2026)

GLM 5.2 Pricing: API Cost, Subscription Plans & Free Tier (2026)
GLM 5.2, released on June 16, 2026 by Z.ai (formerly Zhipu AI), is a 744-billion-parameter Mixture-of-Experts model built for long-horizon coding and agentic tasks. It offers a 1-million-token context window and is available under an MIT license. With pricing at roughly one-sixth the cost of GPT-5.5, it has quickly become one of the most cost-competitive frontier-class models on the market.
This guide covers every GLM 5.2 pricing option as of June 22, 2026 — API pay-per-token rates, GLM Coding Plan subscription tiers, OpenRouter pricing, and free access paths.
Prices change frequently. Always verify current rates at z.ai/subscribe and bigmodel.cn/pricing.
Quick Answer: GLM 5.2 Pricing at a Glance
| Access Type | Price |
|---|---|
| API — Input | $1.40 per 1M tokens |
| API — Cached Input | $0.26 per 1M tokens |
| API — Output | $4.40 per 1M tokens |
| OpenRouter — Input | $1.00 per 1M tokens |
| OpenRouter — Output | $4.00 per 1M tokens |
| GLM Coding Lite | ~$10–$18/month |
| GLM Coding Pro | ~$30–$50/month |
| GLM Coding Max | ~$80–$112/month |
| Self-hosted (MIT weights) | Free (hardware costs only) |
| New user free tokens | 20M tokens (bigmodel.cn) |
GLM 5.2 Free Tier
There are several ways to use GLM 5.2 for free:
1. New User Token Bonus (bigmodel.cn)
New users who register on bigmodel.cn receive a free resource package of 20 million tokens to get started, along with 120 image and video generation credits. This is the easiest free entry point for developers in China.
2. Z.ai Free Token Seeding
Z.ai has been running a promotional program offering developers a large free token allowance (community reports estimate up to 300M tokens) when using the Z.ai coding CLI. Availability and quota size vary, so confirm current terms at z.ai.
3. Self-Hosting (MIT License)
Because GLM 5.2 is released under the MIT license, you can download the full weights from Hugging Face (zai-org/GLM-5.2) and run the model completely free. The catch: GLM 5.2 is a 744B-parameter MoE model, requiring over 1TB of GPU VRAM in BF16 format. This is only practical for organizations with significant infrastructure.
4. ZCode / Bigmodel Free Daily Credits
Zhipu's ZCode 3.0 product provides 3 million free GLM 5.2 tokens per day for eligible users. Check open.bigmodel.cn/glm-coding for current eligibility.
GLM 5.2 API Pricing
The GLM 5.2 standalone API went live on June 16, 2026 via Z.ai's developer platform.
Token Rates (as of June 2026)
| Token Type | Price per 1M Tokens |
|---|---|
| Input tokens | $1.40 |
| Cached input tokens | $0.26 |
| Output tokens | $4.40 |
The cached input rate is particularly valuable for agentic workflows that repeatedly reference the same large codebase context. At $0.26/MTok, caching cuts input costs by over 80% compared to uncached requests.
Real-World Cost Examples
Agent workflow: 10,000 turns/day (2,000 input + 500 output tokens each)
| Model | Daily API Cost |
|---|---|
| GLM 5.2 | ~$23/day |
| GPT-5.5 | ~$95/day |
| Claude Opus 4.8 | ~$375/day |
Monthly batch processing: 10M tokens (50/50 input/output split)
| Model | Monthly Cost |
|---|---|
| GLM 5.2 | ~$29/month |
| GPT-5.5 | ~$175/month |
| Claude Opus 4.8 | ~$150/month |
The output token gap is especially significant for coding tasks: GLM 5.2's output rate ($4.40/MTok) is over 5x cheaper than Claude Opus 4.8 ($25/MTok) and nearly 7x cheaper than GPT-5.5 ($30/MTok).
GLM 5.2 Subscription Plans (GLM Coding Plan)
In addition to pay-per-token API access, Z.ai offers the GLM Coding Plan — a flat monthly subscription designed for use inside supported coding tools (Claude Code, VS Code, Cursor, and others). GLM 5.2 is available across all tiers.
GLM Coding Plan Tiers
| Plan | Monthly Price (approx.) | Usage Allowance | Best For |
|---|---|---|---|
| Lite | ~$10–$18/month | ~400 prompts/week | Lightweight iteration on small repos |
| Pro | ~$30–$50/month | ~2,000 prompts/week (5x Lite) | Day-to-day development on mid-sized repos |
| Max | ~$80–$112/month | ~8,000 prompts/week (20x Lite) | Heavy workloads, dedicated peak resources |
| Team | Seat-based pricing | Custom | Organizations needing team billing |
Note: Annual billing typically offers a ~10–15% discount. Check z.ai/subscribe for exact current pricing, as promotional rates and regional variations apply.
What You Get with the Coding Plan
- Access to GLM 5.2 and GLM-5-Turbo (for lightweight tasks)
- Integration with Claude Code, VS Code, Cursor, Windsurf, and other IDEs
- Prompt-based quota rather than per-token billing — more predictable for teams
- Priority access during peak hours on Max and Team tiers
GLM 5.2 on OpenRouter
GLM 5.2 is available on OpenRouter, which offers slightly different pricing:
| Token Type | OpenRouter Price per 1M Tokens |
|---|---|
| Input | $1.00 |
| Cached Input | $0.26 |
| Output | $4.00 |
OpenRouter pricing is marginally lower than direct Z.ai API pricing and provides the benefit of a unified API endpoint alongside other frontier models. This is useful if you're already using OpenRouter for multi-provider routing.
Other Third-Party Providers
GLM 5.2 is also available via:
- Together AI — together.ai/models/glm-52
- Requesty — at comparable rates to OpenRouter
Always compare current prices across providers before committing to a deployment architecture.
GLM 5.2 Pricing vs Claude vs GPT-5.5
Here is a full comparison of GLM 5.2 against leading frontier models as of June 2026:
| Model | Input ($/MTok) | Output ($/MTok) | Context Window |
|---|---|---|---|
| GLM 5.2 | $1.40 | $4.40 | 1M tokens |
| GPT-5.5 | $5.00 | $30.00 | 128K tokens |
| Claude Opus 4.8 | $5.00 | $25.00 | 200K tokens |
| Gemini 3.1 Pro | ~$3.50 | ~$10.50 | 2M tokens |
Key takeaways:
- GLM 5.2 input tokens are 3.6x cheaper than Claude Opus 4.8 and GPT-5.5.
- GLM 5.2 output tokens are 5.7x cheaper than Claude Opus 4.8 and 6.8x cheaper than GPT-5.5.
- GLM 5.2's 1M context window matches or exceeds the context size of competing models at a fraction of the price.
- On coding benchmarks (SWE-bench Verified, CodeForces), GLM 5.2 scores ahead of GPT-5.5 on several long-horizon tasks despite the price difference.
For teams running large-scale agentic coding workflows, the savings can be dramatic. A workload costing $1,000/day on Claude Opus 4.8 output tokens would cost around $176/day on GLM 5.2.
How to Get a GLM 5.2 API Key
Via Z.ai (Global)
- Go to docs.z.ai and create an account.
- Navigate to the GLM Coding Plan and select a subscription tier (or choose pay-per-token API access).
- After subscribing, go to Individual Coding Plan > Plan Overview to generate your API key.
- Team Plan users: find your API key under Team Coding Plan > My Plan.
- Set your base URL to
https://api.z.ai/api/coding/paas/v4.
Via bigmodel.cn (China / Chinese Users)
- Register at open.bigmodel.cn.
- New accounts receive 20M free tokens automatically.
- Generate an API key from the console dashboard.
- Use the standard OpenAI-compatible endpoint format.
Quick Setup for Claude Code
# Automated helper
npx @z_ai/coding-helper
# Or manual: edit ~/.claude/settings.json
# Set ANTHROPIC_BASE_URL to https://api.z.ai/api/anthropic
# Set ANTHROPIC_AUTH_TOKEN to your Z.ai API key
Is GLM 5.2 Worth the Price?
For coding and agentic tasks: yes, strongly. GLM 5.2 was purpose-built for long-horizon software engineering. At 1/6th the blended cost of GPT-5.5 while matching or beating it on several coding benchmarks, the value-per-dollar is exceptional for development teams.
When GLM 5.2 makes sense:
- High-volume coding agents (SWE-bench-style workflows)
- Projects requiring 1M-token context (whole-codebase analysis)
- Teams on tight AI infrastructure budgets
- Organizations that want open weights for compliance/self-hosting reasons
When you might prefer Claude or GPT-5.5:
- Tasks heavily weighted toward non-coding reasoning, creative writing, or general knowledge
- Teams already deeply integrated with the Claude or OpenAI ecosystem
- Use cases requiring the absolute state-of-the-art benchmark scores regardless of cost
Bottom line: For coding-focused AI applications, GLM 5.2 delivers frontier-level performance at a mid-tier price point. The MIT license adds optionality for self-hosting, making it a serious enterprise option.
Frequently Asked Questions
How much does GLM 5.2 cost?
GLM 5.2 API pricing (as of June 2026): $1.40/MTok input, $4.40/MTok output, $0.26/MTok cached input via the direct Z.ai API. On OpenRouter, it's $1.00/MTok input and $4.00/MTok output. Subscription plans (GLM Coding Plan) start at approximately $10–$18/month for the Lite tier.
Is there a free GLM 5.2 plan?
Yes. New users on bigmodel.cn receive 20 million free tokens. Z.ai's coding CLI also offers a generous free token seeding program (up to ~300M tokens reported). The full model weights are freely available on Hugging Face under an MIT license for self-hosting.
How much does the GLM 5.2 API cost per token?
Via Z.ai direct API: $0.0000014 per input token and $0.0000044 per output token. Via OpenRouter: $0.000001 per input token and $0.000004 per output token. Cached input tokens cost $0.00000026 each.
Is GLM 5.2 cheaper than Claude?
Yes, significantly. Compared to Claude Opus 4.8 ($5/MTok input, $25/MTok output), GLM 5.2 is 3.6x cheaper on input and 5.7x cheaper on output. At scale, this represents enormous cost savings for high-throughput agentic workloads.
Where can I get a GLM 5.2 API key?
For global access: sign up at z.ai or follow the quick start at docs.z.ai. For users in China: register at open.bigmodel.cn. GLM 5.2 is also accessible via third-party providers like OpenRouter and Together AI.
Does GLM 5.2 have a free API?
Yes — new accounts on bigmodel.cn come with a 20M token free quota. Z.ai also provides a promotional free allowance for the coding CLI. Once free credits are exhausted, usage is billed at standard token rates. The model itself (weights) is free to download from Hugging Face.
Related Guides
- Is GLM 5.2 Free? Every Free Access Option Explained
- GLM 5.2 Review: Benchmarks, Strengths & Weaknesses
- GLM 5.2 Hardware Requirements for Self-Hosting
Last updated: June 22, 2026. Pricing information sourced from Z.ai official documentation, OpenRouter, and third-party benchmark reports. Always verify current rates at z.ai/subscribe and bigmodel.cn/pricing before making purchasing decisions.
Related guides
Continue through the Gemma 4 cluster with the next guide that matches your current decision.

GLM 5.2 Review: Benchmarks, Coding Performance & Is It Worth Using?
GLM 5.2 launched on June 13, 2026 as Zhipu AI's open-weight flagship — 744B MoE parameters, a 1-million-token context window, MIT license, and benchmark scores that rival closed-source frontier models at roughly one-sixth the API cost. Here is everything you need to know.

Is GLM 5.2 Free? Every Free Way to Use It in 2026
GLM 5.2 is free to download and self-host under the MIT license, and free to try via Cloudflare Workers AI and the z.ai web chat. This guide covers every free option and where paid plans kick in.

How to Run GLM-5.2 in Ollama: Cloud Tag, Local Setup & API Guide
GLM-5.2 is available in Ollama via the glm-5.2:cloud tag — one command gets you a 976K-context coding model without managing a 744B-parameter download yourself.
Still deciding what to read next?
Go back to the guide hub to browse model comparisons, setup walkthroughs, and hardware planning pages.
