Gemma 4 Guides

GLM 5.2 Pricing: API Cost, Subscription Plans & Free Tier (2026)

6 min read
glm 5.2glm 5.2 pricingzhipu aillm pricingai api cost
GLM 5.2 Pricing: API Cost, Subscription Plans & Free Tier (2026)

GLM 5.2 Pricing: API Cost, Subscription Plans & Free Tier (2026)

GLM 5.2, released on June 16, 2026 by Z.ai (formerly Zhipu AI), is a 744-billion-parameter Mixture-of-Experts model built for long-horizon coding and agentic tasks. It offers a 1-million-token context window and is available under an MIT license. With pricing at roughly one-sixth the cost of GPT-5.5, it has quickly become one of the most cost-competitive frontier-class models on the market.

This guide covers every GLM 5.2 pricing option as of June 22, 2026 — API pay-per-token rates, GLM Coding Plan subscription tiers, OpenRouter pricing, and free access paths.

Prices change frequently. Always verify current rates at z.ai/subscribe and bigmodel.cn/pricing.


Quick Answer: GLM 5.2 Pricing at a Glance

Access Type Price
API — Input $1.40 per 1M tokens
API — Cached Input $0.26 per 1M tokens
API — Output $4.40 per 1M tokens
OpenRouter — Input $1.00 per 1M tokens
OpenRouter — Output $4.00 per 1M tokens
GLM Coding Lite ~$10–$18/month
GLM Coding Pro ~$30–$50/month
GLM Coding Max ~$80–$112/month
Self-hosted (MIT weights) Free (hardware costs only)
New user free tokens 20M tokens (bigmodel.cn)

GLM 5.2 Free Tier

There are several ways to use GLM 5.2 for free:

1. New User Token Bonus (bigmodel.cn)

New users who register on bigmodel.cn receive a free resource package of 20 million tokens to get started, along with 120 image and video generation credits. This is the easiest free entry point for developers in China.

2. Z.ai Free Token Seeding

Z.ai has been running a promotional program offering developers a large free token allowance (community reports estimate up to 300M tokens) when using the Z.ai coding CLI. Availability and quota size vary, so confirm current terms at z.ai.

3. Self-Hosting (MIT License)

Because GLM 5.2 is released under the MIT license, you can download the full weights from Hugging Face (zai-org/GLM-5.2) and run the model completely free. The catch: GLM 5.2 is a 744B-parameter MoE model, requiring over 1TB of GPU VRAM in BF16 format. This is only practical for organizations with significant infrastructure.

4. ZCode / Bigmodel Free Daily Credits

Zhipu's ZCode 3.0 product provides 3 million free GLM 5.2 tokens per day for eligible users. Check open.bigmodel.cn/glm-coding for current eligibility.


GLM 5.2 API Pricing

The GLM 5.2 standalone API went live on June 16, 2026 via Z.ai's developer platform.

Token Rates (as of June 2026)

Token Type Price per 1M Tokens
Input tokens $1.40
Cached input tokens $0.26
Output tokens $4.40

The cached input rate is particularly valuable for agentic workflows that repeatedly reference the same large codebase context. At $0.26/MTok, caching cuts input costs by over 80% compared to uncached requests.

Real-World Cost Examples

Agent workflow: 10,000 turns/day (2,000 input + 500 output tokens each)

Model Daily API Cost
GLM 5.2 ~$23/day
GPT-5.5 ~$95/day
Claude Opus 4.8 ~$375/day

Monthly batch processing: 10M tokens (50/50 input/output split)

Model Monthly Cost
GLM 5.2 ~$29/month
GPT-5.5 ~$175/month
Claude Opus 4.8 ~$150/month

The output token gap is especially significant for coding tasks: GLM 5.2's output rate ($4.40/MTok) is over 5x cheaper than Claude Opus 4.8 ($25/MTok) and nearly 7x cheaper than GPT-5.5 ($30/MTok).


GLM 5.2 Subscription Plans (GLM Coding Plan)

In addition to pay-per-token API access, Z.ai offers the GLM Coding Plan — a flat monthly subscription designed for use inside supported coding tools (Claude Code, VS Code, Cursor, and others). GLM 5.2 is available across all tiers.

GLM Coding Plan Tiers

Plan Monthly Price (approx.) Usage Allowance Best For
Lite ~$10–$18/month ~400 prompts/week Lightweight iteration on small repos
Pro ~$30–$50/month ~2,000 prompts/week (5x Lite) Day-to-day development on mid-sized repos
Max ~$80–$112/month ~8,000 prompts/week (20x Lite) Heavy workloads, dedicated peak resources
Team Seat-based pricing Custom Organizations needing team billing

Note: Annual billing typically offers a ~10–15% discount. Check z.ai/subscribe for exact current pricing, as promotional rates and regional variations apply.

What You Get with the Coding Plan

  • Access to GLM 5.2 and GLM-5-Turbo (for lightweight tasks)
  • Integration with Claude Code, VS Code, Cursor, Windsurf, and other IDEs
  • Prompt-based quota rather than per-token billing — more predictable for teams
  • Priority access during peak hours on Max and Team tiers

GLM 5.2 on OpenRouter

GLM 5.2 is available on OpenRouter, which offers slightly different pricing:

Token Type OpenRouter Price per 1M Tokens
Input $1.00
Cached Input $0.26
Output $4.00

OpenRouter pricing is marginally lower than direct Z.ai API pricing and provides the benefit of a unified API endpoint alongside other frontier models. This is useful if you're already using OpenRouter for multi-provider routing.

Other Third-Party Providers

GLM 5.2 is also available via:

Always compare current prices across providers before committing to a deployment architecture.


GLM 5.2 Pricing vs Claude vs GPT-5.5

Here is a full comparison of GLM 5.2 against leading frontier models as of June 2026:

Model Input ($/MTok) Output ($/MTok) Context Window
GLM 5.2 $1.40 $4.40 1M tokens
GPT-5.5 $5.00 $30.00 128K tokens
Claude Opus 4.8 $5.00 $25.00 200K tokens
Gemini 3.1 Pro ~$3.50 ~$10.50 2M tokens

Key takeaways:

  • GLM 5.2 input tokens are 3.6x cheaper than Claude Opus 4.8 and GPT-5.5.
  • GLM 5.2 output tokens are 5.7x cheaper than Claude Opus 4.8 and 6.8x cheaper than GPT-5.5.
  • GLM 5.2's 1M context window matches or exceeds the context size of competing models at a fraction of the price.
  • On coding benchmarks (SWE-bench Verified, CodeForces), GLM 5.2 scores ahead of GPT-5.5 on several long-horizon tasks despite the price difference.

For teams running large-scale agentic coding workflows, the savings can be dramatic. A workload costing $1,000/day on Claude Opus 4.8 output tokens would cost around $176/day on GLM 5.2.


How to Get a GLM 5.2 API Key

Via Z.ai (Global)

  1. Go to docs.z.ai and create an account.
  2. Navigate to the GLM Coding Plan and select a subscription tier (or choose pay-per-token API access).
  3. After subscribing, go to Individual Coding Plan > Plan Overview to generate your API key.
  4. Team Plan users: find your API key under Team Coding Plan > My Plan.
  5. Set your base URL to https://api.z.ai/api/coding/paas/v4.

Via bigmodel.cn (China / Chinese Users)

  1. Register at open.bigmodel.cn.
  2. New accounts receive 20M free tokens automatically.
  3. Generate an API key from the console dashboard.
  4. Use the standard OpenAI-compatible endpoint format.

Quick Setup for Claude Code

# Automated helper
npx @z_ai/coding-helper

# Or manual: edit ~/.claude/settings.json
# Set ANTHROPIC_BASE_URL to https://api.z.ai/api/anthropic
# Set ANTHROPIC_AUTH_TOKEN to your Z.ai API key

Is GLM 5.2 Worth the Price?

For coding and agentic tasks: yes, strongly. GLM 5.2 was purpose-built for long-horizon software engineering. At 1/6th the blended cost of GPT-5.5 while matching or beating it on several coding benchmarks, the value-per-dollar is exceptional for development teams.

When GLM 5.2 makes sense:

  • High-volume coding agents (SWE-bench-style workflows)
  • Projects requiring 1M-token context (whole-codebase analysis)
  • Teams on tight AI infrastructure budgets
  • Organizations that want open weights for compliance/self-hosting reasons

When you might prefer Claude or GPT-5.5:

  • Tasks heavily weighted toward non-coding reasoning, creative writing, or general knowledge
  • Teams already deeply integrated with the Claude or OpenAI ecosystem
  • Use cases requiring the absolute state-of-the-art benchmark scores regardless of cost

Bottom line: For coding-focused AI applications, GLM 5.2 delivers frontier-level performance at a mid-tier price point. The MIT license adds optionality for self-hosting, making it a serious enterprise option.


Frequently Asked Questions

How much does GLM 5.2 cost?

GLM 5.2 API pricing (as of June 2026): $1.40/MTok input, $4.40/MTok output, $0.26/MTok cached input via the direct Z.ai API. On OpenRouter, it's $1.00/MTok input and $4.00/MTok output. Subscription plans (GLM Coding Plan) start at approximately $10–$18/month for the Lite tier.

Is there a free GLM 5.2 plan?

Yes. New users on bigmodel.cn receive 20 million free tokens. Z.ai's coding CLI also offers a generous free token seeding program (up to ~300M tokens reported). The full model weights are freely available on Hugging Face under an MIT license for self-hosting.

How much does the GLM 5.2 API cost per token?

Via Z.ai direct API: $0.0000014 per input token and $0.0000044 per output token. Via OpenRouter: $0.000001 per input token and $0.000004 per output token. Cached input tokens cost $0.00000026 each.

Is GLM 5.2 cheaper than Claude?

Yes, significantly. Compared to Claude Opus 4.8 ($5/MTok input, $25/MTok output), GLM 5.2 is 3.6x cheaper on input and 5.7x cheaper on output. At scale, this represents enormous cost savings for high-throughput agentic workloads.

Where can I get a GLM 5.2 API key?

For global access: sign up at z.ai or follow the quick start at docs.z.ai. For users in China: register at open.bigmodel.cn. GLM 5.2 is also accessible via third-party providers like OpenRouter and Together AI.

Does GLM 5.2 have a free API?

Yes — new accounts on bigmodel.cn come with a 20M token free quota. Z.ai also provides a promotional free allowance for the coding CLI. Once free credits are exhausted, usage is billed at standard token rates. The model itself (weights) is free to download from Hugging Face.


Related Guides


Last updated: June 22, 2026. Pricing information sourced from Z.ai official documentation, OpenRouter, and third-party benchmark reports. Always verify current rates at z.ai/subscribe and bigmodel.cn/pricing before making purchasing decisions.

Related guides

Continue through the Gemma 4 cluster with the next guide that matches your current decision.

Still deciding what to read next?

Go back to the guide hub to browse model comparisons, setup walkthroughs, and hardware planning pages.