Alibaba unveils Qwen3.7-Max, its flagship AI model for real-world tasks

Alibaba just dropped what might be its most consequential AI model to date. Qwen3.7-Max, the flagship proprietary model in the company’s latest series, is purpose-built for the kind of messy, multi-step work that most AI models still struggle with: coding complex projects, automating office workflows, and running autonomously for hundreds or even thousands of sequential steps without falling apart.

What Qwen3.7-Max actually does

The model first appeared as a preview on the LM Arena leaderboard on May 14, before Alibaba’s Qwen team formally announced it between May 19 and May 21 at the Alibaba Cloud Summit in Hangzhou. It’s now accessible through the Alibaba Cloud Model Studio API, which means developers and businesses can start building with it immediately.

In internal testing, Qwen3.7-Max executed over 1,000 tool calls autonomously in a single run. Inference speed improved by approximately 10 times compared to earlier versions. Alibaba credits iterative kernel optimizations for that leap.

Compatibility is broad. The model works with popular protocols like OpenAI’s API format and agent frameworks including Claude Code and OpenClaw.

On the benchmark front, Qwen3.7-Max posted a 92.4 on GPQA Diamond, which tests graduate-level reasoning. It scored 80.4 on SWE-Verified, a software engineering benchmark that measures a model’s ability to resolve real GitHub issues. And it hit 91.6 on LiveCodeBench, which evaluates coding performance on problems that didn’t exist in its training data.

The agent race is heating up

Alibaba has been building toward this moment across several model generations. Community observers have noted the speed of development between the Qwen3.6 series and this latest release. Most models can handle a few tool calls. Sustaining performance across hundreds or thousands of sequential actions requires a fundamentally different approach to training and optimization.

What this means for investors and the broader market

A GPQA Diamond score of 92.4 and a LiveCodeBench score of 91.6 position Qwen3.7-Max in direct competition with the best models from any lab on the planet.

Models that can handle 1,000-plus tool calls in a single session are targeting use cases like automated software development pipelines, customer service operations that span multiple systems, and financial reporting processes that currently require teams of analysts.

Alibaba has made no connection between Qwen3.7-Max and any digital asset or blockchain system. The model is positioned squarely within Alibaba’s cloud and enterprise services ecosystem.

What Qwen3.7-Max actually does

Compatibility is broad. The model works with popular protocols like OpenAI’s API format and agent frameworks including Claude Code and OpenClaw.

The agent race is heating up

What this means for investors and the broader market

A GPQA Diamond score of 92.4 and a LiveCodeBench score of 91.6 position Qwen3.7-Max in direct competition with the best models from any lab on the planet.

Alibaba has made no connection between Qwen3.7-Max and any digital asset or blockchain system. The model is positioned squarely within Alibaba’s cloud and enterprise services ecosystem.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.

Alibaba unveils Qwen3.7-Max, its flagship AI model for real-world tasks

What Qwen3.7-Max actually does

The agent race is heating up

What this means for investors and the broader market

Alibaba unveils Qwen3.7-Max, its flagship AI model for real-world tasks

What Qwen3.7-Max actually does

The agent race is heating up

What this means for investors and the broader market

Get Crypto Briefing in your inbox