Qwen (Alibaba Cloud)

POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions

Uses SSE streaming with OpenAI-compatible tool-calling format. Supports vision (base64 images) and reasoning content. Qwen offers one of the widest model ranges of any provider — from the ultra-cheap Qwen 3.5 Flash at

0.06/

0.24 per MTok to the frontier Qwen 3.7 Max. All Qwen3+ models support three reasoning modes (None, High, Max) and up to 1M context. The dedicated Qwen3 Coder Plus model is tuned for code generation tasks. Best for: Cost-efficient agentic workflows, code generation, multilingual tasks, and workloads that benefit from a wide selection of price/performance tiers.

Qwen 3.5 Flash is one of the cheapest reasoning-capable models available — at

0.06/

0.24 per MTok with 1M context, it’s significantly cheaper than DeepSeek V4 Flash while still supporting full reasoning modes. Great for high-volume tasks where cost matters.

Getting an API Key

Go to qwencloud.com
Sign up or log in
Navigate to API Keys and create a new key
Paste it into Wolffish → Settings → Models → Qwen

Models

Model	Context	Modes	Input / Output (per MTok)	Cached	Notes
qwen3.7-max	1M	Off, High, Max	$2.50 /$ 7.50	$0.25	Flagship. Frontier reasoning.
qwen3.7-plus	1M	Off, High, Max	$0.40 /$ 1.60	$0.064	Strong reasoning at mid-range price.
qwen3.6-plus	1M	Off, High, Max	$0.40 /$ 1.60	$0.04	Previous-gen plus.
qwen3.6-flash	1M	Off, High, Max	$0.25 /$ 1.50	$0.025	Fast reasoning.
qwen3.5-plus	1M	Off, High, Max	$0.40 /$ 1.60	$0.04	Balanced quality and cost.
qwen3.5-flash	1M	Off, High, Max	$0.06 /$ 0.24	$0.006	Ultra-cheap reasoning.
qwen3-max	131K	Off, High, Max	$1.60 /$ 6.40	$0.40	Strong reasoning, smaller context.
qwen3-coder-plus	131K	Off, High, Max	$0.40 /$ 1.60	$0.04	Code-optimized.
qwen3-coder-flash	131K	Off, High, Max	$0.40 /$ 1.60	$0.04	Fast code-optimized.
qwq-plus	131K	On	$0.40 /$ 1.60	$0.04	Reasoning-only (always thinks).
qvq-max	131K	On	$1.60 /$ 6.40	$0.16	Vision reasoning (always thinks).
qwen-max	131K	—	$1.60 /$ 6.40	$0.16	Legacy. No reasoning.
qwen-plus	131K	—	$0.40 /$ 1.60	$0.04	Legacy. Fast, no reasoning.
qwen-turbo	1M	—	$0.30 /$ 0.60	$0.03	Legacy. Fast, no reasoning.
qwen-flash	1M	—	$0.06 /$ 0.24	$0.006	Legacy. Ultra-cheap, no reasoning.

Reasoning modes

The brain icon next to the message box controls how this model reasons. Click it to cycle through the modes the selected model supports. Two separate ideas combine here:

Thinking — whether the model reasons

Off — the model answers immediately. Fastest and cheapest; ideal for simple, direct tasks.
On — the model first works through the problem in a dedicated reasoning pass before replying. Slower and uses more tokens, but markedly more accurate on multi-step, logical, or ambiguous tasks.

Effort — how hard it thinks

Only effort-capable models expose this; it applies once thinking is on.

High — standard reasoning depth. The right default for most agentic work.
Max — the model reasons longer and deeper for the hardest problems. More tokens and latency in exchange for higher quality on complex work.

Button states

State	Colour	Meaning
Off	gray	Thinking off — direct answer
On	blue	Thinking on — no effort control
High	purple	Thinking on, standard effort
Max	orange	Thinking on, maximum effort

Each model shows only the states it genuinely supports. If a model always reasons (can’t be turned off) or has no effort control, the button reflects that and locks where there’s nothing to change. Wolffish remembers your choice per model. On Qwen: qwen3.x models support Off / High / Max (effort via a thinking-token budget). qwq and qvq reason always-on (locked on). Legacy qwen-max/plus/turbo/flash don’t reason.

Cost Comparison

Qwen spans a wide price range, competing at every tier:

Tier	Model	Input / Output (per MTok)	Comparable To
Ultra-cheap	qwen3.5-flash	$0.06 /$ 0.24	Cheaper than DeepSeek V4 Flash
Budget	qwen3.7-plus	$0.40 /$ 1.60	MiMo V2.5 Pro range
Mid-range	qwen3.7-max	$2.50 /$ 7.50	Between Kimi K2.6 and Anthropic
Legacy	qwen-max	$1.60 /$ 6.40	MiniMax M3 range

Start with Qwen 3.5 Flash for high-volume tasks, Qwen 3.7 Plus for general agentic work, or Qwen 3.7 Max when you need frontier reasoning. The dedicated Qwen3 Coder Plus model is a good pick for code-heavy workflows at a budget price.

​Qwen (Alibaba Cloud)

​Getting an API Key

​Models

​Reasoning modes

​Thinking — whether the model reasons

​Effort — how hard it thinks

​Button states

​Cost Comparison

Qwen (Alibaba Cloud)

Getting an API Key

Models

Reasoning modes

Thinking — whether the model reasons

Effort — how hard it thinks

Button states

Cost Comparison