Stepfun (Always-On Reasoning)

POST https://api.stepfun.ai/v1/chat/completions

Uses SSE streaming with OpenAI-compatible tool-calling format. Supports vision (base64 images) and reasoning content. Stepfun’s Step-3 models reason on every request — there’s no toggle to disable thinking. Reasoning tokens count toward the completion budget, so the model balances thinking depth against output length automatically. This makes Stepfun a straightforward choice when you always want reasoning without managing mode settings. Best for: Reasoning-heavy tasks, workloads where you always want the model to think, and vision-capable workflows.

Getting an API Key

Go to platform.stepfun.ai
Sign up or log in
Navigate to API Keys and create a new key
Paste it into Wolffish → Settings → Models → Stepfun

Models

Model	Context	Max Output	Modes	Input / Output (per MTok)	Notes
step-3.7-flash	128K	32K	On (always)	$0.83 /$ 6.94	Latest. Frontier reasoning.
step-3.5-flash	128K	32K	On (always)	$0.83 /$ 6.94	Previous gen. Fast reasoning.

Step-3 models always reason — the enable_thinking parameter is accepted but ignored. Reasoning tokens count toward the completion token budget. All models support tool calling with no hard tool-count limit.

Stepfun’s context window is 128K — smaller than the 1M offered by DeepSeek, Qwen, or xAI. If your workflows regularly exceed 128K tokens of context, consider a provider with a larger window.

Reasoning modes

The brain icon next to the message box controls how this model reasons. Click it to cycle through the modes the selected model supports. Two separate ideas combine here:

Thinking — whether the model reasons

Off — the model answers immediately. Fastest and cheapest; ideal for simple, direct tasks.
On — the model first works through the problem in a dedicated reasoning pass before replying. Slower and uses more tokens, but markedly more accurate on multi-step, logical, or ambiguous tasks.

Effort — how hard it thinks

Only effort-capable models expose this; it applies once thinking is on.

High — standard reasoning depth. The right default for most agentic work.
Max — the model reasons longer and deeper for the hardest problems. More tokens and latency in exchange for higher quality on complex work.

Button states

State	Colour	Meaning
Off	gray	Thinking off — direct answer
On	blue	Thinking on — no effort control
High	purple	Thinking on, standard effort
Max	orange	Thinking on, maximum effort

Each model shows only the states it genuinely supports. If a model always reasons (can’t be turned off) or has no effort control, the button reflects that and locks where there’s nothing to change. Wolffish remembers your choice per model. On Stepfun: Step models always reason — the button stays locked on. There’s no off switch and no effort control.

xAI Z.ai

​Stepfun (Always-On Reasoning)

​Getting an API Key

​Models

​Reasoning modes

​Thinking — whether the model reasons

​Effort — how hard it thinks

​Button states

Stepfun (Always-On Reasoning)

Getting an API Key

Models

Reasoning modes

Thinking — whether the model reasons

Effort — how hard it thinks

Button states