> ## Documentation Index
> Fetch the complete documentation index at: https://docs.wolffi.sh/llms.txt
> Use this file to discover all available pages before exploring further.

# Stepfun

> Set up Stepfun — always-on reasoning models with vision support

# Stepfun (Always-On Reasoning)

```
POST https://api.stepfun.ai/v1/chat/completions
```

Uses SSE streaming with OpenAI-compatible tool-calling format. Supports vision (base64 images) and reasoning content.

**Stepfun's Step-3 models reason on every request** — there's no toggle to disable thinking. Reasoning tokens count toward the completion budget, so the model balances thinking depth against output length automatically. This makes Stepfun a straightforward choice when you always want reasoning without managing mode settings.

Best for: Reasoning-heavy tasks, workloads where you always want the model to think, and vision-capable workflows.

## Getting an API Key

1. Go to [platform.stepfun.ai](https://platform.stepfun.ai)
2. Sign up or log in
3. Navigate to **API Keys** and create a new key
4. Paste it into Wolffish → Settings → Models → Stepfun

## Models

| Model              | Context | Max Output | Modes       | Input / Output (per MTok) | Notes                         |
| ------------------ | ------- | ---------- | ----------- | ------------------------- | ----------------------------- |
| **step-3.7-flash** | 128K    | 32K        | On (always) | $0.83 / $6.94             | Latest. Frontier reasoning.   |
| step-3.5-flash     | 128K    | 32K        | On (always) | $0.83 / $6.94             | Previous gen. Fast reasoning. |

<Note>
  Step-3 models **always reason** — the `enable_thinking` parameter is accepted but ignored. Reasoning tokens count toward the completion token budget. All models support tool calling with no hard tool-count limit.
</Note>

<Note>
  Stepfun's context window is 128K — smaller than the 1M offered by DeepSeek, Qwen, or xAI. If your workflows regularly exceed 128K tokens of context, consider a provider with a larger window.
</Note>

## Reasoning modes

The **brain icon** next to the message box controls how this model reasons. Click it to cycle through the modes the selected model supports. Two separate ideas combine here:

### Thinking — *whether* the model reasons

* **Off** — the model answers immediately. Fastest and cheapest; ideal for simple, direct tasks.
* **On** — the model first works through the problem in a dedicated reasoning pass before replying. Slower and uses more tokens, but markedly more accurate on multi-step, logical, or ambiguous tasks.

### Effort — *how hard* it thinks

Only effort-capable models expose this; it applies once thinking is on.

* **High** — standard reasoning depth. The right default for most agentic work.
* **Max** — the model reasons longer and deeper for the hardest problems. More tokens and latency in exchange for higher quality on complex work.

### Button states

| State | Colour | Meaning                         |
| ----- | ------ | ------------------------------- |
| Off   | gray   | Thinking off — direct answer    |
| On    | blue   | Thinking on — no effort control |
| High  | purple | Thinking on, standard effort    |
| Max   | orange | Thinking on, maximum effort     |

Each model shows only the states it genuinely supports. If a model always reasons (can't be turned off) or has no effort control, the button reflects that and locks where there's nothing to change. Wolffish remembers your choice per model.

**On Stepfun:** Step models always reason — the button stays locked on. There's no off switch and no effort control.