If you’ve ever wondered why Claude sometimes gives you a three-paragraph novel when you asked a yes/no question, or why it burns through your context window planning something you could’ve done in two minutes, the effort parameter is what you’re looking for.
The fix is one command: type /effort high or /effort max in Claude Code and it changes how hard Claude thinks, how many tools it calls, and how long responses take. I wrote about the trigger words a while back (think, think hard, ultrathink). Those are informal prompt tricks that still work, but typing /effort followed by a level is the official control.
Table of contents
Open Table of contents
What does effort actually control?
It’s not just thinking depth. The effort level changes three things at once:
Thinking depth. How much internal reasoning Claude does before responding. At low, it often skips thinking entirely. At max, it can reason for thousands of tokens before writing a single word of output.
Tool call appetite. At higher effort, Claude is more willing to read additional files, run extra commands, and explore before acting. At low, it does the minimum. I learned this the hard way when I set low for a refactoring session and Claude just… renamed the function without checking who called it. Three broken imports later, lesson learned.
Response length. Higher effort produces longer, more detailed responses. Lower effort keeps things tight.
This is why effort feels like it changes Claude’s personality. low Claude is your coworker who gives one-word Slack replies. max Claude is the one who writes a design doc for a button color change.
The five levels
low skips thinking for simple requests. Minimal tool calls. Short responses. Great for renaming a variable, fixing a typo, “what does this function return?”, generating boilerplate you’ll edit anyway. Don’t use it for anything where you’d be annoyed if it missed an edge case.
medium thinks when the problem warrants it, skips when it doesn’t. I’ll be real, I almost never explicitly set this one. It’s the no-man’s-land between “I don’t need you to think” and “please actually think.” Works fine for writing tests against a known function or adding a feature where the pattern is already established. (This is the default for Opus on Max/Team plans.)
high almost always engages thinking. Will read related files without being asked. Gives thorough responses. You can leave it here 90% of the time. I do.
max is full send. Maximum thinking depth with no token constraint. Claude explores broadly, considers edge cases you didn’t mention, and sometimes finds problems you didn’t know existed. But it takes longer and uses more tokens. Opus only, and it only lasts for the current session. Save it for debugging something that’s stumped you, architecture decisions with real tradeoffs, security reviews.
auto resets to whatever the default is for your plan and model. Use it when you’ve been tweaking effort and just want to go back to normal.
One important detail: low, medium, and high persist across sessions. Set it once and it sticks until you change it. max resets when your session ends.
How to set it in Claude Code
The /effort command is the official way. Type it followed by the level:
/effort low
/effort medium
/effort high
/effort max
That’s it. It sticks for the rest of your session (and low/medium/high persist across sessions too).
The --effort flag sets it when you launch Claude Code:
claude --effort low
I use this when I’m doing a cleanup pass (rename things, fix formatting, quick edits) where I know I don’t need deep reasoning.
Trigger words like “think harder” or “ultrathink” in your prompt also nudge the effort level. These are informal prompt tricks, not official settings, but they work in practice. I covered them in the thinking triggers post.
What about the API?
If you’re building something with Claude directly:
response = client.messages.create(
model="claude-opus-4-6",
max_tokens=16000,
thinking={"type": "adaptive"},
output_config={"effort": "high"},
messages=[{"role": "user", "content": "..."}],
)
The effort parameter lives inside output_config. Combine it with thinking: {"type": "adaptive"} and Claude handles the rest. It’ll think deeply at max, skip thinking at low, and make smart calls in between.
If you’re still using manual budget_tokens, that still works but it’s deprecated as of early 2026. Switch to adaptive + effort before Anthropic pulls the plug.
The part nobody talks about: cost
Higher effort means more input tokens (longer thinking burns tokens on reasoning you never see), more output tokens, more tool calls, and more time. All of that adds up.
On a quick task, the difference between low and max can be 10x in time and tokens. I ran a batch rename across 12 files last week at max because I forgot to change it back. Took four minutes and cost me probably 50x what low would have. On the flip side, max once caught a race condition in my auth flow that I’d been debugging for an hour. Paid for itself in the first five minutes.
My rough rule: start at high (the default). Drop to low for batch grunt work. Reach for max when you’re stuck or when the stakes are high.
How effort and adaptive thinking work together
Opus 4.6 introduced adaptive thinking, which means Claude decides for itself how much to reason based on the problem. The effort parameter sets the ceiling and the bias.
At high, Claude will think on most things but might keep it brief for simple questions. At max, Claude thinks extensively on everything, even stuff that probably doesn’t need it. At low, Claude actively avoids thinking unless the problem is clearly complex.
Think of effort as telling Claude how careful to be. Adaptive thinking is Claude being smart about it within those bounds. Most of the time you don’t need to touch it. But when you’re burning tokens on simple tasks or getting shallow answers on hard problems, now you know which knob to turn.
Want more Claude Code productivity tips? I also wrote about the thinking trigger words that map to these effort levels, and my full workflow for getting the most out of Claude Code.