Skip to content
Go back

Claude Code's effort parameter: when to go full send and when to chill

Published:  at  06:00 AM

I wrote about the trigger words a while back. Think, think hard, ultrathink, all that. Those still work. But there’s a more precise control underneath that most people skip right over.

The effort parameter is the actual knob. The trigger words are shortcuts to it. If you’ve ever wondered why Claude sometimes gives you a three-paragraph novel when you asked a yes/no question, or why it burns through your context window planning something you could’ve done in two minutes, effort is what you’re looking for.

Table of contents

Open Table of contents

What does effort actually control?

It’s not just thinking depth. The effort level changes three things at once:

Thinking depth. How much internal reasoning Claude does before responding. At low, it often skips thinking entirely. At max, it can reason for thousands of tokens before writing a single word of output.

Tool call appetite. At higher effort, Claude is more willing to read additional files, run extra commands, and explore before acting. At low, it does the minimum. I learned this the hard way when I set low for a refactoring session and Claude just… renamed the function without checking who called it. Three broken imports later, lesson learned.

Response length. Higher effort produces longer, more detailed responses. Lower effort keeps things tight.

This is why effort feels like it changes Claude’s personality. low Claude is your coworker who gives one-word Slack replies. max Claude is the one who writes a design doc for a button color change.

The four levels

low skips thinking for simple requests. Minimal tool calls. Short responses. Great for renaming a variable, fixing a typo, “what does this function return?”, generating boilerplate you’ll edit anyway. Don’t use it for anything where you’d be annoyed if it missed an edge case.

medium thinks when the problem warrants it, skips when it doesn’t. I’ll be real, I almost never explicitly set this one. It’s the no-man’s-land between “I don’t need you to think” and “please actually think.” Works fine for writing tests against a known function or adding a feature where the pattern is already established.

high is the default, and it’s the default for a reason. Almost always engages thinking. Will read related files without being asked. Gives thorough responses. This is where Claude Code lives out of the box, and you can leave it here 90% of the time. I do.

max is full send. Maximum thinking depth. Claude explores broadly, considers edge cases you didn’t mention, and sometimes finds problems you didn’t know existed. But it takes longer and uses more tokens. Save it for debugging something that’s stumped you, architecture decisions with real tradeoffs, security reviews. Anything where being wrong costs more than being slow.

How to set it in Claude Code

You’ve got three ways.

Trigger words in your prompt are the easiest. Just say “think harder” or “ultrathink” in your message. Claude Code maps these to effort levels. I covered the full list in the thinking triggers post.

The /config menu lets you set the default effort level for the session. Type /config and change it there. Useful if you’re doing a bunch of quick tasks and don’t want to say “keep it brief” every time.

Per-session flag sets it from the start:

claude --effort low

I use this when I’m doing a cleanup pass (rename things, fix formatting, quick edits) where I know I don’t need deep reasoning.

What about the API?

If you’re building something with Claude directly:

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[{"role": "user", "content": "..."}],
)

The effort parameter lives inside output_config. Combine it with thinking: {"type": "adaptive"} and Claude handles the rest. It’ll think deeply at max, skip thinking at low, and make smart calls in between.

If you’re still using manual budget_tokens, that still works but it’s deprecated as of early 2026. Switch to adaptive + effort before Anthropic pulls the plug.

The part nobody talks about: cost

Higher effort means more input tokens (longer thinking burns tokens on reasoning you never see), more output tokens, more tool calls, and more time. All of that adds up.

On a quick task, the difference between low and max can be 10x in time and tokens. I ran a batch rename across 12 files last week at max because I forgot to change it back. Took four minutes and cost me probably 50x what low would have. On the flip side, max once caught a race condition in my auth flow that I’d been debugging for an hour. Paid for itself in the first five minutes.

My rough rule: start at high (the default). Drop to low for batch grunt work. Reach for max when you’re stuck or when the stakes are high.

How effort and adaptive thinking work together

Opus 4.6 introduced adaptive thinking, which means Claude decides for itself how much to reason based on the problem. The effort parameter sets the ceiling and the bias.

At high, Claude will think on most things but might keep it brief for simple questions. At max, Claude thinks extensively on everything, even stuff that probably doesn’t need it. At low, Claude actively avoids thinking unless the problem is clearly complex.

Think of effort as telling Claude how careful to be. Adaptive thinking is Claude being smart about it within those bounds. Most of the time you don’t need to touch it. But when you’re burning tokens on simple tasks or getting shallow answers on hard problems, now you know which knob to turn.

Want more Claude Code productivity tips? I also wrote about the thinking trigger words that map to these effort levels, and my full workflow for getting the most out of Claude Code.

More in this series

New posts, shipping stories, and nerdy links straight to your inbox.

Real insights. Zero filler.


Share this post on:

Next Post
How to Be Cool: A Dad's Guide