Deep Dive · XiaoHu Explains

Claude Fable 5's Official Prompting Guide: A Dozen Ready-to-Use Tweaks, From Curbing Over-Planning to Killing Fake Progress Reports

Anthropic's official documentation teaches you how to adjust system prompts and engineering scaffolding for the new model — the same methods also apply to Claude Mythos 5
60-Second Overview
  • Anthropic released an official prompt engineering guide for Claude Fable 5 and Claude Mythos 5, listing this generation's behavioral differences relative to Claude Opus 4.8, along with corresponding changes to make to prompts and engineering scaffolding
  • The guide provides a dozen-plus instruction snippets you can paste directly into a system prompt, for curbing over-planning, blocking unsolicited refactors, tightening verbose output, and drawing the line on when it should actually stop and ask the user
  • A dedicated audit-style instruction requires the model to verify claims against this session's actual tool call results before reporting progress; Anthropic states this nearly eliminated fabricated status reports on test tasks specifically designed to elicit them
  • The guide recommends adding a send_to_user tool for long-running async agents, so content that must reach the user verbatim skips summarization and is delivered directly
  • The docs warn: if a prompt asks the model to restate its internal reasoning process in reply text, it may trigger Fable 5's reasoning_extraction refusal category, causing the request to auto-downgrade to Claude Opus 4.8
Framing note: this is official Anthropic product documentation, meant to help you get the most out of their new model. Capability descriptions in the text, along with conclusions like "nearly eliminated" or "correct on the first try," come from internal vendor testing and early tester feedback, and have not been independently verified. What follows is a faithful restatement of the document's content and the instructions you can copy as-is.
1 Background · What It Is

What Problem This Document Solves

Anthropic recently released an official prompt engineering guide for Claude Fable 5 and Claude Mythos 5, summarizing this generation's behavioral shifts relative to Claude Opus 4.8, and the prompt and engineering scaffolding changes you need to make in response.

Bottom line, this is a tuning manual: the new model is more capable, but several default behaviors have changed, and copying over old prompts and old engineering frameworks unmodified will trip you up. The document pairs every change with an instruction snippet you can paste directly into a system prompt and adjust from there.

🎯 Why it's worth reading: it skips vague generalities and gives you a dozen-plus ready-to-use instruction snippets, including one audit-style instruction that, on Anthropic's internal test tasks specifically designed to induce fabricated progress, brought fake status reports down to nearly zero. You don't have to trial-and-error your way to these settings yourself.
Effort dial: low · medium · high · xhigh
First, pick the right effort level. Then set it up for long runs, which can last minutes to hours, and have it verify evidence before every progress report. These two things are the throughline of the whole guide.

First, a quick pass on what it does better than the last generation

The document lists seven improvements relative to Claude Opus 4.8. Here's each in one line, without elaboration:

  • Long-horizon autonomy: can keep working unsupervised for hours to days on long complex tasks, without dropping the thread or forgetting the original goal
  • Right the first time: early testers report that some systems that used to take days of iteration to get working were implemented correctly in a single pass (testers' own account)
  • Visual understanding: reads dense technical diagrams, web apps, and detailed screenshots more accurately, at lower token cost, and can use bash and cropping tools to handle rotated, blurry, or noisy images
  • Enterprise workflows: sticks to instructions better, stays on-task, and produces more professional output on financial analysis, spreadsheets, slides, and documents
  • Code review and debugging: markedly higher bug-finding recall than Opus 4.8 (except in cybersecurity), and can search across codebases and history
  • Handling ambiguity: can take a pile of complex, multi-threaded requests and decide the next step on its own
  • Subagent collaboration: more willing to dispatch subagents in parallel, and more stable at sustaining async communication with long-running subagents and peer agents

It also runs a safety classifier that blocks three categories of requests: offensive cybersecurity (building exploits, malware, attack tools), biology and life sciences (experimental methods, molecular mechanisms), and extracting the model's internal summarized reasoning. Benign related work may also get caught by mistake. You can configure a server-side or client-side fallback so refused requests automatically fall back to Claude Opus 4.8.
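A client-side fallback can be sketched as a thin wrapper around whatever function sends your requests. This is a minimal sketch under stated assumptions: the model identifiers are placeholders, the `Refused` exception is a hypothetical convention for how your own client surfaces a classifier refusal, and `send` is an injected callable rather than a real SDK function.

```python
from typing import Callable, Tuple

# Placeholder model identifiers (assumptions); use the IDs your provider documents.
PRIMARY_MODEL = "claude-fable-5"
FALLBACK_MODEL = "claude-opus-4-8"


class Refused(Exception):
    """Hypothetical signal: raised by `send` when the safety classifier refuses."""

    def __init__(self, category: str):
        super().__init__(category)
        self.category = category


def send_with_fallback(prompt: str, send: Callable[[str, str], str]) -> Tuple[str, str]:
    """Try the primary model; on a refusal, retry once on the fallback model.

    `send(model, prompt)` is supplied by the caller. Returns (model_used, reply).
    """
    try:
        return PRIMARY_MODEL, send(PRIMARY_MODEL, prompt)
    except Refused:
        # Only refusals trigger the fallback; other errors propagate unchanged.
        return FALLBACK_MODEL, send(FALLBACK_MODEL, prompt)
```

Keeping the fallback limited to refusals means ordinary failures such as network errors still surface to the caller instead of silently switching models.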

2 Curbing Over-Planning

Curbing Over-Planning: One Line to Make It Act Once It Has Enough

Whenever a task is vague or effort is turned up, Fable 5 tends to overthink: re-deriving facts already settled in the conversation, listing out options it wouldn't even end up choosing, and dragging out the preamble. The instruction below gets it to act as soon as it has enough information.

Without the instruction
  • Repeatedly restates known facts, re-litigates decisions already discussed
  • Lists a pile of options it will never actually pursue
  • Long-winded root-cause explanations, far more preamble than conclusion
With "act once you have enough"
  • Acts directly once it has sufficient information
  • When weighing tradeoffs, gives one recommendation instead of an exhaustive list
  • Leads with the conclusion, keeps the reasoning process in the thinking block
Stops it from over-planning, re-stating known facts, or rambling about options it won't pursue
When you have enough information to act, act. Do not re-derive facts already established in the conversation, re-litigate a decision the user has already made, or narrate options you will not pursue in user-facing messages. If you are weighing a choice, give a recommendation, not an exhaustive survey. This does not apply to thinking blocks.

Alongside this, an engineering-side reminder: long-horizon autonomy refers to the model's ability to keep working unsupervised for a long time — hours or even days without dropping the thread. The tradeoff is that a single request runs longer: at higher effort, for tasks that require gathering context and self-verification, a single request may take tens of minutes, and autonomous runs may extend for hours. Before migrating, make sure your client-side timeout settings, streaming display, and user-facing progress indicators are all properly tuned — ideally switch your engineering framework to periodic async polling of task status, rather than blocking and waiting for it to return.
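The polling approach described above might look like the following. It is only a sketch: `get_status` is a hypothetical hook into your own job store, assumed to return a dict whose "state" key is "running", "done", or "failed", and the interval and timeout defaults are illustrative rather than recommended values.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    get_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_wait_s: float = 4 * 60 * 60,
    on_progress: Optional[Callable[[dict], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll a task-status callable until it reports a terminal state.

    `get_status` is an assumed hook, not a real SDK call. `sleep` and `clock`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + max_wait_s
    while True:
        status = get_status()
        if on_progress is not None:
            on_progress(status)  # drive the user-facing progress indicator
        if status["state"] in ("done", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("task still running after max_wait_s")
        sleep(interval_s)
```

The point of the design is that no single HTTP request has to stay open for the whole run; the client checks in periodically and keeps its progress display current.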

3 Effort Dial

How to Set the Effort Dial, and How to Stop It From Refactoring Code on Its Own Initiative

Effort is the master switch for balancing "smart, slow, expensive" on Fable 5, with four levels: low, medium, high, xhigh. Default to high; use xhigh for the most capability-demanding tasks; dial down to medium or low for routine work. The key point: this generation's low tier often already exceeds the previous generation's xhigh. If a task gets done but slower than it needs to be, dial it down. The four levels are summarized below.

  • low: Fastest and cheapest, for routine light work. Dial down here when you want faster, more conversational interaction. Performance still holds up well and often exceeds the previous generation's xhigh.
  • medium: Another option for routine work. When a task completes but more slowly than it needs to, drop from high to here to trade for speed.
  • high (default): The default tier for most tasks. It already delivers strong verification behavior, complex reasoning, and the most rigorous output. The tradeoff: on routine work it gathers context and thinks more than the task actually needs.
  • xhigh: Reserved for the most capability-demanding tasks. Smartest but also slowest and most expensive, and most prone to overthinking routine work.
Each tier on the original page carries three bars (intelligence, latency, cost). They show relative position only, not official figures, and are meant only to illustrate the monotonic relationship "higher tier = smarter, but slower and pricier."
An analogy

Effort is like a camera's focus mode: auto is fine for everyday shots, and you switch to fine manual focus only for the hardest shot. Higher isn't always better — you push it up only for the harder work.

Turning up the dial has a side effect: at higher effort, Fable 5 tends to do extra work on its own — adding features, refactoring, adding unnecessary validation. To curb this "unsolicited tidying up," add the following:

Stops it from adding features, refactoring, or adding unnecessary validation on its own at higher effort levels
Don't add features, refactor, or introduce abstractions beyond what the task requires. A bug fix doesn't need surrounding cleanup and a one-shot operation usually doesn't need a helper. Don't design for hypothetical future requirements: do the simplest thing that works well. Avoid premature abstraction and half-finished implementations. Don't add error handling, fallbacks, or validation for scenarios that cannot happen. Trust internal code and framework guarantees. Only validate at system boundaries (user input, external APIs). Don't use feature flags or backwards-compatibility shims when you can just change the code.
4 tiers
Effort splits into low / medium / high / xhigh; official guidance is to default to high for most tasks, use xhigh for the most capability-demanding work, and dial down for routine tasks
Low tier > previous gen's xhigh tier
Fable 5's low-effort setting still performs well, often exceeding the previous generation's xhigh performance
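The tier guidance in this section can be captured in a small helper. This is an illustrative sketch only: the task labels ("hardest", "routine") are made-up categories, and how the chosen tier is passed to the API is left out because the article does not show the parameter.

```python
EFFORT_LEVELS = ["low", "medium", "high", "xhigh"]  # the four tiers named in the guide


def pick_effort(task_kind: str) -> str:
    """Map a coarse, hypothetical task label to a starting effort tier."""
    if task_kind == "hardest":
        return "xhigh"  # reserved for the most capability-demanding work
    if task_kind == "routine":
        return "medium"  # routine work can run at medium or low
    return "high"  # the guide's default for most tasks


def step_down(effort: str) -> str:
    """Return the next lower tier, for tasks that succeed but run slower than needed."""
    index = EFFORT_LEVELS.index(effort)
    return EFFORT_LEVELS[max(index - 1, 0)]
```

For example, a task that completes correctly at `pick_effort("feature work")` but feels slow could be rerun at `step_down("high")`, which gives "medium".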
4 Output & Pausing

Making Summaries Sound Human, and When to Actually Stop and Ask

This generation's instruction-following is strong enough that one sentence can steer an entire class of behavior — you don't have to spell out every single case. This section gives two: one to curb verbosity, one to draw the line on when it should genuinely stop and ask the user.

Without constraints, Fable 5 tends to sprawl: walking through options it won't even pursue, long explanations of root cause, heavily structured PR descriptions, and a comment on every single line explaining what the next line does. A short refinement instruction works just as well as listing out every individual issue:

Have replies lead with the conclusion, without compressing into arrow-chain jargon
Lead with the outcome. Your first sentence after finishing should answer "what happened" or "what did you find": the thing the user would ask for if they said "just give me the TLDR." Supporting detail and reasoning come after. Being readable and being concise are different things, and readability matters more.

The way to keep output short is to be selective about what you include (drop details that don't change what the reader would do next), not to compress the writing into fragments, abbreviations, arrow chains like A → B → fails, or jargon.

"Pausing to ask the user" in long workflows works the same way — no need to enumerate every scenario. Only three situations genuinely warrant stopping: an irreversible or destructive action, a real change in scope, or information only the user can provide. When you hit one of these, ask a question and end the turn — don't leave off with a promise instead.

Draws the line on when it should genuinely stop and ask the user
Pause for the user only when the work genuinely requires them: a destructive or irreversible action, a real scope change, or input that only they can provide. If you hit one of these, ask and end the turn, rather than ending on a promise.
5 Curbing Fake Progress · Core

Curbing Fake Progress Reports: Make It Back Up Claims With Evidence

During long autonomous runs, the model can fabricate progress reports — claiming a step is done when it isn't. Anthropic's fix is to have it self-check before reporting: verify every claim against actual tool call results from this session, and explicitly flag anything unverified.

A rare paragraph with measured results

This is one of the few places in the whole document that gives effect data. Anthropic states that on test tasks specifically designed to induce the model into fabricating progress, adding this audit-style instruction nearly eliminated fabricated status reports.

① Claim: a certain step is complete
② Go back through this session's tool call results and check item by item
  • Evidence found: write it into the report and state the conclusion plainly, with no hedging
  • No evidence: explicitly flag it in the report as not yet verified
↺ Move to the next claim and repeat the check
Audit-style instruction that curbs fake progress reports
Before reporting progress, audit each claim against a tool result from this session. Only report work you can point to evidence for; if something is not yet verified, say so explicitly. Report outcomes faithfully: if tests fail, say so with the output; if a step was skipped, say that; when something is done and verified, state it plainly without hedging.
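The guide's fix is the prompt instruction itself. If you also want a harness-side cross-check in the same spirit, one possible shape is sketched below. Everything here is an assumption for illustration: the `Claim` structure, the idea that each claim names the ID of the tool call backing it, and the report wording.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Claim:
    text: str
    evidence_id: Optional[str] = None  # ID of the tool call said to support the claim


def audit_claims(claims: List[Claim], tool_results: Dict[str, str]) -> List[str]:
    """Mark a claim verified only if its evidence ID exists in this session's tool results.

    Note the limit of this check: it confirms the cited tool call exists,
    not that its output actually supports the claim.
    """
    report = []
    for claim in claims:
        if claim.evidence_id is not None and claim.evidence_id in tool_results:
            report.append(f"VERIFIED: {claim.text} (evidence: {claim.evidence_id})")
        else:
            report.append(f"NOT YET VERIFIED: {claim.text}")
    return report
```

A claim with no evidence ID, or one pointing at a tool call that never ran in this session, is reported as not yet verified rather than dropped, which mirrors the instruction's "say so explicitly."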
Nearly eliminated
After adding the audit-style verification instruction, fabricated status reports were nearly eliminated on test tasks specifically designed to induce fabricated progress (Anthropic internal test result)
Days → correct in one pass
Early testers report that some systems that previously took days of iteration to get working were implemented correctly by Fable 5 in a single pass (tester report, not an official benchmark)
6 Action Boundaries

Drawing the Line: Don't Take Action While Just Analyzing

Fable 5 will occasionally act on its own initiative — drafting an email nobody asked for, or creating a git branch as a backup "just to be safe." The instruction below curbs this, separating "should report" from "should act."

There are two core boundaries. First: when the user is just describing a problem, asking a question, or thinking out loud rather than requesting a change, the deliverable is your assessment report — stop once you've delivered it, don't go fix things before anyone has actually asked. Second: before running a command that changes system state (restart, delete, config edit), confirm the evidence actually supports that specific action — a signal that pattern-matches a known failure may have a different underlying cause.

Draws the action boundary, so it doesn't take action on its own while merely analyzing a problem
When the user is describing a problem, asking a question, or thinking out loud rather than requesting a change, the deliverable is your assessment. Report your findings and stop. Don't apply a fix until they ask for one. Before running a command that changes system state (restarts, deletes, config edits), check that the evidence actually supports that specific action. A signal that pattern-matches to a known failure may have a different cause.
7 Subagents · Memory

Making the Most of Parallel Subagents, and Building a Memory System That Accumulates Its Own Experience

Fable 5 is more willing than the previous generation to dispatch subagents in parallel, and more stable at sustaining async communication with long-running subagents. The usage boils down to three things: dispatch more, make clear when it should delegate, and receive messages asynchronously instead of blocking and waiting for each subagent to return one by one. The old and new orchestration styles compare as follows.

Serial orchestration (old style)
  • Flow: the orchestrator runs Subagent A and waits for it to return, then runs Subagent B and waits again, then runs Subagent C
  • Every step is blocked on the previous subagent's return; total time is the sum of every subagent's runtime, and the slowest one drags down everything
Parallel orchestration (new style)
  • Flow: the orchestrator dispatches Subagents A, B, and C at once, keeps working, and receives results asynchronously
  • All independent subtasks are dispatched together, the orchestrator doesn't block and keeps moving forward, and whoever finishes reports back asynchronously
  • Long-lived subagents retain context across subtasks, saving time and cost via cache reads, and nothing gets stuck waiting on the slowest one
  • Step in only if a subagent goes off track or is missing context
Encourages it to use parallel subagents more, without blocking to wait
Delegate independent subtasks to subagents and keep working while they run. Intervene if a subagent goes off track or is missing relevant context.
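On the harness side, the difference between serial and parallel dispatch can be illustrated with Python's `asyncio`. This is a toy sketch: `run_subagent` is a stand-in that only sleeps, where a real harness would call a model or a worker process.

```python
import asyncio
from typing import Dict, List, Tuple


async def run_subagent(name: str, seconds: float) -> Tuple[str, str]:
    """Stand-in for a real subagent call; it only sleeps, then returns a result."""
    await asyncio.sleep(seconds)
    return name, f"{name} finished"


async def orchestrate(jobs: Dict[str, float]) -> List[str]:
    """Dispatch all independent subtasks at once and handle results as they arrive."""
    tasks = [asyncio.create_task(run_subagent(name, secs)) for name, secs in jobs.items()]
    arrival_order = []
    for finished in asyncio.as_completed(tasks):
        name, _result = await finished
        arrival_order.append(name)  # the orchestrator could do other work between arrivals
    return arrival_order


if __name__ == "__main__":
    # Total wall time is roughly the slowest job, not the sum of all three.
    print(asyncio.run(orchestrate({"A": 0.2, "B": 0.01, "C": 0.1})))
```

Because results are consumed in completion order, a fast subagent's output is available immediately instead of queuing behind a slower one that happened to be dispatched first.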

Also Give It a Memory System That Accumulates Experience

Fable 5 performs especially well when it can "record past lessons and refer back to them later." Give it somewhere to write notes; plain Markdown files are enough. The writing rules are as follows:

Writing rules for building a memory system
Store one lesson per file with a one-line summary at the top. Record corrections and confirmed approaches alike, including why they mattered. Don't save what the repo or chat history already records; update an existing note rather than creating a duplicate; delete notes that turn out to be wrong.
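In practice the model writes these notes itself with its file tools, but the storage layout the rules imply is easy to picture. The sketch below is one possible layout, not a prescribed one: the helper names, the slug-based file naming, and the "Why it mattered" line are all assumptions.

```python
import re
from pathlib import Path
from typing import List


def slugify(title: str) -> str:
    """Turn a lesson title into a stable file name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def save_lesson(memory_dir: Path, title: str, summary: str, why: str) -> Path:
    """Write one lesson per file with the one-line summary first.

    Saving under the same title overwrites the existing note, so an update
    never creates a duplicate.
    """
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{slugify(title)}.md"
    path.write_text(f"{summary}\n\nWhy it mattered: {why}\n", encoding="utf-8")
    return path


def delete_lesson(memory_dir: Path, title: str) -> None:
    """Remove a note that turned out to be wrong."""
    (memory_dir / f"{slugify(title)}.md").unlink(missing_ok=True)


def list_summaries(memory_dir: Path) -> List[str]:
    """Return the one-line summary of every stored lesson, sorted by file name."""
    return [p.read_text(encoding="utf-8").splitlines()[0] for p in sorted(memory_dir.glob("*.md"))]
```

Putting the summary on the first line means a later session can skim every note cheaply with something like `list_summaries` and open only the files that look relevant.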

If you want to cold-start this memory from existing historical sessions, have it review past sessions, use subagents to distill themes and lessons, and store them in a designated file:

Cold-starting the memory system from historical sessions
Reflect on the previous sessions we've had together. Use subagents to identify core themes and lessons, and store them in [X]. Make sure you know to reference [X] for future use.
8 Preventing Dropoff

Don't Let It Drop Off Mid-Task: No Stalling for Permission During Autonomous Runs

There are two minor quirks that only occasionally surface in very long sessions, and the document gives fixes for both.

The first: late in a session, it may end its turn on a sentence like "I'll go run X now" without actually issuing the corresponding tool call, or it may stop and ask "would you like me to…" even when it already has enough information. A simple "continue" or "go ahead and do it end to end" gets it moving again. Add the following system reminder to autonomous pipelines to prevent this at the root:

System reminder for autonomous pipelines that prevents it from stalling for permission mid-task
You are operating autonomously. The user is not watching in real time and cannot answer questions mid-task, so asking "Want me to…?" or "Shall I…?" will block the work. For reversible actions that follow from the original request, proceed without asking. Offering follow-ups after the task is done is fine; asking permission after already discussing with the user before doing the work is not. Before ending your turn, check your last paragraph. If it is a plan, an analysis, a question, a list of next steps, or a promise about work you have not done ("I'll…", "let me know when…"), do that work now with tool calls. End your turn only when the task is complete or you are blocked on input only the user can provide.

The second: in very long sessions, it may occasionally suggest starting a new session on its own, propose a summary handoff, or cut down the scope of work by itself. This is mostly triggered by engineering frameworks that display a "remaining token countdown" to the model. Try not to expose the exact context-remaining number to it; if the framework has to show it, add a line of reassurance:

Reassurance to prevent it from stopping proactively just because it sees a context countdown
You have ample context remaining. Do not stop, summarize, or suggest a new session on account of context limits. Continue the work.
9 Dedicated Channel & Rollout

Explain the "Why," Add a Dedicated Channel That Must Reach the User, and a Final Rollout Checklist

This last section covers two more prompt improvements, a dedicated channel for content that must reach the user verbatim, and a rollout checklist.

Include the "Why" in Requests

Fable 5 performs better when it understands the intent behind a request: with that context, it can tie the task to relevant information instead of guessing at intent on its own. This is especially true for long-running agents pulling together multiple workflows — spell out "why you're asking":

Template for adding the "why" to a request
I'm working on [the larger task] for [who it's for]. They need [what the output enables]. With that in mind: [request].

Sound Human When Wrapping Up a Long Session

In agent conversations with heavy context and lots of tool calls, Fable 5 tends to produce hard-to-read text: dense arrow-chain jargon, implementation details piled deep, references to internal reasoning the user never saw, over-technical phrasing. Add a communication-style addendum so it switches back to plain human language in its final summary, instead of carrying over the working-mode shorthand:

Makes it sound human when wrapping up a long session, instead of carrying over working-mode jargon
Terse shorthand is fine between tool calls (that's you thinking out loud, and brevity there is good). Your final summary is different: it's for a reader who didn't see any of that.

If you've been working for a while without the user watching (overnight, across many tool calls, since they last spoke), your final message is their first look at any of it. Write it as a re-grounding, not a continuation of your working thread: the outcome first, then the one or two things you need from them, each explained as if new. The vocabulary you built up while working is yours, not theirs; leave it behind unless you re-introduce it.

When you write the summary at the end, drop the working shorthand. Write complete sentences. Spell out terms. Don't use arrow chains, hyphen-stacked compounds, or labels you made up earlier. When you mention files, commits, flags, or other identifiers, give each one its own plain-language clause. Open with the outcome: one sentence on what happened or what you found. Then the supporting detail. If you have to choose between short and clear, choose clear.

Add a Dedicated Channel That Must Reach the User: the send-to-user Tool

When running long-running async agents, give it a way to push content the user must see verbatim, even before the turn has ended. This could be a deliverable (a generated code snippet, a drafted message), a progress update with concrete numbers, or a direct answer to a question the user asked mid-task. The tool's input is exactly the message to display — once the model calls it, you render the input directly to the interface, and the tool result just needs to send back a simple confirmation. The key point: the tool input is never summarized — the content arrives verbatim.

An analogy

The send-to-user tool is like passing a note straight to your boss during a meeting, instead of waiting until the meeting ends to relay what the note said. The content arrives intact and on the spot, with no compression along the way.

Regular text reply
  • Written within the turn: content is mixed into the model's narrative text
  • End of turn: may get summarized and compressed by the system
  • Result: what the user sees is a summary
send_to_user tool
  • As tool input: content is packaged into the tool call
  • Delivered mid-task: tool input isn't summarized, and the turn doesn't end
  • Result: the user sees the original text, word for word
JSON definition for the send_to_user tool
{
  "name": "send_to_user",
  "description": "Display a message directly to the user. Use this for progress updates, partial results, or content the user must see exactly as written before the task finishes.",
  "input_schema": {
    "type": "object",
    "properties": {
      "message": {
        "type": "string",
        "description": "The content to display to the user."
      }
    },
    "required": ["message"]
  }
}

Defining the tool alone isn't enough: without an instruction in the system prompt telling it to call the tool, Fable 5 rarely invokes it on its own. Pair it with a guiding line, and restrict it to user-facing content only — don't let narrative or internal reasoning leak into this channel either:

Instruction paired with the send_to_user tool
Between tool calls, when you have content the user must read verbatim (a partial deliverable, a direct answer to their question), call the send_to_user tool with that content. Use send_to_user only for user-facing content, not for narration or reasoning.
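On the client side, handling this tool comes down to rendering the input untouched and returning a short acknowledgment. A minimal sketch follows, with assumptions called out: the shape of `tool_call` and of the returned dict are placeholders for whatever your harness uses, and `render` is an injected callable standing in for your UI.

```python
from typing import Callable, Dict


def handle_tool_call(tool_call: Dict, render: Callable[[str], None]) -> Dict:
    """Route a send_to_user call: show the input verbatim, then acknowledge.

    `tool_call` is assumed to look like {"id": ..., "name": ..., "input": {...}};
    the returned dict shape is likewise an assumption about your harness.
    """
    if tool_call["name"] != "send_to_user":
        raise ValueError(f"unhandled tool: {tool_call['name']}")
    message = tool_call["input"]["message"]
    render(message)  # no summarizing, trimming, or reformatting
    # The tool result only needs to confirm delivery so the model can continue.
    return {"tool_call_id": tool_call["id"], "content": "Message delivered."}
```

The handler deliberately does nothing clever with `message`: the whole value of the channel is that what the model wrote is what the user reads.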

Rollout Checklist

The document closes with a recommended set of scaffolding changes — just follow along:

  1. Set a higher difficulty ceiling for tasks. Pick a task harder than what you'd have assigned to the previous-generation model, and let Fable 5 scope it, ask clarifying questions, and execute on its own. Testing it only on easy tasks will make you underestimate its actual ceiling.
  2. Explicitly schedule self-checks into long runs. Independent, fresh-context verification subagents tend to be more reliable than self-criticism. Add the following to long-running tasks:
    Instruction template for periodic self-checks during long-running tasks
    Establish a method for checking your own work at an interval of [X] as you build. Run this every [X interval], verifying your work with subagents against the specification.
  3. Re-audit old skills and prompts. Skills written for the previous generation tend to be too over-specified for Fable 5, and can actually drag down output quality. A lot of those old rule-by-rule writing styles can just be deleted outright — see what the default behavior looks like first. Fable 5 can also update a skill on the fly as it works.
  4. Don't have it restate its internal reasoning. Repeating internal reasoning verbatim into reply text is a hard line for this generation of models. ⚠ Telling the model to echo, transcribe, or narrate its internal reasoning as reply text — whether in a prompt, a skill, or a harness instruction — may trigger Fable 5's reasoning_extraction refusal category, causing the request to auto-downgrade back to Claude Opus 4.8. When migrating, audit old skills and system prompts for instructions like "reflect on this" or "show your thinking" and remove them. For reasoning visibility, switch to reading the structured thinking blocks from adaptive thinking, and use the send-to-user tool to surface progress during long runs.
  5. Install the send-to-user tool. Give long-running async agents a client-side tool that delivers messages to the user verbatim without ending the turn.
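For checklist item 4, a quick scan of existing prompts and skills can flag lines worth reviewing by hand. The sketch below is a rough aid, not a detector of what the classifier actually refuses: the pattern list combines the two phrases quoted above with a few assumed variants, and matches are only candidates for manual review.

```python
import re
from typing import Dict, List, Tuple

# The two phrases quoted in the checklist plus assumed variants; extend for your own prompts.
RISKY_PATTERNS = [
    r"reflect on this",
    r"show your thinking",
    r"(echo|transcribe|narrate) your (internal )?reasoning",
]


def find_reasoning_echo_instructions(prompts: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (prompt_name, line_number, line_text) for each line matching a risky pattern."""
    combined = re.compile("|".join(RISKY_PATTERNS), re.IGNORECASE)
    hits = []
    for name, text in prompts.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if combined.search(line):
                hits.append((name, number, line.strip()))
    return hits
```

Feeding it a dict of prompt names to prompt text returns the exact lines to look at, which keeps the migration audit from turning into a full reread of every skill.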
"In Anthropic's testing, this nearly eliminated fabricated status reports even on tasks designed to elicit them." (Prompting Claude Fable 5, Anthropic official documentation)
Source: Anthropic's official documentation "Prompting Claude Fable 5" (platform.claude.com). This article is a Chinese-language visual explainer of that vendor's official documentation; the capability claims, internal test results, and early tester feedback in the text are all stated in Anthropic's own terms and have not been independently verified by a third party. All English instructions inside the code blocks are verbatim from the original document, unchanged, and ready to reuse as-is. The same methods apply equally to Claude Mythos 5.