NOTE! I came across the Original Thread by Kanika (HIGHLY recommended you follow!) (@KanikaBK): View the full thread on X
Need ideas to DeChomp your Claude Token Usage? Consider this super thread I uncovered on X.com!
Quick Summary
This guide shares 10 simple and powerful ways to save tokens when using Claude. The main idea is to treat Claude like an expensive helper. Use smart habits so you waste less context and only spend tokens on what really matters.
Core techniques include:
- Resetting your chat every 15 to 20 messages
- Asking for a plan before doing the full work
- Creating a short reusable context card about you and your project
- Turning off tools when you do not need them
- Keeping files small and asking very clear questions
- Making small targeted edits instead of rewriting everything
- Using cheaper tools for easy tasks and saving Claude for hard work
- Splitting big tasks into cheap research and expensive writing phases
- Treating your usage limit as 80 percent of what you really have
- Making Claude aware of your token budget from the very beginning
These habits help you get great results while using far fewer tokens and avoid hitting limits in the middle of your work.
1. Reset Your Chats Every 15–20 Messages
Claude re-reads the entire conversation history on every turn. Long chats become “token black holes” – making every new message more expensive and hitting limits faster.
Treat chats as task-specific rooms, not endless threads.
When a chat gets long: Ask Claude: “Summarise everything important we’ve done here into a ‘project card’ under 300–400 tokens: goal, key decisions, current status, and next three actions.”
Then start a fresh chat and paste: “New chat, same project. Here’s the context from the last thread: [paste card]. Continue from here.”
This compresses thousands of tokens of history into just a few hundred reusable ones.
2. Ask for Plans First, Then Execution
Don’t jump straight into writing full articles, strategies, or code. Start with a cheap planning pass: “Before you write anything, give me a concise numbered plan of steps to complete this task. Keep it under 150 tokens.”
Once the plan looks good, execute in small chunks: “Now implement only steps 1 and 2 in detail. Stop after that and wait for my review.”
This catches mistakes early and avoids expensive full rewrites.
3. Summarise and Compress Your Own Context
Stop pasting the same long background information into every new chat. Create a reusable “context card”: “Compress everything I’ve told you about me and this project into a reusable context card under 400 tokens. Use headings and remove fluff.”
Save the card and paste it into new chats. Update it when needed.
This keeps every new message much cheaper.
4. Turn Off Tools When You Don’t Need Them
Tools (web browsing, code execution, etc.) add heavy extra context. For pure writing, reasoning, or idea generation, explicitly disable them: “For this session, don’t use any external tools unless I explicitly say ‘use tools’.”
Turn tools on only when required, then disable them again. Use tools deliberately, not by default.
5. Keep Files Smaller and Questions More Targeted
Uploading huge documents with vague questions wastes tokens.
- Split big files into logical chunks and upload only the relevant section.
- Or run a quick “locator” pass first: “Which 2–3 sections are most relevant for [goal]?”
Then follow up using only those sections.
6. Ask for Surgical Edits
Avoid rewriting an entire long output just because one part isn’t perfect. Be precise:
- “Edit this for clarity and flow. Keep every idea and the overall structure.”
- “Edit just the section under the heading ‘[Heading]’. Don’t touch anything else.”
You only pay for the parts that actually change.
7. Push Trivial Tasks to Cheaper Tools
Treat Claude as your “expensive brain”. Offload simple grammar fixes, basic paraphrases, tiny code snippets, or canned replies to lighter/cheaper models or built-in tools.
Reserve Claude for high-value work: long documents, complex rewrites, strategy, planning, deep research, and client-facing work.
8. Split Complex Workflows into “Cheap” and “Expensive” Phases
Separate research from writing:
- Phase 1 (Research): “Use tools to gather and summarise the 10 most important points about [topic] as short RESEARCH NOTES under 500 tokens, with links.”
- Phase 2 (Writing): “Now, without using any tools, and based only on the RESEARCH NOTES above, write [output]. Don’t pull in new information.”
You pay for browsing/tools only once.
9. Treat the Limit as 80% of What You Actually Have
Never hit the hard usage cap mid-project. When you sense you’re getting close: “Assume we have 3–5 messages left. Summarise what we’ve done and list the exact next prompts I should run when my usage resets (under 250 tokens).”
Save the plan so you can resume cleanly after the limit resets.
10. Make Claude Budget-Aware from the Start
Set expectations early with a “budget contract”: “For this whole session, assume we have a limited token budget. Default to short, high-information answers unless I say ‘go deep’. Warn me before long responses or using tools. Suggest cheaper alternatives.”
Override only when truly needed: “This one matters – go deep even if it’s expensive.”
Core Principle
Be deliberate and structured. Break big tasks into smaller, cheaper steps. Control context tightly. Reset chats regularly. Use Claude only where its strengths justify the higher cost.
Again, I came across the Original Thread by Kanika (HIGHLY recommended you follow!) (@KanikaBK): View the full thread on X






