BAMPT

The Token-Smart Guide

Spend less of your plan. Get sharper answers.

How to get exactly what you need out of AI without burning through your plan. Plain language. No jargon. For business owners and everyday users alike.

A field guide from BAMPT | This Week in AI | Build AI for your business, including your own.

Start Here

The whole guide in one idea.

Every AI tool charges you in tokens. A token is just a chunk of a word. Roughly speaking, every word you type in and every word it sends back nudges a meter. You usually do not see the meter on a monthly plan, but it is running. Two things quietly drain it faster than people expect: using a more powerful model than the task needs, and dragging a long conversation behind you. Fix those two habits and you get most of the savings with none of the fuss.

Bigger is not better. It is just bigger.

The flagship models cost more per word, and the heaviest draw down your plan the fastest. Most everyday work does not need them. Reaching for the most powerful model by default is the single most common way people overspend.

Long chats get expensive and sloppy.

Every message in a long thread carries the entire conversation with it. That costs more each turn, and the model also gets less sharp as the pile grows. A fresh start is often the smarter move.
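
For the curious, here is a rough sketch of why long threads add up so quickly. It assumes every message and every reply is the same size, and it ignores any trimming or caching a tool may do behind the scenes, so treat the numbers as an illustration and not as a bill. The function name and the sizes are made up for the example.

```python
def thread_input_tokens(turns: int, tokens_per_message: int) -> int:
    """Total tokens sent to the model over one chat, where each new
    message re-sends every earlier message and reply.

    Assumption: every message and every reply is the same size.
    """
    total = 0
    history = 0  # tokens already sitting in the conversation
    for _ in range(turns):
        history += tokens_per_message  # your new message joins the thread
        total += history               # the whole thread goes in as input
        history += tokens_per_message  # the reply joins the thread too
    return total


# One 20-turn chat versus the same 20 turns split across four fresh chats.
one_long_chat = thread_input_tokens(turns=20, tokens_per_message=200)
four_fresh_chats = 4 * thread_input_tokens(turns=5, tokens_per_message=200)
print(one_long_chat, four_fresh_chats)
```

Under these assumptions the single long chat sends four times as many tokens as the four short ones, even though you typed exactly the same amount.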

Reaching for the most powerful model by default is like booking a moving truck to carry a single grocery bag.

Match the tool to the job, not the job to the shiniest new tool. That one habit does most of the work.

Three words, translated

Token: A small piece of a word. AI tools count tokens going in and coming out, and that count is what you are spending.
Context window: How much the model can hold in its head at once. A long chat fills it up, which costs more and muddies the answers.
Model tier: The fast-and-cheap, the everyday, and the heavy-reasoning versions of a tool. Same family, different jobs.
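
If you want a feel for the numbers, here is a minimal sketch in Python. It leans on a common rule of thumb of about four characters of English text per token. Real tools use their own tokenizers, and other languages behave differently, so this is a ballpark only. The name estimate_tokens is ours, for illustration.

```python
import math

# Assumption: roughly 4 characters of English text per token.
# Real tokenizers vary by tool and by language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Summarize this email in three bullet points."
print(estimate_tokens(prompt))
```

Run it on a prompt and on a pasted document and the gap between the two makes the point on its own.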

The Core Move

Match the model to the task.

Claude, ChatGPT, and Gemini all stack their models the same way: a fast lightweight one, a balanced everyday one, and stronger reasoning models above them. The names below are Claude's, but the logic carries across every major tool. Start in the middle, drop down when the task is simple, and climb only as high as the work demands.

FAST & LIGHT

Claude Haiku

ChatGPT mini · Gemini Flash

Quick questions, basic summaries, simple rewrites, fixing tone or grammar, short translations, first-draft brainstorms. The most efficient on your plan, and plenty for anything obvious.

EVERYDAY

Claude Sonnet

ChatGPT standard · Gemini Pro

Your default for almost everything. Writing and editing, research, analysis, building documents and spreadsheets, working through a problem, most coding. Most tasks never outgrow it. When unsure, start here.

ADVANCED REASONING

Claude Opus

ChatGPT & Gemini reasoning tiers

Step up here only when the everyday model struggles: dense analysis, harder strategy, tricky problems where the stakes are higher. Available on the paid plans. Think "Sonnet was not quite enough," not "always use the strongest."

A quick test before you climb a tier. Run the task on the everyday model first. If the answer is good enough, you are done. Only step up when you can point to something specific it got wrong. Upgrade on evidence, not on instinct.
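
If you script your AI work, the same test can be sketched in a few lines of Python. Everything here is a placeholder: the tier names, the ask function that would call your tool, and the find_problem check that names what went wrong. It is one way to write "upgrade on evidence," not a feature of any particular product.

```python
from typing import Callable, Optional

# Hypothetical tier names, cheapest first. Swap in whatever your tool offers.
TIERS = ["everyday", "advanced"]


def answer_with_escalation(
    task: str,
    ask: Callable[[str, str], str],
    find_problem: Callable[[str], Optional[str]],
) -> tuple[str, str]:
    """Try the everyday tier first and climb only on a specific problem.

    ask(tier, task) returns an answer. find_problem(answer) returns a short
    description of what is wrong, or None when the answer is good enough.
    Both are stand-ins you would supply yourself.
    """
    answer = ""
    for tier in TIERS:
        answer = ask(tier, task)
        if find_problem(answer) is None:
            return tier, answer
    # Nothing passed the check, so hand back the strongest attempt.
    return TIERS[-1], answer


# Stand-in functions so the sketch runs without any AI tool attached.
def fake_ask(tier: str, task: str) -> str:
    return "draft" if tier == "everyday" else "thorough analysis"


def too_thin(answer: str) -> Optional[str]:
    return "answer is too thin" if len(answer) < 10 else None


print(answer_with_escalation("Review this contract clause", fake_ask, too_thin))
```

The important part is find_problem: the step up happens only when it can name something specific, which is the evidence the test asks for.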

Everyday Habits

Eight habits that save tokens automatically.

None of these require technical skill. They are just better defaults. Adopt three and you will feel the difference in how far your plan stretches.

Default to the everyday model.

Make the balanced tier your starting point and only reach higher when a task earns it. This one habit usually saves more than any other on the list.

Start a fresh chat for a new topic.

A long thread re-sends its whole history with every message. When you switch subjects, open a clean conversation instead of piling on.

Say what you want up front.

Vague prompts cause back-and-forth, and every round trip costs tokens. State the goal, the format, and the length in your first message.

Ask for the length you need.

"Give me three bullet points" beats a five-paragraph essay you have to trim. You pay for every word it generates, used or not.

Do not paste more than the question needs.

Dropping a forty-page document in to ask one small question makes you carry that whole document every turn. Paste the relevant page, not the binder.

Reuse prompts that work.

Keep a simple notes file of prompts that got great results. Reusing a proven prompt beats re-explaining yourself from scratch each time.

Switch off heavy modes for simple asks.

Deep research and extended thinking modes are wonderful for hard problems and wasteful for easy ones. Turn them on deliberately, not by default.

Fix mistakes by editing, not stacking.

When the model misreads your first message, edit that message and send it again instead of piling corrections on top. A clean redo is shorter and sharper than a long thread of patches.

Level Up

Smarter moves for heavier users.

If AI is part of how your business runs, these go a step further. The first one matters most, because it is the habit almost nobody builds until their answers start getting worse.

Build a "tell me when you are getting full" habit.

As a chat grows, the model holds more and thinks less clearly. Ask it directly: "If this conversation gets long enough that you might lose track, tell me and summarize where we are." It will often flag the moment to reset before quality quietly drops. It will not catch every slip, so watch for drifting or repeated answers yourself too.

Use the summarize-and-restart pattern.

When a long session starts drifting or repeating, ask for a tight summary of what you have decided so far. Paste that into a fresh chat. You keep the thread and drop the dead weight.

Let projects and saved instructions carry the context.

Most tools let you store standing instructions or a project space so you are not re-pasting the same background every time. Set it once, stop paying to repeat yourself.

Batch related questions into one prompt.

Ten little follow-ups each re-send the whole conversation. One well-structured message asking for all ten things is dramatically cheaper and usually clearer.

If you build automations, route by task.

For anyone working with the API or no-code tools: send the easy steps like sorting and tagging to a cheap model, and reserve the strong model for the one hard step. Keep the same fixed set of instructions at the start of every call, too. Many providers offer prompt caching, which bills that repeated part at a reduced rate and can cut costs sharply.
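
A minimal sketch of routing by task, in Python. The tier labels and step names are hypothetical, chosen to mirror the tiers described earlier in this guide. In a real automation you would replace them with the model names your provider uses.

```python
# Hypothetical tier labels and step names, for illustration only.
CHEAP = "fast-and-light"
EVERYDAY = "everyday"
STRONG = "advanced-reasoning"

ROUTES = {
    "sort": CHEAP,             # easy, repetitive steps go to the cheap tier
    "tag": CHEAP,
    "draft_proposal": STRONG,  # the one hard step gets the strong tier
}


def pick_model(step: str) -> str:
    """Return the tier for a step. Unknown steps start in the middle."""
    return ROUTES.get(step, EVERYDAY)


workflow = ["sort", "tag", "summarize_call", "draft_proposal"]
plan = [(step, pick_model(step)) for step in workflow]
for step, tier in plan:
    print(f"{step}: {tier}")
```

Keeping the routes in one small table means you can move a step up or down a tier in one place once you see how it performs.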

Know what is better left analog.

Not everything should be automated. Some judgment calls, sensitive client conversations, and quick gut-check tasks cost more time to delegate than to just do. Spending zero tokens is always the cheapest option.

Set a default response style.

Most tools let you tell them once to keep answers short unless you ask for more. Set it and you trim length on every reply without repeating yourself.

Before You Hit Send

The 60-second token audit.

Run this on your next AI session.

Five questions. If you answer yes to any, you have an easy save sitting right there.

Am I using the flagship model for something a lighter one could do? Drop down a tier.

Is this chat really long? Summarize what matters and start fresh.

Did I paste a huge document to ask one small question? Trim it to what counts.

Did I skip telling it the format and length I want? Say both up front and save yourself a round trip.

Is a heavy mode running (deep research, extended thinking) that this task does not need? Switch it off.

The People Behind It

Who made this.

This guide comes from BAMPT, where we build AI automation systems for service businesses, including our own. We are practitioners first. Everything here is what we do, not theory.

It is written by Chantal Emmanuel, co-founder of BAMPT and CTO of LimeLoop, who breaks down what matters in AI each week for business owners who are curious but tired of the hype. No breathless launches. Just the practical read on what it means for your business.

Want the weekly version? Find This Week in AI wherever you already follow along, and the long-form breakdowns on the BAMPT blog.

Want a second set of eyes on your setup?

The guide helps you spend less. A short conversation helps you find where AI is quietly costing you more than money: hours, rework, or a process held together by you remembering to run it. Bring how your team uses AI today and we will point you to the highest-leverage place to tighten it up. No pitch, just a clear next step.

Grab 20 minutes