Stop Burning Tokens: How to Avoid Feeding LLMs Broken Code (CyberDudeBivash Guide)
Executive summary
If you feed buggy code to an LLM you’ll get back buggy suggestions — and you’ll pay for them. The secret: fix as much as possible locally first, then send the smallest, most precise context the LLM needs. This guide gives a practical system you can adopt now:
- Local preflight: lint, unit tests, minimal reproducible example (MRE) generator.
- Prompt hygiene: diff-only prompts, test-driven prompts, and strict output formats.
- CI gating: only call the LLM from CI when pre-checks pass or when a focused, failing-test payload is published.
- Token-aware engineering: estimate tokens, calculate cost, and budget.
- Developer tooling & templates: pre-commit hooks, Python/Node scripts, GitHub Actions examples.
Follow this and you’ll cut wasted tokens, shorten review cycles, and produce higher-quality LLM outputs.
Why engineers waste tokens
Common anti-patterns:
- Dumping entire repositories into the prompt.
- Asking to "fix my code" without failing tests or clear error output.
- Sending code with unresolved syntax errors or missing imports.
- No preflight: asking the LLM first, then debugging its output manually.
Consequences: wasted tokens (money), longer iteration times, lower signal-to-noise from LLMs, and the risk of incorrect code being merged into prod.
The CyberDudeBivash 5-Step Workflow
1. Local preflight — lint + run tests + reproduce the error.
2. Minimize context — produce a minimal reproducible example (MRE).
3. Prompt for a patch — use a strict template asking for a patch only or diff only.
4. Validate — run the returned patch inside a sandbox/tests automatically.
5. CI gate & telemetry — only accept LLM-assisted changes when tests pass and the token-cost budget is respected.
Practical toolset
- Linters: flake8/pylint (Python), eslint (JS/TS).
- Formatters: black, prettier.
- Unit tests: pytest, unittest, jest.
- Local sandbox: Docker + docker-compose, or ephemeral VMs.
- Pre-commit: pre-commit hooks.
- Token estimation helper: small script (below).
- CI: GitHub Actions (examples later).
Affiliate picks (recommended — use our affiliate links on your site):
- JetBrains Fleet / IntelliJ (IDE productivity; affiliate link placeholder).
- GitHub Copilot (assist, but use after preflight).
- Replit / Gitpod (ephemeral dev sandboxes).
(Include affiliate disclosure on publish.)
Preflight scripts & pre-prompt checklist
Pre-prompt checklist
- Code compiles / lints locally (flake8/eslint)
- Unit tests reproduce the failing behavior (pytest/jest)
- Minimal reproducible example (MRE) created — unrelated code removed
- Expected vs. actual output logged (include the traceback)
- Token budget estimated for the prompt (see the calculator below)
- CI/CD gating strategy defined (where the LLM patch will be validated)
Minimal reproducible example (MRE) template
Create mre.py containing only:

- the function(s) under test
- the failing test case (assert)
- any minimal setup data (no large binary blobs)

Example (mre.py), a minimal sketch assuming a deliberately buggy add() for illustration:
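```python
# mre.py: only the function under test, one failing assert, minimal setup.
def add(a, b):
    return a - b  # bug under investigation: subtracts instead of adding


def test_add():
    assert add(2, 3) == 5  # fails: add(2, 3) currently returns -1
```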
Always include the test runner output (stack trace) with your prompt.
Prompt templates — be strict: ask for a diff only
Template: "Patch-only prompt"
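A patch-only template along these lines works well (a sketch; the file names and placeholders are illustrative):

```text
You are a code-repair assistant. Below is a minimal reproducible example
and the failing pytest output.

Task: return a unified diff that makes the failing test pass.
Rules:
- Only return the patch (unified diff). No explanations, no prose.
- Do not invent imports or APIs; use only what the files already provide.

--- mre.py ---
<contents of mre.py>

--- failing test output ---
<pytest traceback>
```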
Important: insist on "Only return the patch" — not explanations. That avoids paying for extra tokens and makes the response easy to apply programmatically.
Example — small Python patch flow
1. The developer reproduces the failing test locally (e.g., with pytest tests/test_add.py).
2. Build the MRE and include only add.py and tests/test_add.py in the prompt.
3. Send the patch-only prompt (above). The LLM returns a unified diff; see the sketch after this list.
4. Apply the patch and run the tests automatically in CI.
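The returned diff might look like this (a sketch, assuming the buggy add() from the MRE above lives in add.py):

```diff
--- a/add.py
+++ b/add.py
@@ -1,2 +1,2 @@
 def add(a, b):
-    return a - b
+    return a + b
```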
Pre-commit & local automation
Add a pre-commit hook that runs lint and tests before letting you call the LLM. A minimal .pre-commit-config.yaml sketch (pin rev to a current release):
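```yaml
repos:
  - repo: https://github.com/pycqa/flake8
    rev: 7.1.1          # example pin; use a current release
    hooks:
      - id: flake8
  - repo: local
    hooks:
      - id: pytest
        name: run unit tests
        entry: pytest -q
        language: system
        pass_filenames: false
```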
call-llm.sh (run only after the lint/tests gate). A sketch; the LLM client script name is a placeholder:
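```bash
#!/usr/bin/env bash
set -euo pipefail

# 1. Lint must be clean before anything else.
flake8 . || { echo "Fix lint errors before calling the LLM."; exit 1; }

# 2. The failure must reproduce; if tests pass there is nothing to fix.
if pytest -q; then
  echo "All tests pass; no LLM call needed."
  exit 0
fi

# 3. Check the token budget, then send the MRE (llm_client.py is hypothetical).
python estimate_tokens.py mre.py
python llm_client.py --patch-only --files mre.py
```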
CI pattern: GitHub Actions — only call LLM when tests reproduce AND MRE provided
.github/workflows/llm-assist.yml (a sketch; the LLM client step and secret name are placeholders):
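```yaml
name: llm-assist
on:
  workflow_dispatch:        # manual trigger keeps LLM calls deliberate

jobs:
  llm-assist:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: "Preflight: failing test must reproduce"
        run: |
          if pytest -q; then
            echo "Tests pass; nothing to fix." && exit 1
          fi

      - name: Require an MRE
        run: test -f mre/mre.py

      - name: Check token budget
        run: python estimate_tokens.py mre/mre.py

      - name: Call LLM (placeholder client script)
        env:
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
        run: python llm_client.py --patch-only --files mre/mre.py --out patch.diff

      - name: Validate returned patch
        run: python validate_patch.py patch.diff
```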
This enforces the full gate: tests must reproduce (preflight), an MRE must exist, the token budget is checked, and the LLM is only called from CI with securely stored keys.
Token estimation & cost calculator (simple, exact arithmetic)
Estimating tokens from characters
A practical rule: 1 token ≈ 4 characters of English (an approximation; use your model's tokenizer for exact counts).
Formula: estimated_tokens = ceil(total_chars / 4)
Example calculation (step-by-step):
Suppose your prompt (files + traces) is 42,372 characters long.
1. Divide by 4: 42,372 / 4 = 10,593.
2. Round up (if needed): estimated_tokens = 10,593 tokens.
Cost example
Assume model price = $0.02 per 1,000 tokens (example pricing used solely for illustration).
- Tokens = 10,593.
- Thousands of tokens = 10,593 / 1,000 = 10.593.
- Cost = 10.593 * $0.02 = $0.21186.
- Rounded to cents = $0.21.
(Every arithmetic step above computed explicitly.)
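The estimate_tokens.py helper referenced throughout needs only a few lines. A minimal sketch (the 3,000-token default budget is an assumption; tune it to your model and pricing):

```python
#!/usr/bin/env python3
"""Estimate prompt tokens via the 1 token ~= 4 characters heuristic."""
import math
import sys

BUDGET = 3000  # assumed per-prompt budget; adjust to your model/pricing


def estimate_tokens(text: str) -> int:
    # estimated_tokens = ceil(total_chars / 4)
    return math.ceil(len(text) / 4)


if __name__ == "__main__":
    total = 0
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            total += estimate_tokens(f.read())
    print(f"Estimated tokens: {total}")
    if total > BUDGET:
        sys.exit(f"Over budget ({total} > {BUDGET}); shrink the MRE.")
```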
Tip: keep prompts ≤ 2,000–3,000 tokens when possible to reduce cost and improve latency.
Smart prompt compression strategies
- Send the failure + a single-file MRE, not the whole repo.
- Remove comments, large whitespace, and long sample data.
- Send only the failing test and the relevant functions.
- Send diffs instead of full files. If you must send a whole file, trim it to the essential parts first.
- Use function signatures + types rather than full code when asking for algorithmic logic; see the sketch after this list.
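For example, signature-plus-types context like the following is often enough when you want algorithmic logic rather than a repair of existing code (the function here is hypothetical):

```python
# Signatures and types only; bodies elided to save tokens.
def dedupe_events(events: list[dict], key: str = "id") -> list[dict]:
    """Return events with duplicates (by `key`) removed, order preserved."""
    ...
```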
Prompt engineering patterns that save tokens
- Test-first prompt: send the failing test and ask for the minimal change that makes it pass.
- Diff-only prompt: provide the current file and the desired behavior; ask for a unified diff patch.
- Small-step prompt: ask for a single small change (e.g., a function fix) rather than an end-to-end rewrite.
- Strict format enforcement: “Return JSON only with fields {patch, tests_run, success}” — easier to parse and validate; see the example below.
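For instance, a strict-format response using those fields might look like this (values are illustrative):

```json
{
  "patch": "--- a/add.py\n+++ b/add.py\n@@ -1,2 +1,2 @@ ...",
  "tests_run": ["tests/test_add.py::test_add"],
  "success": true
}
```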
Validation harness — run returned patch automatically
validate_patch.py (conceptual): a sketch that applies the returned diff and keeps it only if the previously failing tests go green:
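```python
#!/usr/bin/env python3
"""Apply an LLM-returned unified diff, run the tests, roll back on failure."""
import subprocess
import sys


def validate(patch_path: str) -> bool:
    # Reject patches that do not apply cleanly.
    check = subprocess.run(["git", "apply", "--check", patch_path])
    if check.returncode != 0:
        print("Patch does not apply cleanly; rejecting.")
        return False
    subprocess.run(["git", "apply", patch_path], check=True)
    # Accept only if the test suite now passes; otherwise roll back.
    if subprocess.run(["pytest", "-q"]).returncode != 0:
        subprocess.run(["git", "checkout", "--", "."], check=True)
        return False
    return True


if __name__ == "__main__":
    sys.exit(0 if validate(sys.argv[1]) else 1)
```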
Use this in the CI step immediately after receiving the patch.
Defensive prompts & guardrails (reduce hallucinations)
- Ask the LLM not to invent imports or API calls. Provide the exact dependency list, or require the code to use only the project's existing imports.
- Request executable code only; require pytest to pass in CI.
- If the LLM returns explanations, automatically reject the response and re-run with "Only return the patch" enforced.
Common real-world patterns & examples
Pattern: Runtime type errors in Python
- Preflight: run mypy/pytest.
- Prompt: include the failing traceback and the function signature.
- Patch: the LLM suggests type coercion or validation.
- Validation: run tests — success -> merge.
Pattern: Frontend CSS/JS regressions
- Preflight: run npm run test, eslint, and visual regression (Percy) or unit tests.
- Prompt: include the failing test and a minimal component snippet.
- Patch: the LLM returns a specific component diff.
FAQ
Q: When should I NOT use an LLM for code?
A: Don’t use it to fix failing tests if you can’t produce an MRE, or when code involves secrets/crypto primitives you cannot validate locally. Use LLMs more for design/boilerplate than for security-critical code unless heavily validated.
Q: How often should I call an LLM?
A: Prefer fewer, highly focused calls. Use local automation to reduce repetitive prompts.
Q: What about using LLMs as pair-programming assistants?
A: Great, but keep the same disciplines: run tests locally first, then ask LLM to suggest concise changes.
Metrics & KPIs to track
- Tokens consumed per merged PR (baseline vs. post-adoption).
- % of LLM-assisted patches that pass CI on first application.
- Mean time to first green build (MTTFGB) for LLM-assisted PRs.
- Token cost saved per sprint.
Integration checklist
- Pre-commit hooks installed and enforced.
- MRE template created in /mre/ and required for LLM requests.
- CI workflow includes estimate_tokens.py and validate_patch.py.
- Token budget per PR set and monitored.
- Post-merge telemetry enabled (tokens/PR, success rate).
#CyberDudeBivash #LLM #PromptEngineering #DevOps #CI #Precommit #TokenEfficiency #AIforDev #SoftwareEngineering #Productivity #MRE #Testing
Final quick checklist

- Run pre-commit and pytest.
- Create mre.py capturing the failing test.
- Run estimate_tokens.py to verify the budget.
- Trigger the llm-assist CI workflow to call the LLM.
- Validate the returned patch automatically (validate_patch.py).
- Merge only if CI is green and the token budget is respected.