◈ Token Compression Engine
Your prompts are verbose. Your models don't need them to be.
TokenShrink compresses prompts — same results, fewer tokens. Works with Claude, GPT, Gemini, Ollama — any LLM. Open source ↗
Typical savings: 15-35% on system prompts · Works with 8 AI providers · 51 tests passing
Works with every LLM provider
◎OpenAI
◆Anthropic
△Google AI
✦Mistral
⊙Ollama
□Any LLM
3.6M tokens saved
100% open source
< 200ms processing time
Compatible with all LLMs
How It Works
01 Paste your prompt
System messages, user prompts, documents: anything you send to an LLM.
02 We compress it
Our engine replaces verbose phrases with short codes and prepends a tiny decoder header.
03 Use fewer tokens
Use the compressed version in your API calls. Same AI quality, fewer tokens.
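To make step 02 concrete, here is a minimal sketch of the general idea: swap known verbose phrases for short codes, then prepend a header that tells the model how to read them. This is an illustration only, not TokenShrink's actual engine. The phrase table, the `§` code format, the `[decode: ...]` header layout, and the `sketchCompress` name are all assumptions made up for this example.

```javascript
// Hypothetical phrase table. These entries and the "§n" codes are illustrative,
// not TokenShrink's real dictionary.
const PHRASES = [
  ['in order to', '§1'],
  ['make sure that', '§2'],
  ['as much as possible', '§3'],
];

// Escape a string so it can be matched literally inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace each known phrase with its code (case-insensitively) and prepend
// a decoder header listing only the codes that were actually used.
function sketchCompress(prompt) {
  let body = prompt;
  const used = [];
  for (const [phrase, code] of PHRASES) {
    const pattern = new RegExp(escapeRegExp(phrase), 'gi');
    const replaced = body.replace(pattern, code);
    if (replaced !== body) {
      used.push(`${code}=${phrase}`);
      body = replaced;
    }
  }
  // Nothing matched, so return the prompt unchanged and skip the header.
  if (used.length === 0) return prompt;
  return `[decode: ${used.join('; ')}]\n${body}`;
}

const example = 'In order to help, make sure that you answer in order to be clear.';
console.log(sketchCompress(example));
```

One design point worth noting: the header itself costs tokens, so a scheme like this only pays off when a phrase repeats often enough to outweigh its header entry. A real engine would presumably measure that with the target model's tokenizer rather than assume it.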
Drop-in SDK
Two lines of code. Automatic compression on every API call.
app.js
import { compress } from 'tokenshrink';
import OpenAI from 'openai';

// Compress your system prompt (longPrompt is your existing prompt string)
const { compressed, stats } = compress(longPrompt);
console.log(`Saved ${stats.tokensSaved} tokens`);

// Use with any LLM: OpenAI, Anthropic, local models
const openai = new OpenAI();
const res = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'system', content: compressed }],
});

npm install tokenshrink

Compress your prompts for free
No account required. Open source. Compress and ship.
$ npm install tokenshrink — save tokens instantly