◈ Token Compression Engine


Your prompts are verbose. Your models don't need them to be.
TokenShrink compresses prompts — same results, fewer tokens. Open source.

1.4M tokens saved · 100% open source · < 200ms processing time · compatible with all LLMs

How It Works

01

Paste your prompt

System messages, user prompts, documents — anything you send to an LLM.

02

We compress it

Our engine replaces verbose phrases with short codes and prepends a tiny decoder header.

03

Use fewer tokens

Use the compressed version in your API calls. Same AI quality, fewer tokens.
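The substitution idea behind these steps can be sketched in a few lines. Note this is a hypothetical illustration, not tokenshrink's actual dictionary or header format (those are internal to the library): verbose phrases map to short codes, and a small decoder header prepended to the prompt tells the model how to expand them, so no extra API call is needed.

```javascript
// Hypothetical phrase dictionary — illustrative only, not tokenshrink's real one.
const dictionary = {
  'Please provide a detailed explanation of': '§A',
  'It is important to note that': '§B',
};

function shrink(prompt) {
  let compressed = prompt;
  const used = [];
  for (const [phrase, code] of Object.entries(dictionary)) {
    if (compressed.includes(phrase)) {
      // Replace every occurrence of the verbose phrase with its short code.
      compressed = compressed.split(phrase).join(code);
      used.push(`${code}=${phrase}`);
    }
  }
  // The decoder header lets the model expand the codes on its own,
  // so the compressed prompt stays self-describing.
  const header = used.length ? `Codes: ${used.join('; ')}\n` : '';
  return header + compressed;
}

console.log(shrink('Please provide a detailed explanation of closures.'));
```

Because the header is shared across all substituted phrases, savings grow with prompt length: the longer and more repetitive the prompt, the more the fixed header cost amortizes.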

Drop-in SDK

Two lines of code. Automatic compression on every API call.

app.js
import { compress } from 'tokenshrink';
import OpenAI from 'openai';

// Compress your system prompt
const { compressed, stats } = compress(longPrompt);
console.log(`Saved ${stats.tokensSaved} tokens`);

// Use with any LLM — OpenAI, Anthropic, local models
const openai = new OpenAI();
const res = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'system', content: compressed }],
});
npm install tokenshrink

Compress your prompts for free

No account required. Open source. Compress and ship.

$ npm install tokenshrink