API Documentation
Compress prompts programmatically via our REST API or npm SDK.
Quick start
1. Get your API key
Sign up (free), then generate an API key from your dashboard.
2. Compress via API
curl -X POST https://tokenshrink.com/api/compress \
-H "Content-Type: application/json" \
-H "x-api-key: ts_live_your_key_here" \
-d '{
"text": "Your long prompt text here...",
"domain": "auto"
}'

3. Or use the SDK (v2.0)
npm install tokenshrink
import { compress } from 'tokenshrink';
// Compress a prompt — runs locally, no API call needed
const result = compress('Your long prompt...');
console.log(result.compressed);
console.log(result.stats.tokensSaved); // Real token savings
console.log(result.stats.originalTokens); // Original token count
console.log(result.stats.totalCompressedTokens); // Compressed token count
// Optional: plug in a real tokenizer for exact counts
import { encode } from 'gpt-tokenizer';
const result2 = compress('Your long prompt...', {
tokenizer: (text) => encode(text).length
});
// Use with any LLM provider
import OpenAI from 'openai';
const openai = new OpenAI();
const res = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'system', content: result.compressed }],
});

API Reference
POST /api/compress
Headers
Content-Type: application/json
x-api-key: ts_live_... (optional for anonymous use; required for programmatic API access)
Request body
{
"text": "string (required) — the text to compress",
"domain": "string (optional) — auto|code|medical|legal|business"
}

Response
{
"compressed": "string — full compressed text with Rosetta header",
"rosetta": "string — just the decoder header",
"stats": {
"originalWords": 150,
"compressedWords": 42,
"rosettaWords": 18,
"totalCompressedWords": 60,
"originalTokens": 168,
"compressedTokens": 45,
"rosettaTokens": 22,
"totalCompressedTokens": 67,
"ratio": 2.5,
"tokensSaved": 101,
"dollarsSaved": 0.05,
"strategy": "domain",
"domain": "code",
"tokenizerUsed": "built-in"
}
}

GET /api/usage
Returns your current usage stats, monthly history, and recent compressions. Requires authentication (session or API key).
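The endpoint works with any HTTP client. A minimal sketch (the `usageRequest` helper name is ours; only the URL and the x-api-key header are documented):

```javascript
// Hypothetical helper that builds the fetch arguments for GET /api/usage.
// Only the endpoint URL and the x-api-key header come from the docs above;
// the helper itself is illustrative.
function usageRequest(apiKey) {
  return {
    url: 'https://tokenshrink.com/api/usage',
    options: { method: 'GET', headers: { 'x-api-key': apiKey } },
  };
}

// Usage (requires network access and a valid key):
// const { url, options } = usageRequest('ts_live_your_key_here');
// const usage = await fetch(url, options).then((r) => r.json());
```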
Rate limits
Requests per minute: 10
Words per request: 100,000
Monthly limit: Unlimited
Price: Free
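If you may exceed 10 requests per minute, retry with backoff on the client. The docs do not specify the error returned when the limit is hit, so this sketch assumes a standard HTTP 429 status; the `compressWithRetry` helper and its injectable `doFetch` parameter are ours:

```javascript
// Client-side backoff sketch for the 10 requests/minute limit.
// ASSUMPTION: the API returns HTTP 429 when rate limited (not confirmed
// by the docs). `doFetch` is injectable so the retry logic is testable.
async function compressWithRetry(body, apiKey, { doFetch = fetch, retries = 3, delayMs = 6000 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const res = await doFetch('https://tokenshrink.com/api/compress', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'x-api-key': apiKey },
      body: JSON.stringify(body),
    });
    if (res.status !== 429) return res.json();
    // Wait longer before each successive retry
    await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
  }
  throw new Error('Still rate limited after retries');
}
```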
Token counting
v2.0 uses real token counts instead of word estimates. By default, TokenShrink uses a precomputed lookup table based on cl100k_base (GPT-4). For exact counts with your specific model, pass a custom tokenizer:
import { compress } from 'tokenshrink';
import { encode } from 'gpt-tokenizer';
const result = compress(text, {
tokenizer: (text) => encode(text).length
});

Compression domains
Set domain to optimize compression for specific content types. Default is auto.
auto: Automatically detects the best strategy
code: Programming and technical documentation
medical: Medical records, clinical notes
legal: Contracts, legal documents
business: Business communications, reports
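When calling the REST endpoint directly, the domain goes in the request body alongside the text. A sketch using only the two documented fields (the sample text is illustrative):

```javascript
// Request body for domain-specific compression. Only the documented
// "text" and "domain" fields are used.
const body = {
  text: 'WHEREAS the parties agree to the following terms and conditions...',
  domain: 'legal', // auto | code | medical | legal | business
};

// Send it (requires network access and a valid key):
// const result = await fetch('https://tokenshrink.com/api/compress', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json', 'x-api-key': 'ts_live_...' },
//   body: JSON.stringify(body),
// }).then((r) => r.json());
```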