API Documentation

Compress prompts programmatically via our REST API or npm SDK.

Quick start

1. Get your API key

Sign up (free), then generate an API key from your dashboard.

2. Compress via API

curl -X POST https://tokenshrink.com/api/compress \
  -H "Content-Type: application/json" \
  -H "x-api-key: ts_live_your_key_here" \
  -d '{
    "text": "Your long prompt text here...",
    "domain": "auto"
  }'
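The same request can be made from Node (18+, where fetch is built in) without curl. A minimal sketch mirroring the curl example above; the key and prompt are placeholders:

```typescript
// Endpoint and header names are taken from the curl example above.
const API_URL = 'https://tokenshrink.com/api/compress';

// Build the request init for POST /api/compress.
function buildCompressRequest(text: string, apiKey: string, domain = 'auto') {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': apiKey,
    },
    body: JSON.stringify({ text, domain }),
  };
}

async function compressRemote(text: string, apiKey: string) {
  const res = await fetch(API_URL, buildCompressRequest(text, apiKey));
  if (!res.ok) throw new Error(`compress failed: ${res.status}`);
  return res.json(); // { compressed, rosetta, stats } — see API Reference below
}
```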

3. Or use the SDK (v2.0)

npm install tokenshrink

import { compress } from 'tokenshrink';

// Compress a prompt — runs locally, no API call needed
const result = compress('Your long prompt...');
console.log(result.compressed);
console.log(result.stats.tokensSaved);       // Real token savings
console.log(result.stats.originalTokens);     // Original token count
console.log(result.stats.totalCompressedTokens); // Compressed token count

// Optional: plug in a real tokenizer for exact counts
import { encode } from 'gpt-tokenizer';
const result2 = compress('Your long prompt...', {
  tokenizer: (text) => encode(text).length
});

// Use with any LLM provider
import OpenAI from 'openai';
const openai = new OpenAI();
const res = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'system', content: result.compressed }],
});

API Reference

POST /api/compress

Headers

Content-Type: application/json
x-api-key: ts_live_... (optional for anonymous use; required for programmatic API access)

Request body

{
  "text": "string (required) — the text to compress",
  "domain": "string (optional) — auto|code|medical|legal|business"
}

Response

{
  "compressed": "string — full compressed text with Rosetta header",
  "rosetta": "string — just the decoder header",
  "stats": {
    "originalWords": 150,
    "compressedWords": 42,
    "rosettaWords": 18,
    "totalCompressedWords": 60,
    "originalTokens": 168,
    "compressedTokens": 45,
    "rosettaTokens": 22,
    "totalCompressedTokens": 67,
    "ratio": 2.5,
    "tokensSaved": 101,
    "dollarsSaved": 0.05,
    "strategy": "domain",
    "domain": "code",
    "tokenizerUsed": "built-in"
  }
}
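For typed consumers, the response shape above can be transcribed into TypeScript interfaces. A sketch based on the example response (not an official type export from the SDK):

```typescript
// Shape of the POST /api/compress response, field names taken from
// the example above.
interface CompressStats {
  originalWords: number;
  compressedWords: number;
  rosettaWords: number;
  totalCompressedWords: number;
  originalTokens: number;
  compressedTokens: number;
  rosettaTokens: number;
  totalCompressedTokens: number;
  ratio: number;          // originalTokens / totalCompressedTokens
  tokensSaved: number;    // originalTokens - totalCompressedTokens
  dollarsSaved: number;
  strategy: string;       // e.g. "domain"
  domain: string;         // e.g. "code"
  tokenizerUsed: string;  // e.g. "built-in"
}

interface CompressResponse {
  compressed: string;  // full compressed text with Rosetta header
  rosetta: string;     // just the decoder header
  stats: CompressStats;
}
```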
GET /api/usage

Returns your current usage stats, monthly history, and recent compressions. Requires authentication (session or API key).
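Calling the usage endpoint with an API key can be sketched the same way (the x-api-key header name is taken from the compress example; the exact response fields are whatever the endpoint returns):

```typescript
const USAGE_URL = 'https://tokenshrink.com/api/usage';

// Build the request init for GET /api/usage, authenticated by API key.
function buildUsageRequest(apiKey: string) {
  return { method: 'GET', headers: { 'x-api-key': apiKey } };
}

async function getUsage(apiKey: string) {
  const res = await fetch(USAGE_URL, buildUsageRequest(apiKey));
  if (!res.ok) throw new Error(`usage request failed: ${res.status}`);
  return res.json(); // usage stats, monthly history, recent compressions
}
```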

Rate limits

Requests per minute: 10
Words per request: 100,000
Monthly limit: Unlimited
Price: Free
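At 10 requests per minute, batch jobs need client-side pacing. A minimal sketch (the limit comes from the table above; the helper itself is illustrative, not part of the SDK):

```typescript
// At 10 requests/minute, leave at least 6 seconds between calls.
const REQUESTS_PER_MINUTE = 10;
const MIN_INTERVAL_MS = 60_000 / REQUESTS_PER_MINUTE; // 6000 ms

// Given the timestamp of the previous request, return how long to
// wait before the next one is allowed.
function delayBeforeNext(lastRequestAt: number, now: number): number {
  const elapsed = now - lastRequestAt;
  return Math.max(0, MIN_INTERVAL_MS - elapsed);
}

// Run async tasks sequentially, sleeping as needed to stay under the limit.
async function paced<T>(tasks: (() => Promise<T>)[]): Promise<T[]> {
  const results: T[] = [];
  let last = 0;
  for (const task of tasks) {
    const wait = delayBeforeNext(last, Date.now());
    if (wait > 0) await new Promise((r) => setTimeout(r, wait));
    last = Date.now();
    results.push(await task());
  }
  return results;
}
```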

Token counting

v2.0 uses real token counts instead of word estimates. By default, TokenShrink uses a precomputed lookup table based on cl100k_base (GPT-4). For exact counts with your specific model, pass a custom tokenizer:

import { compress } from 'tokenshrink';
import { encode } from 'gpt-tokenizer';

const result = compress(text, {
  tokenizer: (text) => encode(text).length
});
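Any (text: string) => number function works as a tokenizer. If you don't want the gpt-tokenizer dependency, a coarse stand-in based on the common rule of thumb of roughly 4 characters per token for English text can be used, at the cost of accuracy:

```typescript
// Rough stand-in tokenizer: ~4 characters per token is a common
// rule of thumb for English text; counts will be approximate.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

// Pass it to compress() exactly like the gpt-tokenizer example above:
//   const result = compress(text, { tokenizer: approxTokens });
```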

Compression domains

Set domain to optimize compression for specific content types. Default is auto.

auto: Automatically detects the best strategy
code: Programming and technical documentation
medical: Medical records, clinical notes
legal: Contracts, legal documents
business: Business communications, reports
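If you prefer to choose an explicit domain client-side rather than rely on auto, a keyword heuristic is one option. A purely illustrative sketch; the keyword lists are examples, not TokenShrink's actual auto-detection logic (which is not documented):

```typescript
// Illustrative domain picker. Counts keyword hits per domain and
// returns the best match, falling back to 'auto' when nothing matches.
type Domain = 'auto' | 'code' | 'medical' | 'legal' | 'business';

const DOMAIN_HINTS: Record<Exclude<Domain, 'auto'>, string[]> = {
  code: ['function', 'import', 'class', 'return', 'const'],
  medical: ['patient', 'diagnosis', 'clinical', 'dosage'],
  legal: ['hereinafter', 'contract', 'party', 'liability'],
  business: ['quarterly', 'revenue', 'stakeholder', 'forecast'],
};

function pickDomain(text: string): Domain {
  const lower = text.toLowerCase();
  let best: Domain = 'auto';
  let bestHits = 0;
  for (const [domain, hints] of Object.entries(DOMAIN_HINTS)) {
    const hits = hints.filter((h) => lower.includes(h)).length;
    if (hits > bestHits) {
      bestHits = hits;
      best = domain as Domain;
    }
  }
  return best;
}
```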