Skimly API Documentation

Reduce LLM costs with content blobbing and Smart Compression Timing. Upload large context once, reference it with lightweight pointers in chat requests, and let Skimly decide when compression actually pays off.

New: Official SDKs Available

Get started quickly with our official SDKs for Node.js and Python, or use raw HTTP if you prefer. Both approaches are fully supported.

Quick Start

Node.js / TypeScript

npm install @skimly/sdk

import { fromEnv } from '@skimly/sdk'
const client = fromEnv()

const resp = await client.messages.create({
  provider: 'openai',
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
})

Python

pip install skimly

from skimly import Skimly
client = Skimly.from_env()

resp = client.chat(
    provider="anthropic",
    model="claude-3-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)

Core Concepts

1. Blob Once

Upload large, rarely-changing content (policies, docs, threads) to get a blob ID.

2. Reference in Chat

Use the blob ID as a pointer in chat requests instead of sending the full content.

3. Save Tokens

Each request carries a short pointer instead of the full text, so token usage and cost drop while the model still has access to the full context.
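
The three steps above can be illustrated with a small local sketch. This is not the Skimly SDK: `BlobStore`, the `[blob:...]` pointer syntax, and the 4-characters-per-token estimate are all hypothetical stand-ins, used only to show why a pointer is cheaper than resending the content.

```python
import hashlib


class BlobStore:
    """In-memory stand-in for a blob service (hypothetical, not the Skimly API)."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: str) -> str:
        # Derive the ID from the content so re-uploading identical text yields the same ID.
        blob_id = "b_" + hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
        self._blobs[blob_id] = content
        return blob_id

    def get(self, blob_id: str) -> str:
        return self._blobs[blob_id]


def rough_tokens(text: str) -> int:
    # Crude estimate (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


store = BlobStore()
policy = "Refunds are accepted within 30 days of purchase. " * 200

# 1. Blob once.
blob_id = store.put(policy)

# 2. Reference in chat (the pointer format here is an assumption).
question = "Can I return this item?"
full_message = {"role": "user", "content": policy + "\n" + question}
pointer_message = {"role": "user", "content": f"[blob:{blob_id}]\n{question}"}

# 3. Save tokens.
print(rough_tokens(full_message["content"]), "tokens inline")
print(rough_tokens(pointer_message["content"]), "tokens with a pointer")
```

The saving repeats on every request that would otherwise resend the same content, which is why rarely-changing material is the best candidate for blobbing.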

Smart Compression Technology

Skimly goes beyond simple size thresholds with AI-powered compression that understands your content and workflow:

Smart Timing

Predicts when users will access content based on tool context and content analysis. For example, build logs are compressed aggressively, while error details are preserved.

Content Analysis

Detects patterns in build logs, error stacks, diffs, and other development content to make intelligent compression decisions.
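
As a rough illustration of pattern-based content detection, the sketch below labels text as a diff, an error stack, or a build log using a few regular expressions. The patterns, the `classify` function, and the `POLICY` table are hypothetical; only the build-log and error-stack policies come from the description above, and the other two entries are placeholders. Skimly's actual analysis is not documented here and is presumably more involved.

```python
import re

# Hypothetical patterns; a production classifier would need many more cases.
# Order matters: an error stack inside a build log is labeled "error_stack"
# so that error details are preserved.
PATTERNS = [
    ("diff", re.compile(r"^(diff --git |@@ -\d+(,\d+)? \+\d+(,\d+)? @@)", re.MULTILINE)),
    ("error_stack", re.compile(
        r"^(Traceback \(most recent call last\):|\s+at \S+ \(.*:\d+:\d+\))", re.MULTILINE)),
    ("build_log", re.compile(r"^(\[\d+/\d+\]|npm (WARN|notice)|Compiling |> Task :)", re.MULTILINE)),
]

# Policy per label. "diff" and "other" are assumptions made for this sketch.
POLICY = {
    "build_log": "compress aggressively",
    "error_stack": "preserve",
    "diff": "compress lightly",
    "other": "apply size threshold",
}


def classify(text: str) -> str:
    """Return the first label whose pattern matches, or "other"."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"


sample = "[1/3] Compiling main.c\n[2/3] Compiling util.c\n[3/3] Linking app"
label = classify(sample)
print(label, "->", POLICY[label])
```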

Cost Optimization

Prevents negative savings by estimating dereference costs (the tokens spent if the full content has to be fetched back later) before compressing. Only content with a positive expected return is compressed.
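
One way to reason about negative savings is a break-even check: compress only when the tokens saved outweigh the expected cost of fetching the full content back. The model below is a simplified sketch under stated assumptions (a single fetch-back probability, and a fetch that costs the full original token count once); the `Estimate` fields and function names are hypothetical and do not describe Skimly's internal formula.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    original_tokens: int      # tokens if the content is sent inline
    compressed_tokens: int    # tokens for the summary or pointer sent instead
    deref_probability: float  # estimated chance the full content is fetched later (0..1)
    expected_turns: int = 1   # later requests that would otherwise carry the content


def expected_net_savings(e: Estimate) -> float:
    # Tokens saved across the turns that carry the compressed form...
    saved = (e.original_tokens - e.compressed_tokens) * e.expected_turns
    # ...minus the expected cost of one fetch-back of the full content.
    deref_cost = e.deref_probability * e.original_tokens
    return saved - deref_cost


def should_compress(e: Estimate, min_savings: float = 0.0) -> bool:
    return expected_net_savings(e) > min_savings


build_log = Estimate(original_tokens=8000, compressed_tokens=400, deref_probability=0.05)
error_trace = Estimate(original_tokens=600, compressed_tokens=500, deref_probability=0.9)

print("build log:", should_compress(build_log))      # large saving, rarely fetched back
print("error trace:", should_compress(error_trace))  # small saving, usually fetched back
```

In this model the error trace is left uncompressed because the likely fetch-back costs more than the small saving from compressing it.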

Session Tracking

Tracks compression decisions and calculates realized ROI across user sessions, reporting actual versus predicted savings.
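
Comparing predicted and actual savings can be pictured as a per-session ledger. The sketch below is illustrative only: `Decision`, `SessionLedger`, and the sample numbers are hypothetical and are not taken from Skimly's metrics.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Decision:
    label: str
    predicted_savings: int  # tokens the decision was expected to save
    actual_savings: int     # tokens actually saved, after any later fetch-backs


@dataclass
class SessionLedger:
    decisions: List[Decision] = field(default_factory=list)

    def record(self, label: str, predicted: int, actual: int) -> None:
        self.decisions.append(Decision(label, predicted, actual))

    def totals(self) -> Tuple[int, int]:
        predicted = sum(d.predicted_savings for d in self.decisions)
        actual = sum(d.actual_savings for d in self.decisions)
        return predicted, actual

    def realized_ratio(self) -> Optional[float]:
        # Actual savings as a fraction of predicted; None when nothing was predicted.
        predicted, actual = self.totals()
        return actual / predicted if predicted else None


ledger = SessionLedger()
ledger.record("build_log", predicted=7200, actual=7600)
ledger.record("error_stack", predicted=500, actual=-300)  # content was fetched back later
print(ledger.totals(), ledger.realized_ratio())
```

A ratio well below 1.0 would suggest the predictions are too optimistic for that session's workload.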

Documentation Sections

Migration Guide

Switch from OpenAI/Anthropic to Skimly with minimal code changes.

Smart Compression

Advanced compression technology that understands your content and workflow.

Features: Timing predictions, content analysis, cost optimization, session tracking

Production Ready

Enterprise-grade features with easy monitoring and rollback options.

Includes: Usage tracking, cost analysis, compression metrics, audit logs

Key Endpoints

Core Operations

  • POST /v1/chat - Chat completions with compression
  • POST /v1/blobs - Create content blobs
  • GET /v1/fetch - Retrieve blob content
  • POST /v1/transform - Smart content compression

Management

  • GET /v1/keys - List API keys
  • POST /v1/keys - Create new API key
  • DELETE /v1/keys/{id} - Revoke API key
  • GET /v1/config - System configuration
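
For raw HTTP use, the endpoint list above can be captured in a small route table. The sketch below only builds request objects and never sends them. The host `https://api.example.com` is a placeholder, the `Authorization: Bearer` header is an assumed auth scheme, and request bodies are not specified here, so consult the API Reference for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real Skimly URL

# Method and path for each endpoint listed above.
ROUTES = {
    "chat": ("POST", "/v1/chat"),
    "create_blob": ("POST", "/v1/blobs"),
    "fetch": ("GET", "/v1/fetch"),
    "transform": ("POST", "/v1/transform"),
    "list_keys": ("GET", "/v1/keys"),
    "create_key": ("POST", "/v1/keys"),
    "revoke_key": ("DELETE", "/v1/keys/{id}"),
    "config": ("GET", "/v1/config"),
}


def build_request(name, api_key, body=None, **path_params):
    """Build (but do not send) a urllib request for a named endpoint."""
    method, path = ROUTES[name]
    url = BASE_URL + path.format(**path_params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}  # assumed auth scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


req = build_request("revoke_key", "YOUR_API_KEY", id="key_123")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send the request once the real host and credentials are filled in.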

Ready to get started?

Start with our Quickstart Guide to get up and running in minutes, then explore the API Reference for complete documentation of all features.