Skimly API Documentation
Reduce LLM costs with intelligent content blobbing and Smart Compression Timing. Upload large context once, reference it with lightweight pointers in chat requests, and let AI optimize compression decisions.
Get started quickly with our official SDKs for Node.js and Python, or use raw HTTP if you prefer. Both approaches are fully supported.
Quick Start
Node.js / TypeScript
    npm install @skimly/sdk

    import { fromEnv } from '@skimly/sdk'

    const client = fromEnv()
    const resp = await client.messages.create({
      provider: 'openai',
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: 'Hello!' }]
    })

View full quickstart →
Python
    pip install skimly

    from skimly import Skimly

    client = Skimly.from_env()
    resp = client.chat(
        provider="anthropic",
        model="claude-3-sonnet",
        messages=[{"role": "user", "content": "Hello!"}]
    )

View full quickstart →
Core Concepts
1. Blob Once
Upload large, rarely-changing content (policies, docs, threads) to get a blob ID.
2. Reference in Chat
Use the blob ID as a pointer in chat requests instead of sending the full content.
3. Save Tokens
Each request carries a short pointer instead of the full content, which reduces token usage and cost while maintaining full context.
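The three steps above can be illustrated with a small local model. This is only a sketch of the idea, not Skimly's implementation: `LocalBlobStore`, the `b_` ID format, the `[blob:...]` pointer syntax, and the four-characters-per-token estimate are all assumptions made for illustration.

```python
import hashlib


class LocalBlobStore:
    """In-memory stand-in for a blob service (hypothetical, not the Skimly API)."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: str) -> str:
        # Derive a short, deterministic ID from the content so that
        # storing the same text twice yields the same pointer.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        blob_id = "b_" + digest[:12]
        self._blobs[blob_id] = content
        return blob_id

    def get(self, blob_id: str) -> str:
        return self._blobs[blob_id]


def rough_tokens(text: str) -> int:
    # Crude estimate (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


store = LocalBlobStore()
policy = "Refunds are accepted within 30 days of purchase. " * 200

# 1. Blob once: store the large, rarely-changing content.
blob_id = store.put(policy)

# 2. Reference in chat: the message carries the pointer, not the policy text.
question = "Can I return an item after 45 days?"
inline_message = policy + "\n" + question
pointer_message = f"[blob:{blob_id}]\n{question}"  # assumed pointer syntax

# 3. Save tokens: compare the estimated size of both messages.
print(rough_tokens(inline_message), rough_tokens(pointer_message))
```

The saving grows with the size of the stored content and with the number of requests that reuse the same pointer.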
Smart Compression Technology
Skimly goes beyond simple size thresholds with AI-powered compression that understands your content and workflow:
Smart Timing
Predicts when users will access content based on tool context and content analysis. Build logs are compressed aggressively, while error details are preserved.
Content Analysis
Detects patterns in build logs, error stacks, diffs, and other development content to make intelligent compression decisions.
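Skimly's actual detection logic is not described here, so the following is only a rough sketch of how pattern-based content classification might look. The regular expressions, the `classify` function, and the labels are hypothetical, and the `POLICY` levels for diffs and unrecognized content are assumptions; only the treatment of build logs and error details follows the text above.

```python
import re

# Hypothetical heuristics. Order matters because the first match wins.
PATTERNS = [
    ("diff", re.compile(
        r"^(diff --git |@@ -\d+(,\d+)? \+\d+(,\d+)? @@)", re.MULTILINE)),
    ("error_stack", re.compile(
        r"^(Traceback \(most recent call last\):|\s+at \S+ \(.*:\d+:\d+\))",
        re.MULTILINE)),
    ("build_log", re.compile(
        r"^(npm (WARN|notice|info)|\[\d+/\d+\]|Compiling |> Task :)",
        re.MULTILINE)),
]

# Build logs are compressed aggressively and error details are preserved;
# the other two levels are placeholders chosen for this example.
POLICY = {
    "build_log": "aggressive",
    "error_stack": "preserve",
    "diff": "moderate",
    "other": "moderate",
}


def classify(text: str) -> str:
    """Return the label of the first pattern found in the text."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"


sample = "[1/4] Resolving packages...\n[2/4] Fetching packages..."
label = classify(sample)
print(label, POLICY[label])
```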
Cost Optimization
Prevents negative savings by analyzing dereference (deref) costs before compression. Compresses only content that provides real ROI.
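One way to reason about negative savings is a simple break-even check. The cost model below is an assumption made for illustration, not Skimly's formula: every request saves the difference between the content and its pointer, and with some probability the content is dereferenced and its full token cost is paid anyway. The function and parameter names are hypothetical.

```python
def net_token_savings(content_tokens, pointer_tokens, requests, deref_probability):
    """Expected tokens saved by sending a pointer instead of the content.

    Assumed model: each request saves (content_tokens - pointer_tokens),
    and each request has a deref_probability chance of fetching the
    content back, which costs content_tokens.
    """
    saved = requests * (content_tokens - pointer_tokens)
    deref_cost = requests * deref_probability * content_tokens
    return saved - deref_cost


def should_compress(content_tokens, pointer_tokens, requests, deref_probability):
    # Compress only when the expected net saving is positive.
    return net_token_savings(
        content_tokens, pointer_tokens, requests, deref_probability) > 0


# Large, rarely dereferenced content: worth compressing.
print(should_compress(5000, 20, requests=10, deref_probability=0.1))

# Small content that is dereferenced half the time: a net loss.
print(should_compress(30, 20, requests=5, deref_probability=0.5))
```

Under this model, small content with a high dereference rate is left uncompressed, because the pointer overhead plus fetch cost exceeds what the pointer saves.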
Session Tracking
Tracks compression decisions and calculates real ROI across user sessions. Provides an honest assessment of actual versus predicted performance.
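Comparing predicted and actual savings per session could be modeled as below. This is a minimal sketch under stated assumptions: `Decision`, `SessionTracker`, and the `accuracy` ratio are hypothetical names invented for this example and do not describe Skimly's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    predicted_saved: int  # tokens the compressor expected to save
    actual_saved: int     # tokens saved once dereferences are accounted for


@dataclass
class SessionTracker:
    decisions: list = field(default_factory=list)

    def record(self, predicted_saved: int, actual_saved: int) -> None:
        self.decisions.append(Decision(predicted_saved, actual_saved))

    def summary(self) -> dict:
        predicted = sum(d.predicted_saved for d in self.decisions)
        actual = sum(d.actual_saved for d in self.decisions)
        return {
            "predicted": predicted,
            "actual": actual,
            # With a positive prediction, a ratio below 1.0 means the
            # prediction was optimistic. None when nothing was predicted.
            "accuracy": actual / predicted if predicted else None,
        }


tracker = SessionTracker()
tracker.record(predicted_saved=1000, actual_saved=800)
tracker.record(predicted_saved=500, actual_saved=-100)  # a decision that lost tokens
print(tracker.summary())
```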
Documentation Sections
Quickstart Guide
Get up and running in minutes with SDK examples and cURL commands.
API Reference
Complete endpoint documentation for all features including Smart Compression.
Webhooks
Set up real-time notifications for your integration.
Migration Guide
Switch from OpenAI/Anthropic to Skimly with minimal code changes.
Smart Compression
Advanced compression technology that understands your content and workflow.
Production Ready
Enterprise-grade features with easy monitoring and rollback options.
Key Endpoints
Core Operations
POST /v1/chat - Chat completions with compression
POST /v1/blobs - Create content blobs
GET /v1/fetch - Retrieve blob content
POST /v1/transform - Smart content compression
Management
GET /v1/keys - List API keys
POST /v1/keys - Create new API key
DELETE /v1/keys/{id} - Revoke API key
GET /v1/config - System configuration
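For raw HTTP use, the endpoints listed above can be collected in a small lookup table. The sketch below only builds request objects and sends nothing. The methods and paths come from the list above; the base URL is a placeholder, the bearer-token `Authorization` header is an assumed auth scheme, and `build_request` is a hypothetical helper. Request bodies are not specified here, so check the API Reference before sending anything.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real Skimly host

# Method and path for each endpoint listed above.
ENDPOINTS = {
    "chat": ("POST", "/v1/chat"),
    "create_blob": ("POST", "/v1/blobs"),
    "fetch": ("GET", "/v1/fetch"),
    "transform": ("POST", "/v1/transform"),
    "list_keys": ("GET", "/v1/keys"),
    "create_key": ("POST", "/v1/keys"),
    "revoke_key": ("DELETE", "/v1/keys/{id}"),
    "config": ("GET", "/v1/config"),
}


def build_request(name, api_key, body=None, **path_params):
    """Build (but do not send) a request for a named endpoint."""
    method, path = ENDPOINTS[name]
    url = BASE_URL + path.format(**path_params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}  # assumed auth scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


req = build_request("revoke_key", "sk-test", id="key_123")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it, once the placeholder host and auth scheme are replaced with the documented values.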
Start with our Quickstart Guide to get up and running in minutes, then explore the API Reference for complete documentation of all features.