Quickstart Guide
Get up and running with Skimly in minutes. Choose your preferred language or use raw HTTP - both approaches are fully supported.
Our v2.0 SDKs provide streaming support, tool calling, TypeScript types, and Anthropic-compatible interfaces with automatic compression and error handling.
1. Installation
Choose your preferred language and install the appropriate package:
```bash
npm install @skimly/sdk@^2.0.0
```
2. Authentication
Set your API key as an environment variable. Both SDKs will automatically read this:
```bash
export SKIMLY_KEY="YOUR_API_KEY"
# For production, also set:
export SKIMLY_BASE="https://api.skimly.dev"
```
Before you can use Skimly, you need to create an API key:
- Visit the API Keys page in your dashboard
- Click "Create Key" and choose "Test" mode for development
- Copy the generated key (starts with `sk_test_`)
- Set it as your `SKIMLY_KEY` environment variable
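Once the key is exported, both SDKs pick it up without explicit configuration. A minimal sketch of this in Node, assuming the constructor falls back to `SKIMLY_KEY` (and `SKIMLY_BASE`, if set) when no options are passed, as the note above suggests:

```typescript
import { SkimlyClient } from '@skimly/sdk'

// No apiKey or baseURL passed: the SDK reads SKIMLY_KEY
// (and SKIMLY_BASE, if set) from the environment.
const client = new SkimlyClient()
```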
New to Skimly? Start with the Onboarding guide for a complete walkthrough.
3. Create a Blob
Upload large, rarely-changing content (like policies, documentation, or long conversation threads) to get a blob ID that you can reference in future requests.
```typescript
// npm i @skimly/sdk@^2.0.0
import { SkimlyClient } from '@skimly/sdk'

const client = new SkimlyClient({
  apiKey: process.env.SKIMLY_KEY!,
  baseURL: 'http://localhost:8000'
})

const blob = await client.createBlob('big content...', 'text/plain')
console.log('Blob ID:', blob.blob_id)

// Idempotent upload (avoids duplicates)
const dedupedBlob = await client.createBlobIfNew('big content...', 'text/plain')
console.log('Deduped Blob ID:', dedupedBlob.blob_id)
```
Use `createBlobIfNew()` (Node.js) or `create_blob_if_new()` (Python) to automatically avoid re-uploading identical content during the same process.
4. Chat with Blob Reference
Use the blob ID as a lightweight pointer in your chat requests instead of sending the full content. This dramatically reduces token usage while maintaining full context.
```typescript
// npm i @skimly/sdk@^2.0.0
import { SkimlyClient } from '@skimly/sdk'

const client = new SkimlyClient({
  apiKey: process.env.SKIMLY_KEY!,
  baseURL: 'http://localhost:8000'
})

// Anthropic-style interface with compression.
// Point at the blob from step 3 instead of resending the full content.
const message = await client.messages.create({
  provider: 'openai',
  model: 'gpt-4o-mini',
  max_tokens: 1024,
  messages: [{ role: 'user', content: `Summarize this document: ${blob.blob_id}` }]
})

console.log(message.content[0].text)
console.log('Tokens saved:', message.skimly_meta?.tokens_saved)
```
5. Switch Providers with One Line
One of the biggest benefits of Skimly is how easy it is to switch between providers. Just change the `provider` field and you're done!
OpenAI
Node.js:

```typescript
await client.messages.create({
  provider: 'openai',
  model: 'gpt-4o-mini',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

Python:

```python
await client.messages.create({
    'provider': 'openai',
    'model': 'gpt-4o-mini',
    'max_tokens': 1024,
    'messages': [{'role': 'user', 'content': 'Hello!'}]
})
```
Anthropic
Node.js:

```typescript
// Anthropic-compatible interface
await client.messages.create({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

Python:

```python
await client.messages.create({
    'provider': 'anthropic',
    'model': 'claude-3-5-sonnet-20241022',
    'max_tokens': 1024,
    'messages': [{'role': 'user', 'content': 'Hello!'}]
})
```
6. Enable Smart Compression
Skimly automatically applies intelligent compression based on content analysis. Use `mode: "compact"` to enable advanced compression features.
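The example below omits the flag, so here is a minimal sketch of where it might go, assuming `mode` is a top-level field on `messages.create` (its exact placement isn't documented in this quickstart):

```typescript
const compact = await client.messages.create({
  provider: 'openai',
  model: 'gpt-4o-mini',
  max_tokens: 1024,
  mode: 'compact', // assumed placement of the compact-compression flag
  messages: [{ role: 'user', content: 'Summarize this log: b_abc123' }]
})
```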
Smart Compression Example
Node.js:

```typescript
const response = await client.messages.create({
  provider: 'openai',
  model: 'gpt-4o-mini',
  max_tokens: 2048,
  messages: [
    { role: 'user', content: 'Analyze this error log: b_abc123' }
  ]
})
console.log('Tokens saved:', response.skimly_meta?.tokens_saved)
```

Python:

```python
response = await client.messages.create({
    'provider': 'openai',
    'model': 'gpt-4o-mini',
    'max_tokens': 2048,
    'messages': [
        {'role': 'user', 'content': 'Analyze this error log: b_abc123'}
    ]
})
print('Tokens saved:', response['skimly_meta']['tokens_saved'])
```
Skimly automatically compresses verbose AI responses while preserving the ability to expand to full content when needed. Smart Truncation provides up to 99% token savings.
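The expansion step itself isn't shown in this guide. A plausible sketch, assuming a compressed response surfaces a blob ID in `skimly_meta` (a hypothetical field, not confirmed above) that can be passed to the blob-read API from step 8:

```typescript
// Hypothetical: expand a truncated response back to its full content.
const fullBlobId = (response.skimly_meta as any)?.blob_id // assumed field
if (fullBlobId) {
  const fullText = await client.fetchBlob(fullBlobId, { start: 0, end: 1024 })
  console.log(fullText)
}
```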
7. Real-time Streaming
Stream AI responses in real-time using Server-Sent Events. Both SDKs provide full streaming support with proper chunking and error handling.
Streaming Example
Node.js:

```typescript
const stream = client.messages.stream({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }],
  stream: true
})

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta?.text) {
    process.stdout.write(chunk.delta.text)
  }
}
```

Python:

```python
async for chunk in client.messages.stream({
    'provider': 'anthropic',
    'model': 'claude-3-5-sonnet-20241022',
    'max_tokens': 1024,
    'messages': [{'role': 'user', 'content': 'Write a story'}],
    'stream': True
}):
    if chunk['type'] == 'content_block_delta' and chunk.get('delta', {}).get('text'):
        print(chunk['delta']['text'], end='', flush=True)
```
8. Advanced Features
Content Transformation
Use the transform endpoint to compress tool results with context awareness.
Node.js:

```typescript
await client.transform({
  result: 'Large tool output...',
  tool_name: 'bash',
  command: 'npm run build'
})
```

Python:

```python
await client.transform(
    result="Large tool output...",
    tool_name="bash",
    command="npm run build"
)
```
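Tool Calling
The intro mentions tool calling, but this quickstart doesn't document the request shape. A minimal sketch, assuming the Anthropic-compatible `tools` parameter; the `get_weather` tool below is hypothetical:

```typescript
const toolResponse = await client.messages.create({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [{
    name: 'get_weather', // hypothetical tool for illustration
    description: 'Get the current weather for a city',
    input_schema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city']
    }
  }],
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }]
})
// If the interface mirrors Anthropic's, tool-use requests appear as
// content blocks with type 'tool_use' in toolResponse.content (assumption).
```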
Range Reads
Efficiently read portions of large blobs without downloading everything.
Node.js:

```typescript
const content = await client.fetchBlob('b_abc123', { start: 0, end: 1024 })
```

Python:

```python
content = await client.fetch_blob('b_abc123', start=0, end=1024)
```
9. Next Steps
Use our official SDKs for the best developer experience, or integrate directly with the HTTP endpoints. Both approaches provide the same functionality.
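If you'd rather call the API directly, the SDK calls above translate to plain JSON-over-HTTP requests. A hedged sketch using `fetch`; the `/v1/messages` path and Bearer auth scheme are assumptions, since this quickstart doesn't document the raw endpoints:

```typescript
// Assumptions: endpoint path and Authorization scheme are not confirmed by this guide.
const res = await fetch(`${process.env.SKIMLY_BASE ?? 'http://localhost:8000'}/v1/messages`, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.SKIMLY_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    provider: 'openai',
    model: 'gpt-4o-mini',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello!' }]
  })
})
console.log(await res.json())
```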