Cost Optimization
Reduce costs while maintaining agent performance
Optimize your SwiftClaw agent costs without sacrificing performance.
Model Cost Comparison
Choose cost-effective models:
| Model | Cost per 1M tokens | Use Case |
|---|---|---|
| llama-3 | $0.10 | Simple tasks, classification |
| gpt-3.5-turbo | $0.50 | General purpose |
| claude-3-sonnet | $3.00 | Complex reasoning |
| gpt-4 | $30.00 | Advanced analysis |
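To compare models for a given workload, multiply the expected token volume by the per-million price in the table. The following is a minimal sketch, assuming a single blended price per model as listed above (providers often price input and output tokens separately). The function name `estimate_cost` and the 50M-token workload are illustrative, not part of SwiftClaw:

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION = {
    "llama-3": 0.10,
    "gpt-3.5-turbo": 0.50,
    "claude-3-sonnet": 3.00,
    "gpt-4": 30.00,
}


def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated USD cost of processing `tokens` tokens."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


if __name__ == "__main__":
    monthly_tokens = 50_000_000  # hypothetical workload
    for model in PRICE_PER_MILLION:
        print(f"{model:16} ${estimate_cost(model, monthly_tokens):,.2f}")
```

At these prices the same volume costs 300 times more on gpt-4 than on llama-3, which is why routing simple requests to cheaper models tends to dominate the other savings.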
Smart Model Routing
Route requests to appropriate models:
agent = Agent(
    routing={
        "simple": {
            "model": "llama-3",
            "conditions": ["length < 100", "complexity < 0.3"]
        },
        "medium": {
            "model": "gpt-3.5-turbo",
            "conditions": ["length < 500", "complexity < 0.7"]
        },
        "complex": {
            "model": "gpt-4",
            "conditions": ["length >= 500", "complexity >= 0.7"]
        }
    }
)
Token Optimization
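Because model usage is billed per token, every token trimmed from a prompt that accompanies each request is saved on every call. The following is a rough sketch of that arithmetic, assuming the per-1M-token prices from the comparison table. The token counts and request volume are made-up illustration values, not measurements, and `monthly_prompt_cost` is a hypothetical helper:

```python
def monthly_prompt_cost(prompt_tokens: int, requests: int, price_per_million: float) -> float:
    """USD cost of sending a fixed prompt of `prompt_tokens` tokens `requests` times."""
    return prompt_tokens * requests * price_per_million / 1_000_000


GPT4_PRICE = 30.00   # USD per 1M tokens, from the comparison table
REQUESTS = 100_000   # hypothetical monthly request count

verbose = monthly_prompt_cost(40, REQUESTS, GPT4_PRICE)  # assumed ~40-token system prompt
concise = monthly_prompt_cost(8, REQUESTS, GPT4_PRICE)   # assumed ~8-token system prompt
print(f"verbose: ${verbose:.2f}  concise: ${concise:.2f}  saved: ${verbose - concise:.2f}")
```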
Reduce Token Usage
# Bad: Verbose system prompt
system_prompt = """
You are a helpful AI assistant designed to help users
with their questions. Please provide detailed and
accurate responses...
"""
# Good: Concise system prompt
system_prompt = "You are a helpful AI assistant."
# Limit response length
agent = Agent(
    model="gpt-4",
    max_tokens=500  # Limit response size
)
Context Window Management
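# Drop the oldest messages until the conversation fits the token budget
def manage_context(messages, max_tokens=4000):
    # count_tokens is assumed to be a tokenizer helper defined elsewhere
    system, history = messages[0], list(messages[1:])
    total_tokens = sum(count_tokens(m) for m in messages)
    # Always keep the system message; remove the oldest turns first
    while history and total_tokens > max_tokens:
        total_tokens -= count_tokens(history.pop(0))
    return [system] + history
Caching Strategies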
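Conceptually, both strategies in this section store a response and reuse it when a later query matches, either exactly or within a similarity threshold. The sketch below is not SwiftClaw's implementation. `TTLCache`, `cosine_similarity`, `semantic_lookup`, and the idea of comparing embedding vectors are assumptions used only to illustrate the mechanics behind a `ttl` and a `similarity_threshold` setting:

```python
import math
import time


class TTLCache:
    """Exact-match cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (time.monotonic() + self.ttl_seconds, value)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_lookup(query_vec, cached, threshold=0.95):
    """Return the response of the most similar cached entry, or None if no
    entry reaches `threshold`. `cached` is a list of (vector, response) pairs."""
    best_score, best_response = threshold, None
    for vec, response in cached:
        score = cosine_similarity(query_vec, vec)
        if score >= best_score:
            best_score, best_response = score, response
    return best_response
```

A higher threshold returns fewer cached answers but lowers the risk of serving a response to a question that only looks similar. A lower threshold saves more model calls at the cost of accuracy.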
Response Caching
Cache common queries:
@agent.cache(ttl="1h")
async def handle_faq(question: str):
    return await agent.generate(question)
Semantic Caching
Cache similar queries:
@agent.cache(
    strategy="semantic",
    similarity_threshold=0.95,
    ttl="1h"
)
async def handle_query(query: str):
    return await agent.generate(query)
Batch Processing
Process multiple requests together:
# Bad: Individual requests
for message in messages:
    await agent.generate(message)
# Good: Batch processing
await agent.generate_batch(messages)
Auto-Scaling Configuration
Scale based on demand:
{
  "scaling": {
    "minInstances": 1,
    "maxInstances": 10,
    "scaleDownDelay": "5m",
    "targetCPU": 70
  }
}
Cost-Aware Scaling
{
  "scaling": {
    "strategy": "cost-optimized",
    "schedule": {
      "weekday": {
        "minInstances": 2,
        "maxInstances": 10
      },
      "weekend": {
        "minInstances": 1,
        "maxInstances": 5
      },
      "night": {
        "minInstances": 1,
        "maxInstances": 3
      }
    }
  }
}
Memory Management
Optimize Memory Usage
{
  "memory": {
    "shortTerm": {
      "ttl": "30m",
      "maxSize": "10MB"
    },
    "longTerm": {
      "ttl": "7d",
      "maxSize": "50MB"
    }
  }
}
Cleanup Policies
# Automatic cleanup of old data
@agent.schedule("0 0 * * *")  # Daily at midnight
async def cleanup_memory():
    await agent.memory.cleanup(older_than="30d")
Tool Call Optimization
Reduce External API Calls
# Cache external API results
@cache(ttl="1h")
async def fetch_weather(city: str):
    return await weather_api.get(city)
# Batch API calls
async def fetch_multiple_cities(cities: list):
    return await weather_api.batch_get(cities)
Monitoring Costs
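A cost breakdown like the one the dashboard reports is a per-category sum with each category's share of the total. The following is a minimal sketch using the sample figures from the dashboard output in this section. `breakdown` is a hypothetical helper for illustration, not part of the SwiftClaw CLI or SDK:

```python
def breakdown(costs: dict[str, float]) -> list[tuple[str, float, float]]:
    """Return (category, cost, percent of total) rows, largest cost first."""
    total = sum(costs.values())
    rows = [(name, cost, 100 * cost / total) for name, cost in costs.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Sample figures taken from the dashboard output in this guide.
sample = {"Model Usage": 150.00, "Infrastructure": 75.00, "Storage": 25.00}
for name, cost, pct in breakdown(sample):
    print(f"{name:16} ${cost:>7.2f} {pct:>4.0f}%")
```

Sorting by cost puts the largest category first, which is usually where optimization effort pays off soonest.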
Cost Dashboard
# View cost breakdown
swiftclaw costs my-agent --period 30d
# Output:
# CATEGORY          COST      PERCENTAGE
# Model Usage       $150.00   60%
# Infrastructure    $75.00    30%
# Storage           $25.00    10%
# Total             $250.00   100%
Cost Alerts
Set up cost alerts:
# Alert when daily cost exceeds threshold
swiftclaw alerts create \
  --agent my-agent \
  --metric daily-cost \
  --threshold 50 \
  --action notify
Cost Optimization Checklist
- Use appropriate models for task complexity
- Implement response caching
- Optimize token usage
- Configure auto-scaling
- Set up cost alerts
- Review usage patterns monthly
- Clean up unused resources
Cost Reduction Strategies
1. Model Fallbacks
Use cheaper fallback models:
{
  "model": {
    "primary": "gpt-4",
    "fallbacks": [
      "claude-3-sonnet",
      "gpt-3.5-turbo"
    ]
  }
}
2. Request Throttling
Limit expensive operations:
@agent.throttle(rate="100/hour")
async def expensive_operation():
    return await agent.generate(complex_prompt)
3. Scheduled Processing
Process non-urgent tasks during off-peak hours:
@agent.schedule("0 2 * * *")  # 2 AM daily
async def batch_analysis():
    await process_daily_reports()
Cost Savings: Implementing these strategies can reduce costs by 40-60% while maintaining performance.
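Applied to the $250.00 monthly total from the sample dashboard output, a 40-60% reduction would correspond to the range computed below. This is arithmetic on the guide's own example figures, not a measured result, and `savings_range` is a hypothetical helper:

```python
def savings_range(monthly_cost: float, low: float = 0.40, high: float = 0.60):
    """Return (min_saving, max_saving) in USD for a fractional reduction range."""
    return monthly_cost * low, monthly_cost * high


low, high = savings_range(250.00)
print(f"Estimated savings: ${low:.2f} to ${high:.2f} per month")
```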