Cost Optimization

Reduce costs while maintaining agent performance

Optimize your SwiftClaw agent costs without sacrificing performance.

Model Cost Comparison

Choose cost-effective models:

Model             Cost per 1M tokens   Use case
llama-3           $0.10                Simple tasks, classification
gpt-3.5-turbo     $0.50                General purpose
claude-3-sonnet   $3.00                Complex reasoning
gpt-4             $30.00               Advanced analysis
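As a rough illustration, per-request cost follows directly from the token count and the per-million pricing above (the prices are the ones in the table; the token counts in the usage example are made up):

```python
# Hypothetical per-1M-token prices, taken from the comparison table.
PRICE_PER_1M = {
    "llama-3": 0.10,
    "gpt-3.5-turbo": 0.50,
    "claude-3-sonnet": 3.00,
    "gpt-4": 30.00,
}

def request_cost(model: str, tokens: int) -> float:
    """Estimated dollar cost for one request consuming `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_1M[model]
```

For example, a 1,000-token request costs $0.03 on gpt-4 but only $0.0001 on llama-3, a 300x difference that compounds quickly at scale.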

Smart Model Routing

Route requests to appropriate models:

agent = Agent(
    routing={
        "simple": {
            "model": "llama-3",
            "conditions": ["length < 100", "complexity < 0.3"]
        },
        "medium": {
            "model": "gpt-3.5-turbo",
            "conditions": ["length < 500", "complexity < 0.7"]
        },
        "complex": {
            "model": "gpt-4",
            "conditions": ["length >= 500", "complexity >= 0.7"]
        }
    }
)
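The routing rules above could be evaluated with logic along these lines. This is a sketch, not SwiftClaw's actual router: the complexity score is assumed to come from elsewhere, and requests matching neither cheap tier fall through to gpt-4.

```python
def pick_model(text: str, complexity: float) -> str:
    """Mirror the routing config: cheap models for short, simple
    requests; anything that clears both tiers falls through to gpt-4."""
    length = len(text)
    if length < 100 and complexity < 0.3:
        return "llama-3"
    if length < 500 and complexity < 0.7:
        return "gpt-3.5-turbo"
    return "gpt-4"
```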

Token Optimization

Reduce Token Usage

# Bad: Verbose system prompt
system_prompt = """
You are a helpful AI assistant designed to help users
with their questions. Please provide detailed and
accurate responses...
"""

# Good: Concise system prompt
system_prompt = "You are a helpful AI assistant."

# Limit response length
agent = Agent(
    model="gpt-4",
    max_tokens=500  # Limit response size
)

Context Window Management

# Truncate old messages
def manage_context(messages, max_tokens=4000):
    total_tokens = sum(count_tokens(m) for m in messages)
    
    if total_tokens > max_tokens:
        # Keep system message and recent messages
        return [messages[0]] + messages[-10:]
    
    return messages
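Note that keeping a fixed number of recent messages may still exceed the budget. A stricter variant drops the oldest non-system messages until the conversation actually fits (a sketch; `count_tokens` here is a crude word-count stand-in for the model's real tokenizer):

```python
def count_tokens(message: str) -> int:
    # Crude stand-in: production code should use the model's tokenizer.
    return len(message.split())

def trim_to_budget(messages: list[str], max_tokens: int = 4000) -> list[str]:
    """Keep the system message (index 0) and drop the oldest
    following messages until the total fits the token budget."""
    kept = list(messages)
    while len(kept) > 1 and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(1)  # drop the oldest non-system message
    return kept
```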

Caching Strategies

Response Caching

Cache common queries:

@agent.cache(ttl="1h")
async def handle_faq(question: str):
    return await agent.generate(question)

Semantic Caching

Cache similar queries:

@agent.cache(
    strategy="semantic",
    similarity_threshold=0.95,
    ttl="1h"
)
async def handle_query(query: str):
    return await agent.generate(query)

Batch Processing

Process multiple requests together:

# Bad: Individual requests
for message in messages:
    await agent.generate(message)

# Good: Batch processing
await agent.generate_batch(messages)
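Batching amortizes per-request overhead. If the batch endpoint caps how many messages it accepts per call (an assumption here), messages can be chunked first:

```python
def chunk(items: list, size: int) -> list[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```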

Auto-Scaling Configuration

Scale based on demand:

{
  "scaling": {
    "minInstances": 1,
    "maxInstances": 10,
    "scaleDownDelay": "5m",
    "targetCPU": 70
  }
}

Cost-Aware Scaling

{
  "scaling": {
    "strategy": "cost-optimized",
    "schedule": {
      "weekday": {
        "minInstances": 2,
        "maxInstances": 10
      },
      "weekend": {
        "minInstances": 1,
        "maxInstances": 5
      },
      "night": {
        "minInstances": 1,
        "maxInstances": 3
      }
    }
  }
}
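The schedule above might resolve to instance bounds like this. This is a sketch: the config does not define when "night" starts or ends, so the 22:00-06:00 window below is an assumption, as is giving night priority over the day-of-week tiers.

```python
from datetime import datetime

SCHEDULE = {
    "weekday": {"min": 2, "max": 10},
    "weekend": {"min": 1, "max": 5},
    "night":   {"min": 1, "max": 3},
}

def instance_bounds(now: datetime) -> dict[str, int]:
    """Night (assumed 22:00-06:00) overrides the day-of-week tiers."""
    if now.hour >= 22 or now.hour < 6:
        return SCHEDULE["night"]
    if now.weekday() >= 5:  # Saturday or Sunday
        return SCHEDULE["weekend"]
    return SCHEDULE["weekday"]
```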

Memory Management

Optimize Memory Usage

{
  "memory": {
    "shortTerm": {
      "ttl": "30m",
      "maxSize": "10MB"
    },
    "longTerm": {
      "ttl": "7d",
      "maxSize": "50MB"
    }
  }
}

Cleanup Policies

# Automatic cleanup of old data
@agent.schedule("0 0 * * *")  # Daily at midnight
async def cleanup_memory():
    await agent.memory.cleanup(older_than="30d")

Tool Call Optimization

Reduce External API Calls

# Cache external API results
@cache(ttl="1h")
async def fetch_weather(city: str):
    return await weather_api.get(city)

# Batch API calls
async def fetch_multiple_cities(cities: list):
    return await weather_api.batch_get(cities)
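The `@cache(ttl=...)` decorator above could work roughly like this (a minimal in-memory sketch for synchronous functions; a production version would also handle async callables, eviction, and size limits):

```python
import time
import functools

def cache(ttl_seconds: float):
    """Memoize a function, discarding entries older than ttl_seconds."""
    def decorator(fn):
        store: dict = {}

        @functools.wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit and now - hit[0] < ttl_seconds:
                return hit[1]  # fresh cached value: skip the call
            result = fn(*args)
            store[args] = (now, result)
            return result
        return wrapper
    return decorator
```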

Monitoring Costs

Cost Dashboard

# View cost breakdown
swiftclaw costs my-agent --period 30d

# Output:
# CATEGORY        COST      PERCENTAGE
# Model Usage     $150.00   60%
# Infrastructure  $75.00    30%
# Storage         $25.00    10%
# Total           $250.00   100%

Cost Alerts

Set up cost alerts:

# Alert when daily cost exceeds threshold
swiftclaw alerts create \
  --agent my-agent \
  --metric daily-cost \
  --threshold 50 \
  --action notify

Cost Optimization Checklist

  • Use appropriate models for task complexity
  • Implement response caching
  • Optimize token usage
  • Configure auto-scaling
  • Set up cost alerts
  • Review usage patterns monthly
  • Clean up unused resources

Cost Reduction Strategies

1. Model Fallbacks

Use cheaper fallback models:

{
  "model": {
    "primary": "gpt-4",
    "fallbacks": [
      "claude-3-sonnet",
      "gpt-3.5-turbo"
    ]
  }
}
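A fallback chain tries each model in order and only gives up when every one has failed. A sketch, where `call_model` is a hypothetical callable that raises on failure (rate limit, timeout, outage):

```python
def generate_with_fallbacks(prompt: str, models: list[str], call_model) -> str:
    """Try each model in order; re-raise the last error if all fail."""
    last_error: Exception | None = None
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as err:
            last_error = err  # remember the failure, try the next model
    raise last_error
```

Ordering matters: putting the primary model first preserves quality, while the cheaper fallbacks keep the agent available (and less costly) when the primary is unavailable.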

2. Request Throttling

Limit expensive operations:

@agent.throttle(rate="100/hour")
async def expensive_operation():
    return await agent.generate(complex_prompt)
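A rate limit like `100/hour` is commonly enforced with a token bucket: operations spend tokens, and tokens refill continuously at the configured rate. A sketch with an injectable clock so the behavior is testable:

```python
import time

class TokenBucket:
    """Allow up to `rate` operations per `period` seconds."""
    def __init__(self, rate: int, period: float, clock=None):
        self.capacity = rate
        self.tokens = float(rate)
        self.refill_per_sec = rate / period
        self.clock = clock or time.monotonic
        self.last = self.clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```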

3. Scheduled Processing

Process non-urgent tasks during off-peak hours:

@agent.schedule("0 2 * * *")  # 2 AM daily
async def batch_analysis():
    await process_daily_reports()

Cost Savings: Implementing these strategies can reduce costs by 40-60% while maintaining performance.

Next Steps

  • Performance Optimization
  • Load Balancing
  • Monitoring
