# Performance Optimization

Optimize your SwiftClaw agents for maximum performance and efficiency in production workloads.
## Model Selection

Choose the right model for your workload.

### Task Complexity
```python
# Simple tasks: use a faster, cheaper model
classifier = Agent(model="llama-3")

# Complex reasoning: use a more capable model
analyst = Agent(model="gpt-4")

# Hybrid approach: route based on task complexity
agent = Agent(
    model={
        "simple": "llama-3",
        "complex": "gpt-4"
    }
)
```

### Response Time vs. Quality
| Model | Response Time | Quality | Cost |
|---|---|---|---|
| llama-3 | ~500ms | Good | $ |
| gpt-3.5-turbo | ~1s | Better | $$ |
| claude-3-sonnet | ~2s | Great | $$$ |
| gpt-4 | ~3s | Best | $$$$ |
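The hybrid approach above implies some routing logic that classifies each task before choosing a model. A minimal sketch of such a router is shown below; the `route_model` function, the keyword heuristic, and the length threshold are all illustrative assumptions, not part of the SwiftClaw API:

```python
# Hypothetical complexity router; thresholds and hint words are illustrative.
MODEL_TIERS = {"simple": "llama-3", "complex": "gpt-4"}

COMPLEX_HINTS = ("analyze", "compare", "explain why", "step by step")

def route_model(prompt: str) -> str:
    """Return a model name based on a rough task-complexity heuristic."""
    text = prompt.lower()
    # Long prompts or reasoning-style keywords go to the powerful model
    if len(text.split()) > 100 or any(hint in text for hint in COMPLEX_HINTS):
        return MODEL_TIERS["complex"]
    return MODEL_TIERS["simple"]
```

In practice you would tune the heuristic (or use a small classifier model) against your own traffic, since misrouting complex tasks to the cheap tier costs quality while the reverse costs money.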
## Memory Optimization

### Short-Term Memory

Use short-term memory for session-specific data:
```json
{
  "memory": {
    "shortTerm": {
      "ttl": "1h",
      "maxSize": "10MB"
    }
  }
}
```

### Long-Term Memory
Optimize for frequently accessed data:
```json
{
  "memory": {
    "longTerm": {
      "ttl": "30d",
      "maxSize": "100MB",
      "cache": {
        "enabled": true,
        "strategy": "lru"
      }
    }
  }
}
```

### Memory Search
Use hybrid search for best performance:
```json
{
  "memory": {
    "search": {
      "type": "hybrid",
      "vectorWeight": 0.7,
      "textWeight": 0.3,
      "maxResults": 10
    }
  }
}
```

## Caching Strategies
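To make the caching decorators in this section concrete, here is a minimal sketch of how a TTL cache can work under the hood, using a plain in-memory dict with expiry timestamps. This is an assumption-laden illustration, not SwiftClaw's actual `@agent.cache` implementation:

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds: float):
    """Minimal TTL cache sketch for positional-arg functions.

    Illustrative only; not the SwiftClaw @agent.cache implementation.
    """
    def decorator(fn):
        store = {}  # args -> (expires_at, value)

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: return cached value
            value = fn(*args)
            store[args] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator
```

A real implementation would also handle keyword arguments, bound the cache size (e.g. LRU eviction, as in the long-term memory config above), and support async functions.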
### Response Caching
Cache common responses:
```python
@agent.cache(ttl="1h")
async def get_product_info(product_id: str):
    return await fetch_product(product_id)
```

### Tool Result Caching
Cache expensive tool calls:
```python
@agent.tool
@cache(ttl="30m")
async def search_database(query: str):
    return await db.search(query)
```

## Parallel Processing
### Concurrent Tool Calls
Execute tools in parallel:
```python
async def process_request(message):
    # Execute tools concurrently
    results = await asyncio.gather(
        agent.call_tool("search_docs", message),
        agent.call_tool("search_database", message),
        agent.call_tool("fetch_user_data", message),
    )
    return combine_results(results)
```

### Batch Processing
Process multiple requests together:
```python
@agent.batch(max_size=10, max_wait="100ms")
async def process_messages(messages):
    return await agent.generate_batch(messages)
```

## Request Optimization
### Prompt Engineering
Optimize prompts for efficiency:
```python
# Bad: verbose prompt
prompt = """
Please analyze the following text and provide a detailed
summary including key points, sentiment analysis, and
recommendations for improvement...
"""

# Good: concise prompt
prompt = "Summarize: key points, sentiment, recommendations"
```

### Token Management
Reduce token usage:
```python
# Limit the context window
agent = Agent(
    model="gpt-4",
    max_tokens=2000,
    context_window=8000
)

# Truncate long inputs
def truncate_context(text, max_tokens=4000):
    tokens = tokenize(text)
    if len(tokens) > max_tokens:
        return detokenize(tokens[:max_tokens])
    return text
```

### Streaming Responses
Enable streaming for better UX:
```python
@agent.on_message
async def handle_message(message):
    async for chunk in agent.generate_stream(message):
        yield chunk
```

### Connection Pooling
Optimize external connections:
```python
# Database connection pool
db = Database(
    pool_size=10,
    max_overflow=20,
    pool_timeout=30
)

# HTTP connection pool
http = HTTPClient(
    pool_connections=10,
    pool_maxsize=20
)
```

## Monitoring Performance
### Key Metrics
Track these metrics:
- **Response Time**: P50, P95, P99 latency
- **Throughput**: Requests per second
- **Error Rate**: Failed requests percentage
- **Token Usage**: Tokens per request
- **Memory Usage**: RAM consumption
- **CPU Usage**: Processor utilization
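Latency percentiles like P50, P95, and P99 are computed from raw latency samples. A simple nearest-rank sketch is shown below (this is generic Python, not a SwiftClaw API):

```python
def percentile(samples, pct):
    """Nearest-rank percentile: pct in (0, 100], samples non-empty."""
    ordered = sorted(samples)
    # Rank is ceil(pct/100 * n); ceiling division via negation trick
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[int(rank) - 1]

# Example latency samples in milliseconds
latencies_ms = [120, 180, 200, 250, 300, 450, 500, 800, 950, 1200]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

Note how a single slow outlier dominates P95/P99 while barely moving P50, which is why tail percentiles, not averages, are the standard latency targets.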
### Performance Dashboard
```shell
# View real-time metrics
swiftclaw metrics my-agent --live

# Generate a performance report
swiftclaw report my-agent --period 7d
```

## Load Testing
Test agent performance:
```shell
# Install the load testing tool
npm install -g @swiftclaw/load-test

# Run a load test
swiftclaw load-test my-agent \
  --requests 1000 \
  --concurrency 50 \
  --duration 5m
```

## Performance Benchmarks
Typical performance targets:
| Metric | Target | Excellent |
|---|---|---|
| Response Time (P95) | <2s | <1s |
| Throughput | >100 req/s | >500 req/s |
| Error Rate | <1% | <0.1% |
| Availability | >99.9% | >99.99% |
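These targets can be checked programmatically against measured metrics, for example as a gate in a deployment pipeline. The sketch below mirrors the baseline column of the table; the metric keys and the `meets_targets` function are illustrative assumptions:

```python
# Baseline targets from the table above; keys are illustrative.
TARGETS = {"p95_ms": 2000, "throughput_rps": 100, "error_rate": 0.01}

def meets_targets(metrics: dict) -> bool:
    """Return True when measured metrics satisfy every baseline target."""
    return (
        metrics["p95_ms"] < TARGETS["p95_ms"]
        and metrics["throughput_rps"] > TARGETS["throughput_rps"]
        and metrics["error_rate"] < TARGETS["error_rate"]
    )
```

The "Excellent" column would simply use tighter thresholds (e.g. `p95_ms < 1000`).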
> **Continuous Optimization:** Monitor metrics regularly and optimize based on actual usage patterns.