Load Balancing

SwiftClaw automatically distributes traffic across agent instances for optimal performance and reliability.

Automatic Load Balancing

SwiftClaw balances load without manual setup, using one of four routing strategies:

Round Robin: Distributes requests evenly across instances
Least Connections: Routes to the least busy instance
Response Time: Routes to the fastest-responding instance
Geographic: Routes to the nearest instance

Load Balancing Strategies

Round Robin

The default strategy. Requests are distributed evenly across instances:

{
  "loadBalancing": {
    "strategy": "round-robin"
  }
}
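
To illustrate what "evenly" means here, the following is a minimal Python sketch of round-robin selection. It is not SwiftClaw's implementation; the function name `round_robin` and the instance names are assumptions made for the example.

```python
from itertools import cycle

def round_robin(instances):
    """Return an iterator that hands out instances in a fixed rotating order."""
    return cycle(instances)

# Hypothetical instance names, used only for illustration.
picker = round_robin(["instance-1", "instance-2", "instance-3"])
first_six = [next(picker) for _ in range(6)]
```

Each instance receives every third request, regardless of how busy it is.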

Least Connections

Routes each request to the instance with the fewest active connections:

{
  "loadBalancing": {
    "strategy": "least-connections"
  }
}
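
The selection rule can be sketched in a few lines of Python. This is an illustration under assumptions, not SwiftClaw's code: the `active` mapping and the function name `pick_least_connections` are hypothetical.

```python
def pick_least_connections(active):
    """Return the instance with the fewest active connections.

    Ties go to the instance that appears first in the mapping.
    """
    return min(active, key=active.get)

# Hypothetical connection counts per instance.
active = {"instance-1": 12, "instance-2": 4, "instance-3": 9}
chosen = pick_least_connections(active)
active[chosen] += 1  # the new request now counts against the chosen instance
```

Unlike round robin, this rule adapts when some requests hold their connections much longer than others.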

Response Time Based

Routes to the fastest-responding instance:

{
  "loadBalancing": {
    "strategy": "response-time",
    "window": "5m"
  }
}
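
One plausible reading of `window` is that response times are averaged over the most recent five minutes. The Python sketch below assumes that reading; the class `ResponseTimeTracker` and its methods are hypothetical and not part of any SwiftClaw API.

```python
from collections import deque

class ResponseTimeTracker:
    """Track per-instance latencies and pick the lowest windowed average."""

    def __init__(self, window_seconds=300.0):
        self.window = window_seconds
        self.samples = {}  # instance -> deque of (timestamp, latency)

    def record(self, instance, latency, now):
        self.samples.setdefault(instance, deque()).append((now, latency))

    def average(self, instance, now):
        q = self.samples.get(instance, deque())
        # Drop samples that have fallen out of the window.
        while q and q[0][0] <= now - self.window:
            q.popleft()
        if not q:
            return float("inf")  # no recent data: never preferred
        return sum(latency for _, latency in q) / len(q)

    def fastest(self, now):
        return min(self.samples, key=lambda inst: self.average(inst, now))

tracker = ResponseTimeTracker(window_seconds=300.0)
tracker.record("instance-1", 0.9, now=0)    # old, slow sample
tracker.record("instance-1", 0.1, now=400)
tracker.record("instance-2", 0.3, now=400)
best = tracker.fastest(now=400)
```

The slow sample from `instance-1` is older than the window, so it no longer counts against that instance.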

Geographic Routing

Routes each request to the nearest instance in the listed regions:

{
  "loadBalancing": {
    "strategy": "geographic",
    "regions": ["us-east-1", "eu-west-1", "ap-southeast-1"]
  }
}

Health Checks

SwiftClaw performs automatic health checks:

{
  "healthCheck": {
    "enabled": true,
    "interval": "30s",
    "timeout": "5s",
    "unhealthyThreshold": 3,
    "healthyThreshold": 2
  }
}
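
The two thresholds suggest a small state machine: an instance is marked unhealthy after several failed checks and healthy again after several passed ones. The Python sketch below assumes the thresholds count consecutive results; that interpretation, and the class `HealthState`, are assumptions rather than documented SwiftClaw behavior.

```python
class HealthState:
    """Flip health status only after enough consecutive contrary results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self.streak = 0  # consecutive results that contradict the current state

    def observe(self, check_passed):
        if check_passed == self.healthy:
            self.streak = 0  # result agrees with the current state
            return self.healthy
        self.streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self.streak >= needed:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy

state = HealthState()
checks = [False, False, True, False, False, False, True, True]
history = [state.observe(result) for result in checks]
```

A single passing check resets the failure streak, so the instance only flips to unhealthy after three failures in a row.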

Custom Health Checks

Define custom health check logic:

# db, api, and memory_usage() are application-defined helpers;
# memory_usage() is assumed to return a percentage (0-100).
@agent.health_check
async def custom_health():
    # Check database
    if not await db.ping():
        return {"status": "unhealthy", "reason": "database"}

    # Check external API
    if not await api.ping():
        return {"status": "unhealthy", "reason": "api"}

    # Check memory usage
    if memory_usage() > 90:
        return {"status": "unhealthy", "reason": "memory"}

    return {"status": "healthy"}

Session Affinity

Session affinity keeps requests from the same session on the same instance:

{
  "loadBalancing": {
    "sessionAffinity": {
      "enabled": true,
      "type": "cookie",
      "ttl": "1h"
    }
  }
}
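
As a rough illustration of affinity with a TTL, the Python sketch below binds a session ID to an instance and reuses the binding until it expires. It assumes the TTL is refreshed on every request, which the configuration above does not state; `AffinityTable` and `pick_instance` are hypothetical names.

```python
class AffinityTable:
    """Remember which instance serves each session until the binding expires."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self.bindings = {}  # session_id -> (instance, expires_at)

    def route(self, session_id, pick_instance, now):
        binding = self.bindings.get(session_id)
        if binding and binding[1] > now:
            instance = binding[0]       # binding still valid: reuse it
        else:
            instance = pick_instance()  # no binding or expired: pick afresh
        # Assumption: each request renews the TTL.
        self.bindings[session_id] = (instance, now + self.ttl)
        return instance

picks = iter(["instance-1", "instance-2"])
table = AffinityTable(ttl_seconds=3600.0)
first = table.route("session-a", lambda: next(picks), now=0)
second = table.route("session-a", lambda: next(picks), now=1800)
third = table.route("session-a", lambda: next(picks), now=5400)
```

The third request arrives exactly when the renewed binding expires, so the session is assigned a new instance.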

Sticky Sessions

Route a user to the same instance:

# Enable sticky sessions
agent = Agent(
    name="my-agent",
    session_affinity=True
)

# Session ID automatically tracked
@agent.on_message
async def handle_message(message, session_id):
    # Same session always routes to same instance
    return await agent.generate(message)

Traffic Distribution

Weighted Distribution

Distribute traffic in proportion to per-instance weights:

{
  "loadBalancing": {
    "weighted": {
      "instance-1": 50,
      "instance-2": 30,
      "instance-3": 20
    }
  }
}
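
The effect of these weights can be simulated in Python with a weighted random choice. This is only a sketch: whether SwiftClaw picks randomly or uses a deterministic weighted rotation is not stated, and `pick_weighted` is a hypothetical helper.

```python
import random
from collections import Counter

weights = {"instance-1": 50, "instance-2": 30, "instance-3": 20}

def pick_weighted(weights, rng):
    """Pick one instance with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

rng = random.Random(42)  # seeded so the simulation is repeatable
total = 10_000
counts = Counter(pick_weighted(weights, rng) for _ in range(total))
shares = {name: counts[name] / total for name in weights}
```

Over many requests, the observed shares approach 50%, 30%, and 20%.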

Canary Deployment

Route a percentage of traffic to a new version:

# Deploy canary with 10% traffic
swiftclaw deploy my-agent --canary 10

# Gradually increase
swiftclaw promote my-agent --canary 25
swiftclaw promote my-agent --canary 50
swiftclaw promote my-agent --canary 100
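
One common way to split traffic by percentage is to hash a stable request key into 100 buckets, so that a given user stays on the same side of the split as the percentage grows. The Python sketch below shows that technique; it is an assumption about how such a split could work, not a description of SwiftClaw internals, and `routes_to_canary` is a hypothetical function.

```python
import hashlib

def routes_to_canary(request_key, canary_percent):
    """Return True if this key falls in the canary share of traffic."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0..99
    return bucket < canary_percent

# Hypothetical user keys.
keys = [f"user-{i}" for i in range(10_000)]
canary_share = sum(routes_to_canary(k, 10) for k in keys) / len(keys)
```

Because the bucket depends only on the key, every key routed to the canary at 10% is still routed there at 25%, 50%, and 100%.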

Multi-Region Deployment

Deploy across multiple regions:

# Deploy to multiple regions
swiftclaw deploy my-agent \
  --regions us-east-1,eu-west-1,ap-southeast-1

Region Configuration

{
  "regions": {
    "us-east-1": {
      "instances": 3,
      "priority": 1
    },
    "eu-west-1": {
      "instances": 2,
      "priority": 2
    },
    "ap-southeast-1": {
      "instances": 2,
      "priority": 3
    }
  }
}

Connection Limits

Configure connection limits:

{
  "limits": {
    "maxConnections": 1000,
    "maxConnectionsPerInstance": 100,
    "connectionTimeout": "30s",
    "requestTimeout": "60s"
  }
}

Rate Limiting

Protect against overload:

{
  "rateLimit": {
    "enabled": true,
    "requests": 100,
    "window": "1m",
    "strategy": "sliding-window"
  }
}
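
A sliding-window limiter can be sketched in Python by keeping the timestamps of recent requests and rejecting a request when the window is full. This is one standard variant (a sliding-window log); the exact algorithm SwiftClaw uses is not specified here, and `SlidingWindowLimiter` is a hypothetical class.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_requests within any trailing window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now):
        # Forget requests that have left the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            return False
        self.timestamps.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow(t) for t in (0, 1, 2, 3, 60, 60.5)]
```

The request at `t=60` is accepted because the request from `t=0` has just left the window; the one at `t=60.5` is rejected because three accepted requests are still inside it.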

Per-User Rate Limiting

@agent.rate_limit(requests=10, window="1m", key="user_id")
async def handle_request(user_id: str, message: str):
    return await agent.generate(message)

Circuit Breaker

Prevent cascading failures:

{
  "circuitBreaker": {
    "enabled": true,
    "failureThreshold": 5,
    "timeout": "30s",
    "resetTimeout": "60s"
  }
}
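
The breaker pattern can be sketched in Python as a small state machine driven by `failureThreshold` and `resetTimeout`. The sketch assumes the threshold counts consecutive failures and does not model the `timeout` setting, whose meaning is not spelled out above; `CircuitBreaker` and its methods are hypothetical.

```python
class CircuitBreaker:
    """Open after repeated failures, then allow trial requests after a pause."""

    def __init__(self, failure_threshold=5, reset_timeout=60.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self, now):
        if self.opened_at is None:
            return True
        # Open: reject until the reset timeout has elapsed, then allow trials.
        return now - self.opened_at >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None  # close the circuit

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open, or re-open after a failed trial

breaker = CircuitBreaker()
for t in range(5):
    breaker.record_failure(now=t)          # fifth failure opens the circuit at t=4
blocked = breaker.allow_request(now=10)    # still inside the reset timeout
trial = breaker.allow_request(now=64)      # 60 seconds after opening
breaker.record_success()                   # the trial succeeded: close again
```

While the circuit is open, callers fail fast instead of piling more requests onto a struggling instance.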

Monitoring Load Distribution

View Load Metrics

# View load distribution
swiftclaw metrics my-agent --metric load-distribution

# View instance health
swiftclaw instances my-agent --health

Load Dashboard

Monitor load in real time:

# Open load dashboard
swiftclaw dashboard my-agent --view load

Auto-Scaling with Load Balancing

Combine auto-scaling with load balancing:

{
  "scaling": {
    "minInstances": 2,
    "maxInstances": 10,
    "targetCPU": 70,
    "targetConnections": 80
  },
  "loadBalancing": {
    "strategy": "least-connections",
    "healthCheck": {
      "enabled": true,
      "interval": "30s"
    }
  }
}
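
To show how targets and instance bounds might interact, the Python sketch below uses proportional scaling: the instance count grows with the ratio of the observed metric to its target and is clamped to the configured range. The formula, the function `desired_instances`, and the reading of `targetConnections` as a per-instance figure are all assumptions, not documented SwiftClaw behavior.

```python
import math

def desired_instances(current, observed, target, min_instances=2, max_instances=10):
    """Scale the instance count by observed/target, clamped to [min, max]."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    wanted = math.ceil(current * observed / target)
    return max(min_instances, min(max_instances, wanted))

# Hypothetical readings for a deployment currently running 4 instances.
by_cpu = desired_instances(4, observed=90, target=70)
by_connections = desired_instances(4, observed=60, target=80)
scale_to = max(by_cpu, by_connections)  # satisfy the more demanding metric
```

Here CPU is above its target while connections are below theirs, so the CPU-based figure decides the new instance count.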

Best Practices

1. Use Health Checks

Always enable health checks:

{
  "healthCheck": {
    "enabled": true,
    "interval": "30s"
  }
}

2. Configure Timeouts

Set appropriate timeouts:

{
  "timeouts": {
    "connection": "10s",
    "request": "60s",
    "idle": "300s"
  }
}

3. Monitor Metrics

Track load distribution:

Requests per instance
Response times per instance
Error rates per instance
Connection counts

4. Test Failover

Verify that failover works:

# Simulate instance failure
swiftclaw test failover my-agent --instance instance-1

Automatic Failover: SwiftClaw automatically routes traffic away from unhealthy instances.
