Building Production AI Agents with Strands SDK and Amazon Bedrock AgentCore
AI agents are the next evolution of generative AI—systems that don’t just respond to prompts but can reason, plan, and take actions autonomously. With the release of Strands Agents SDK and Amazon Bedrock AgentCore, building and deploying production-ready AI agents has become significantly easier.
In this guide, I’ll walk you through building a complete AI agent and deploying it to AWS, from development to production.
What is Strands?
Strands is an open-source SDK from AWS for building AI agents. Unlike simple chatbots, Strands agents can:
- Use tools – Call APIs, query databases, execute code
- Maintain context – Remember conversation history and state
- Plan and reason – Break complex tasks into steps
- Stream responses – Provide real-time feedback
Key differentiators:
- Native AWS integration (Bedrock, Lambda, S3, etc.)
- Built-in observability with OpenTelemetry
- Production-ready deployment to Bedrock AgentCore
- Support for multiple LLM providers
Architecture Overview
| Layer | Component | Description |
|---|---|---|
| Application | Your application | Your code that calls the agent |
| SDK | Strands Agents SDK | Tools (actions), memory (context), planning (reasoning) |
| Runtime | Amazon Bedrock AgentCore | Runtime (Fargate), scaling (auto), monitoring (CloudWatch) |
| Models | Amazon Bedrock | Claude 3.5 Sonnet, Titan, Llama, Mistral |
Getting Started
Prerequisites
- Python 3.10+
- AWS account with Bedrock access
- AWS CLI configured
Installation
```bash
pip install strands-agents bedrock-agentcore
```
Building Your First Agent
Step 1: Basic Agent
```python
from strands import Agent

# Create a simple agent
agent = Agent(
    system_prompt="You are a helpful assistant that specializes in AWS cloud architecture."
)

# Run the agent
response = agent("What's the best way to set up a VPC for a microservices application?")
print(response.message)
```
Step 2: Adding Tools
Tools give your agent the ability to take actions. Let’s add tools for AWS cost analysis:
```python
from datetime import datetime

import boto3
from strands import Agent, tool


@tool
def get_ec2_instances(region: str = "us-east-1") -> dict:
    """Get a list of all EC2 instances in the specified region.

    Args:
        region: AWS region to query (default: us-east-1)

    Returns:
        Dictionary containing instance information
    """
    ec2 = boto3.client('ec2', region_name=region)
    response = ec2.describe_instances()
    instances = []
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instances.append({
                'id': instance['InstanceId'],
                'type': instance['InstanceType'],
                'state': instance['State']['Name'],
                'launch_time': str(instance.get('LaunchTime', 'N/A'))
            })
    return {'instances': instances, 'count': len(instances)}


@tool
def get_monthly_cost(service: str | None = None) -> dict:
    """Get AWS cost for the current month.

    Args:
        service: Optional AWS service name to filter (e.g., 'Amazon EC2')

    Returns:
        Dictionary containing cost information
    """
    ce = boto3.client('ce', region_name='us-east-1')
    end = datetime.now()
    start = end.replace(day=1)
    params = {
        'TimePeriod': {
            'Start': start.strftime('%Y-%m-%d'),
            'End': end.strftime('%Y-%m-%d')
        },
        'Granularity': 'MONTHLY',
        'Metrics': ['UnblendedCost']
    }
    if service:
        params['Filter'] = {
            'Dimensions': {
                'Key': 'SERVICE',
                'Values': [service]
            }
        }
    response = ce.get_cost_and_usage(**params)
    total = 0.0
    for result in response['ResultsByTime']:
        total += float(result['Total']['UnblendedCost']['Amount'])
    return {
        'total_cost': round(total, 2),
        'currency': 'USD',
        'period': f"{start.strftime('%Y-%m-%d')} to {end.strftime('%Y-%m-%d')}"
    }


# Create the agent with tools
agent = Agent(
    system_prompt="""You are an AWS cost optimization expert.
Use the available tools to analyze infrastructure and provide recommendations.
Always provide specific, actionable advice.""",
    tools=[get_ec2_instances, get_monthly_cost]
)

# The agent can now query AWS
response = agent("How much am I spending this month and what EC2 instances are running?")
print(response.message)
```
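One edge case worth handling in the cost tool: Cost Explorer treats `End` as exclusive and rejects an empty period, so computing `Start` as the first of the month fails when the code runs on the first day of the month (`Start == End`). A small helper that falls back to the previous full month in that case is one way to handle it (the fallback behavior is a design choice, not something Cost Explorer mandates):

```python
from datetime import date, timedelta


def month_to_date_period(today: date) -> tuple:
    """Return a (start, end) date-string pair for Cost Explorer, End exclusive.

    On the first of the month the month-to-date window would be empty,
    which Cost Explorer rejects, so fall back to the previous full month.
    """
    if today.day == 1:
        # Step back one day, then snap to the first of that (previous) month
        start = (today - timedelta(days=1)).replace(day=1)
        end = today
    else:
        start = today.replace(day=1)
        end = today
    return start.isoformat(), end.isoformat()
```

Pass the resulting pair as `Start` and `End` in the `TimePeriod` parameter instead of computing them inline.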
Step 3: Streaming Responses
For better UX, stream responses in real-time:
```python
import asyncio


async def chat_with_agent():
    agent = Agent(
        system_prompt="You are a helpful AWS architect.",
        tools=[get_ec2_instances, get_monthly_cost]
    )
    async for event in agent.stream_async("Analyze my AWS costs"):
        if hasattr(event, 'content'):
            print(event.content, end='', flush=True)
        elif hasattr(event, 'tool_use'):
            print(f"\n[Using tool: {event.tool_use.name}]")


# Run
asyncio.run(chat_with_agent())
```
Deploying to Bedrock AgentCore
Option 1: Quick Deploy with Starter Toolkit
The fastest way to get to production:
```bash
# Install the toolkit
pip install bedrock-agentcore-starter-toolkit

# Configure your agent
agentcore configure --entrypoint my_agent.py

# Test locally (requires Docker)
agentcore launch --local

# Deploy to AWS
agentcore launch

# Test the deployed agent
agentcore invoke '{"prompt": "Analyze my AWS costs"}'
```
Your agent file (my_agent.py):
```python
import boto3
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent, tool

app = BedrockAgentCoreApp()


@tool
def get_ec2_instances(region: str = "us-east-1") -> dict:
    """Get EC2 instances in the specified region."""
    ec2 = boto3.client('ec2', region_name=region)
    response = ec2.describe_instances()
    instances = []
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instances.append({
                'id': instance['InstanceId'],
                'type': instance['InstanceType'],
                'state': instance['State']['Name']
            })
    return {'instances': instances}


agent = Agent(
    system_prompt="You are an AWS infrastructure expert.",
    tools=[get_ec2_instances]
)


@app.entrypoint
def invoke(payload):
    user_message = payload.get("prompt", "Hello")
    result = agent(user_message)
    return {"result": result.message}


# The app has a single entrypoint. To stream instead of returning one
# result, replace invoke() with an async generator entrypoint:
#
# @app.entrypoint
# async def stream(payload):
#     user_message = payload.get("prompt", "Hello")
#     async for event in agent.stream_async(user_message):
#         yield event


if __name__ == "__main__":
    app.run()
```
Option 2: Custom Deployment with Docker
For more control, build your own container:
agent.py:
```python
import json

import boto3
from fastapi import FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from strands import Agent, tool

app = FastAPI(title="AWS Cost Agent", version="1.0.0")


# Define tools
@tool
def analyze_costs() -> dict:
    """Analyze current AWS spending."""
    ce = boto3.client('ce', region_name='us-east-1')
    # ... implementation
    return {"total": 1234.56, "currency": "USD"}


@tool
def get_recommendations() -> dict:
    """Get cost optimization recommendations."""
    # ... implementation
    return {"recommendations": [...]}


# Initialize agent
agent = Agent(
    system_prompt="You are an AWS cost optimization expert.",
    tools=[analyze_costs, get_recommendations]
)


class InvocationRequest(BaseModel):
    input: dict


class InvocationResponse(BaseModel):
    output: dict


@app.post("/invocations", response_model=InvocationResponse)
async def invoke_agent(request: InvocationRequest):
    """Synchronous agent invocation."""
    try:
        prompt = request.input.get("prompt", "")
        if not prompt:
            raise HTTPException(status_code=400, detail="No prompt provided")
        result = agent(prompt)
        return InvocationResponse(output={"message": result.message})
    except HTTPException:
        # Re-raise so the 400 above isn't swallowed into a 500
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.post("/invocations/stream")
async def stream_agent(request: InvocationRequest):
    """Streaming agent invocation."""
    async def generate():
        prompt = request.input.get("prompt", "")
        async for event in agent.stream_async(prompt):
            if hasattr(event, 'content'):
                yield f"data: {json.dumps({'content': event.content})}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(
        generate(),
        media_type="text/event-stream"
    )


@app.get("/ping")
async def ping():
    """Health check endpoint (required by AgentCore)."""
    return {"status": "healthy"}
```
Dockerfile:
```dockerfile
FROM --platform=linux/arm64 ghcr.io/astral-sh/uv:python3.11-bookworm-slim

WORKDIR /app

# Install dependencies
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-cache

# Copy application
COPY agent.py ./

# AgentCore requires port 8080
EXPOSE 8080

# Run with OpenTelemetry for observability
CMD ["opentelemetry-instrument", "uv", "run", "uvicorn", "agent:app", "--host", "0.0.0.0", "--port", "8080"]
```
pyproject.toml:
```toml
[project]
name = "aws-cost-agent"
version = "1.0.0"
requires-python = ">=3.11"
dependencies = [
    "strands-agents>=0.1.0",
    "fastapi>=0.109.0",
    "uvicorn[standard]>=0.27.0",
    "boto3>=1.34.0",
    "aws-opentelemetry-distro>=0.10.1",
]
```
Build and deploy:
```bash
# Build for ARM64 (required by AgentCore)
docker buildx build --platform linux/arm64 -t aws-cost-agent:latest .

# Test locally
docker run -p 8080:8080 \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -e AWS_REGION="us-east-1" \
  aws-cost-agent:latest

# Push to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com

docker tag aws-cost-agent:latest $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/aws-cost-agent:latest
docker push $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/aws-cost-agent:latest
```
Create AgentCore Runtime with Terraform
```hcl
# agentcore.tf
resource "aws_iam_role" "agent_runtime" {
  name = "AgentCoreRuntime"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "bedrock-agentcore.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy" "agent_permissions" {
  name = "AgentPermissions"
  role = aws_iam_role.agent_runtime.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "ce:GetCostAndUsage",
          "ec2:DescribeInstances"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Resource = "*"
      }
    ]
  })
}

# Note: the AgentCore runtime itself may need to be created with boto3,
# as Terraform provider support is still evolving.
```
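Until provider support catches up, the runtime itself can be created through the control-plane API. A minimal sketch, assuming the `bedrock-agentcore-control` boto3 client and its `create_agent_runtime` operation as shipped at launch; the account ID, image URI, and role ARN in the usage note are placeholders:

```python
def build_runtime_request(name: str, image_uri: str, role_arn: str) -> dict:
    """Assemble a create_agent_runtime request for a container-based runtime."""
    return {
        "agentRuntimeName": name,
        "agentRuntimeArtifact": {
            "containerConfiguration": {"containerUri": image_uri}
        },
        "networkConfiguration": {"networkMode": "PUBLIC"},
        "roleArn": role_arn,
    }


def create_runtime(request: dict) -> str:
    """Create the runtime and return its ARN."""
    import boto3  # imported lazily so build_runtime_request stays usable offline

    client = boto3.client("bedrock-agentcore-control", region_name="us-east-1")
    response = client.create_agent_runtime(**request)
    return response["agentRuntimeArn"]


# Usage sketch (placeholder account ID, image, and role):
#     request = build_runtime_request(
#         "aws_cost_agent",
#         "123456789012.dkr.ecr.us-east-1.amazonaws.com/aws-cost-agent:latest",
#         "arn:aws:iam::123456789012:role/AgentCoreRuntime",
#     )
#     print(create_runtime(request))
```

Splitting request assembly from the API call keeps the shape of the request easy to inspect and test before anything touches AWS.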
Invoking Your Deployed Agent
```python
import json
import uuid

import boto3


def invoke_agent(prompt: str, session_id: str | None = None):
    """Invoke the deployed agent."""
    client = boto3.client('bedrock-agentcore', region_name='us-east-1')
    if not session_id:
        session_id = str(uuid.uuid4())
    response = client.invoke_agent_runtime(
        agentRuntimeArn='arn:aws:bedrock-agentcore:us-east-1:123456789:runtime/aws-cost-agent',
        runtimeSessionId=session_id,
        payload=json.dumps({
            "input": {"prompt": prompt}
        }),
        qualifier="DEFAULT"
    )
    result = json.loads(response['payload'].read())
    return result


# Use it
response = invoke_agent("What's my AWS spending this month?")
print(response['output']['message'])
```
Observability with CloudWatch
AgentCore integrates with CloudWatch Transaction Search for tracing:
1. Enable Transaction Search in the CloudWatch console
2. Add OpenTelemetry instrumentation to your agent:

   ```dockerfile
   # In your Dockerfile CMD
   CMD ["opentelemetry-instrument", "uvicorn", "agent:app", "--host", "0.0.0.0", "--port", "8080"]
   ```

3. View traces in CloudWatch Application Signals
Key metrics to monitor:
- Invocation latency
- Tool execution time
- Token usage
- Error rates
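Latency, token usage, and error rates can also be published as custom CloudWatch metrics so you can alarm on them. A minimal sketch; the `AgentMetrics` namespace is a made-up example, and how you obtain token counts depends on what usage data your agent exposes:

```python
def build_token_metrics(input_tokens: int, output_tokens: int, latency_s: float) -> list:
    """Build a CloudWatch MetricData payload for one agent invocation."""
    return [
        {"MetricName": "InputTokens", "Value": float(input_tokens), "Unit": "Count"},
        {"MetricName": "OutputTokens", "Value": float(output_tokens), "Unit": "Count"},
        {"MetricName": "InvocationLatency", "Value": latency_s, "Unit": "Seconds"},
    ]


def publish_invocation_metrics(input_tokens: int, output_tokens: int, latency_s: float) -> None:
    """Push the metrics to CloudWatch under a custom namespace."""
    import boto3  # imported lazily so build_token_metrics stays usable offline

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    cloudwatch.put_metric_data(
        Namespace="AgentMetrics",  # hypothetical namespace; pick your own
        MetricData=build_token_metrics(input_tokens, output_tokens, latency_s),
    )


# Usage sketch around an invocation:
#     start = time.perf_counter()
#     result = agent(prompt)
#     publish_invocation_metrics(1200, 350, time.perf_counter() - start)
```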
Best Practices
1. Tool Design
```python
@tool
def good_tool(specific_param: str, optional_param: int = 10) -> dict:
    """Clear description of what this tool does.

    Args:
        specific_param: Exactly what this parameter expects
        optional_param: What this controls (default: 10)

    Returns:
        Dictionary with well-defined structure
    """
    # Implementation with error handling
    try:
        result = do_something(specific_param)
        return {"status": "success", "data": result}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
2. Error Handling
```python
import logging

import botocore.exceptions

logger = logging.getLogger(__name__)


@app.post("/invocations")
async def invoke_agent(request: InvocationRequest):
    try:
        result = agent(request.input.get("prompt", ""))
        return {"output": {"message": result.message}}
    except (botocore.exceptions.BotoCoreError, botocore.exceptions.ClientError) as e:
        # AWS-specific errors (ClientError covers failed API calls)
        raise HTTPException(status_code=503, detail=f"AWS error: {str(e)}")
    except Exception:
        # Log for debugging, but don't leak internals to the caller
        logger.exception("Agent invocation failed")
        raise HTTPException(status_code=500, detail="Internal error")
```
3. Session Management
```python
# Use consistent session IDs for conversation continuity
from strands import Agent
from strands.memory import ConversationMemory

agent = Agent(
    system_prompt="…",
    memory=ConversationMemory(max_messages=20)
)

# Each session maintains its own context
session_agents = {}


def get_agent_for_session(session_id: str) -> Agent:
    if session_id not in session_agents:
        session_agents[session_id] = Agent(
            system_prompt="…",
            memory=ConversationMemory()
        )
    return session_agents[session_id]
```
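One caveat with a plain in-process dict: it grows without bound as sessions accumulate. A small TTL-based store keeps memory bounded. This is a framework-agnostic sketch, so it takes a generic `factory` callable rather than constructing an `Agent` directly, and the 30-minute TTL is an arbitrary default:

```python
import time


class SessionStore:
    """Keep one object per session, evicting sessions idle longer than ttl_s."""

    def __init__(self, ttl_s: float = 1800.0):
        self.ttl_s = ttl_s
        self._sessions = {}  # session_id -> (object, last_used_timestamp)

    def get(self, session_id: str, factory):
        """Return the session's object, creating it with factory() if needed."""
        self._evict_stale()
        obj, _ = self._sessions.get(session_id, (None, None))
        if obj is None:
            obj = factory()
        # Refresh the last-used timestamp on every access
        self._sessions[session_id] = (obj, time.monotonic())
        return obj

    def _evict_stale(self):
        now = time.monotonic()
        stale = [sid for sid, (_, ts) in self._sessions.items() if now - ts > self.ttl_s]
        for sid in stale:
            del self._sessions[sid]


# Usage sketch: store = SessionStore()
#               agent = store.get(session_id, make_agent_for_new_session)
```

For multiple runtime instances you would move this state out of process (e.g., rebuild conversation history from a shared store), since each container holds its own dict.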
Cost Considerations
| Component | Pricing |
|---|---|
| Bedrock AgentCore | Per invocation + compute time |
| Bedrock LLM (Claude 3.5 Sonnet) | $3/M input, $15/M output tokens |
| ECR Storage | ~$0.10/GB/month |
| CloudWatch Logs | $0.50/GB ingested |
Optimization tips:
- Route simple tasks to a smaller, cheaper model (e.g., Claude 3 Haiku)
- Implement response caching
- Set appropriate timeouts
- Monitor token usage
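Response caching can be as simple as memoizing identical prompts for a short window, which avoids paying for the same LLM call twice. A minimal in-memory sketch; the TTL and the hash-based key scheme are arbitrary choices, and exact-match caching only helps with repeated identical prompts (semantically similar queries need a semantic cache):

```python
import hashlib
import time


class ResponseCache:
    """Memoize agent responses by prompt hash, with a time-to-live."""

    def __init__(self, ttl_s: float = 300.0):
        self.ttl_s = ttl_s
        self._cache = {}  # key -> (response, stored_at)

    def get_or_invoke(self, prompt: str, invoke):
        """Return a cached response for prompt, or call invoke(prompt) and cache it."""
        key = hashlib.sha256(prompt.encode()).hexdigest()
        hit = self._cache.get(key)
        if hit is not None and time.monotonic() - hit[1] < self.ttl_s:
            return hit[0]
        response = invoke(prompt)
        self._cache[key] = (response, time.monotonic())
        return response


# Usage sketch: cache = ResponseCache()
#               answer = cache.get_or_invoke(user_prompt, agent)
```

Note that caching only makes sense for read-only queries; tool calls with side effects should always go through to the agent.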
What’s Next?
- Multi-agent systems – Agents that collaborate
- Knowledge bases – RAG with Bedrock Knowledge Bases
- Guardrails – Content filtering and safety
- Custom models – Fine-tuned models for specific domains
Have you built agents with Strands? Share your use cases in the comments!
About the author: David Petrocelli is a Senior Cloud Architect at Caylent, holds a PhD in Computer Science, and is a university professor. He specializes in AWS architecture and generative AI applications.