Coffee, tech and community


Building Production AI Agents with Strands SDK and Amazon Bedrock AgentCore

AI agents are the next evolution of generative AI—systems that don’t just respond to prompts but can reason, plan, and take actions autonomously. With the release of Strands Agents SDK and Amazon Bedrock AgentCore, building and deploying production-ready AI agents has become significantly easier.

In this guide, I’ll walk you through building a complete AI agent and deploying it to AWS, from development to production.

What is Strands?

Strands is an open-source SDK from AWS for building AI agents. Unlike simple chatbots, Strands agents can:

  • Use tools – Call APIs, query databases, execute code
  • Maintain context – Remember conversation history and state
  • Plan and reason – Break complex tasks into steps
  • Stream responses – Provide real-time feedback

Key differentiators:

  • Native AWS integration (Bedrock, Lambda, S3, etc.)
  • Built-in observability with OpenTelemetry
  • Production-ready deployment to Bedrock AgentCore
  • Support for multiple LLM providers

Architecture Overview

  • Application layer – Your application: the code that uses the agent
  • SDK layer – Strands Agents SDK: tools (actions), memory (context), planning (reasoning)
  • Runtime layer – Amazon Bedrock AgentCore: runtime (Fargate), scaling (auto), monitoring (CloudWatch)
  • Model layer – Amazon Bedrock: Claude 3.5 Sonnet, Titan, Llama, Mistral

Getting Started

Prerequisites

  • Python 3.10+
  • AWS account with Bedrock access
  • AWS CLI configured

Installation

pip install strands-agents bedrock-agentcore

Building Your First Agent

Step 1: Basic Agent

from strands import Agent

# Create a simple agent
agent = Agent(
    system_prompt="You are a helpful assistant that specializes in AWS cloud architecture."
)

# Run the agent
response = agent("What's the best way to set up a VPC for a microservices application?")
print(response.message)

Step 2: Adding Tools

Tools give your agent the ability to take actions. Let’s add tools for AWS cost analysis:

from datetime import datetime

import boto3
from strands import Agent, tool


@tool
def get_ec2_instances(region: str = "us-east-1") -> dict:
    """Get a list of all EC2 instances in the specified region.

    Args:
        region: AWS region to query (default: us-east-1)

    Returns:
        Dictionary containing instance information
    """
    ec2 = boto3.client('ec2', region_name=region)
    response = ec2.describe_instances()
    instances = []
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instances.append({
                'id': instance['InstanceId'],
                'type': instance['InstanceType'],
                'state': instance['State']['Name'],
                'launch_time': str(instance.get('LaunchTime', 'N/A'))
            })
    return {'instances': instances, 'count': len(instances)}


@tool
def get_monthly_cost(service: str = None) -> dict:
    """Get AWS cost for the current month.

    Args:
        service: Optional AWS service name to filter (e.g., 'Amazon EC2')

    Returns:
        Dictionary containing cost information
    """
    ce = boto3.client('ce', region_name='us-east-1')
    end = datetime.now()
    start = end.replace(day=1)
    params = {
        'TimePeriod': {
            'Start': start.strftime('%Y-%m-%d'),
            'End': end.strftime('%Y-%m-%d')
        },
        'Granularity': 'MONTHLY',
        'Metrics': ['UnblendedCost']
    }
    if service:
        params['Filter'] = {
            'Dimensions': {
                'Key': 'SERVICE',
                'Values': [service]
            }
        }
    response = ce.get_cost_and_usage(**params)
    total = 0
    for result in response['ResultsByTime']:
        total += float(result['Total']['UnblendedCost']['Amount'])
    return {
        'total_cost': round(total, 2),
        'currency': 'USD',
        'period': f"{start.strftime('%Y-%m-%d')} to {end.strftime('%Y-%m-%d')}"
    }


# Create agent with tools
agent = Agent(
    system_prompt="""You are an AWS cost optimization expert.
    Use the available tools to analyze infrastructure and provide recommendations.
    Always provide specific, actionable advice.""",
    tools=[get_ec2_instances, get_monthly_cost]
)

# The agent can now query AWS
response = agent("How much am I spending this month and what EC2 instances are running?")
print(response.message)

Step 3: Streaming Responses

For better UX, stream responses in real-time:

import asyncio

async def chat_with_agent():
    agent = Agent(
        system_prompt="You are a helpful AWS architect.",
        tools=[get_ec2_instances, get_monthly_cost]
    )
    async for event in agent.stream_async("Analyze my AWS costs"):
        if hasattr(event, 'content'):
            print(event.content, end='', flush=True)
        elif hasattr(event, 'tool_use'):
            print(f"\n[Using tool: {event.tool_use.name}]")

# Run
asyncio.run(chat_with_agent())

Deploying to Bedrock AgentCore

Option 1: Quick Deploy with Starter Toolkit

The fastest way to get to production:

# Install the toolkit
pip install bedrock-agentcore-starter-toolkit

# Configure your agent
agentcore configure --entrypoint my_agent.py

# Test locally (requires Docker)
agentcore launch --local

# Deploy to AWS
agentcore launch

# Test the deployed agent
agentcore invoke '{"prompt": "Analyze my AWS costs"}'

Your agent file (my_agent.py):

from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent, tool
import boto3

app = BedrockAgentCoreApp()

@tool
def get_ec2_instances(region: str = "us-east-1") -> dict:
    """Get EC2 instances in the specified region."""
    ec2 = boto3.client('ec2', region_name=region)
    response = ec2.describe_instances()
    instances = []
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instances.append({
                'id': instance['InstanceId'],
                'type': instance['InstanceType'],
                'state': instance['State']['Name']
            })
    return {'instances': instances}

agent = Agent(
    system_prompt="You are an AWS infrastructure expert.",
    tools=[get_ec2_instances]
)

@app.entrypoint
def invoke(payload):
    user_message = payload.get("prompt", "Hello")
    result = agent(user_message)
    return {"result": result.message}

@app.entrypoint
async def stream(payload):
    user_message = payload.get("prompt", "Hello")
    async for event in agent.stream_async(user_message):
        yield event

if __name__ == "__main__":
    app.run()

Option 2: Custom Deployment with Docker

For more control, build your own container:

agent.py:

import json

import boto3
from fastapi import FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from strands import Agent, tool

app = FastAPI(title="AWS Cost Agent", version="1.0.0")

# Define tools
@tool
def analyze_costs() -> dict:
    """Analyze current AWS spending."""
    ce = boto3.client('ce', region_name='us-east-1')
    # ... implementation
    return {"total": 1234.56, "currency": "USD"}

@tool
def get_recommendations() -> dict:
    """Get cost optimization recommendations."""
    # ... implementation
    return {"recommendations": [...]}

# Initialize agent
agent = Agent(
    system_prompt="You are an AWS cost optimization expert.",
    tools=[analyze_costs, get_recommendations]
)

class InvocationRequest(BaseModel):
    input: dict

class InvocationResponse(BaseModel):
    output: dict

@app.post("/invocations", response_model=InvocationResponse)
async def invoke_agent(request: InvocationRequest):
    """Synchronous agent invocation."""
    try:
        prompt = request.input.get("prompt", "")
        if not prompt:
            raise HTTPException(status_code=400, detail="No prompt provided")
        result = agent(prompt)
        return InvocationResponse(output={"message": result.message})
    except HTTPException:
        # Let deliberate HTTP errors (like the 400 above) pass through unchanged
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/invocations/stream")
async def stream_agent(request: InvocationRequest):
    """Streaming agent invocation."""
    async def generate():
        prompt = request.input.get("prompt", "")
        async for event in agent.stream_async(prompt):
            if hasattr(event, 'content'):
                yield f"data: {json.dumps({'content': event.content})}\n\n"
        yield "data: [DONE]\n\n"
    return StreamingResponse(
        generate(),
        media_type="text/event-stream"
    )

@app.get("/ping")
async def ping():
    """Health check endpoint (required by AgentCore)."""
    return {"status": "healthy"}

Dockerfile:

FROM --platform=linux/arm64 ghcr.io/astral-sh/uv:python3.11-bookworm-slim

WORKDIR /app

# Install dependencies
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-cache

# Copy application
COPY agent.py ./

# AgentCore requires port 8080
EXPOSE 8080

# Run with OpenTelemetry for observability
CMD ["opentelemetry-instrument", "uv", "run", "uvicorn", "agent:app", "--host", "0.0.0.0", "--port", "8080"]

pyproject.toml:

[project]
name = "aws-cost-agent"
version = "1.0.0"
requires-python = ">=3.11"
dependencies = [
    "strands-agents>=0.1.0",
    "fastapi>=0.109.0",
    "uvicorn[standard]>=0.27.0",
    "boto3>=1.34.0",
    "aws-opentelemetry-distro>=0.10.1",
]

Build and deploy:

# Build for ARM64 (required by AgentCore)
docker buildx build --platform linux/arm64 -t aws-cost-agent:latest .

# Test locally
docker run -p 8080:8080 \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -e AWS_REGION="us-east-1" \
  aws-cost-agent:latest

# Push to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com

docker tag aws-cost-agent:latest $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/aws-cost-agent:latest
docker push $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/aws-cost-agent:latest

Create AgentCore Runtime with Terraform

# agentcore.tf
resource "aws_iam_role" "agent_runtime" {
  name = "AgentCoreRuntime"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "bedrock-agentcore.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy" "agent_permissions" {
  name = "AgentPermissions"
  role = aws_iam_role.agent_runtime.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "ce:GetCostAndUsage",
          "ec2:DescribeInstances"
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Resource = "*"
      }
    ]
  })
}

# Note: AgentCore resources may need boto3 deployment
# as Terraform provider support is still evolving

Invoking Your Deployed Agent

import json
import uuid

import boto3

def invoke_agent(prompt: str, session_id: str = None):
    """Invoke the deployed agent."""
    client = boto3.client('bedrock-agentcore', region_name='us-east-1')
    if not session_id:
        session_id = str(uuid.uuid4())
    response = client.invoke_agent_runtime(
        agentRuntimeArn='arn:aws:bedrock-agentcore:us-east-1:123456789:runtime/aws-cost-agent',
        runtimeSessionId=session_id,
        payload=json.dumps({
            "input": {"prompt": prompt}
        }),
        qualifier="DEFAULT"
    )
    result = json.loads(response['payload'].read())
    return result

# Use it
response = invoke_agent("What's my AWS spending this month?")
print(response['output']['message'])

Observability with CloudWatch

AgentCore integrates with CloudWatch Transaction Search for tracing:

  1. Enable Transaction Search in the CloudWatch console
  2. Add OpenTelemetry to your agent:

# In your Dockerfile CMD
CMD ["opentelemetry-instrument", "uvicorn", "agent:app", "--host", "0.0.0.0", "--port", "8080"]

  3. View traces in CloudWatch Application Signals

Key metrics to monitor:

  • Invocation latency
  • Tool execution time
  • Token usage
  • Error rates
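As a sketch of how those metrics could be pulled programmatically, the helper below assembles query entries for CloudWatch's GetMetricData API. The namespace `AgentCore/Runtime` and the metric names are illustrative assumptions, not documented AgentCore names; substitute whatever your traces actually emit.

```python
def build_metric_query(metric_name: str, namespace: str = "AgentCore/Runtime",
                       stat: str = "p95", period: int = 300) -> dict:
    """Build one GetMetricData query entry for a (hypothetical) agent metric."""
    return {
        # Query ids must start with a lowercase letter
        "Id": metric_name.lower().replace(".", "_"),
        "MetricStat": {
            "Metric": {"Namespace": namespace, "MetricName": metric_name},
            "Period": period,   # seconds per datapoint
            "Stat": stat,       # e.g. "p95", "Average", "Sum"
        },
    }


queries = [build_metric_query(name) for name in
           ["InvocationLatency", "ToolExecutionTime", "TokenUsage", "Errors"]]

# These would then be passed to CloudWatch, e.g.:
# cloudwatch = boto3.client("cloudwatch")
# cloudwatch.get_metric_data(MetricDataQueries=queries,
#                            StartTime=one_hour_ago, EndTime=now)
print(queries[0]["Id"])  # invocationlatency
```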

Best Practices

1. Tool Design

@tool
def good_tool(specific_param: str, optional_param: int = 10) -> dict:
    """Clear description of what this tool does.

    Args:
        specific_param: Exactly what this parameter expects
        optional_param: What this controls (default: 10)

    Returns:
        Dictionary with well-defined structure
    """
    # Implementation with error handling
    try:
        result = do_something(specific_param)
        return {"status": "success", "data": result}
    except Exception as e:
        return {"status": "error", "message": str(e)}

2. Error Handling

import logging

import botocore.exceptions
from fastapi import HTTPException

logger = logging.getLogger(__name__)

@app.post("/invocations")
async def invoke_agent(request: InvocationRequest):
    try:
        result = agent(request.input.get("prompt", ""))
        return {"output": {"message": result.message}}
    except (botocore.exceptions.BotoCoreError, botocore.exceptions.ClientError) as e:
        # AWS-specific errors (service calls surface as botocore exceptions)
        raise HTTPException(status_code=503, detail=f"AWS error: {str(e)}")
    except Exception:
        # Log for debugging
        logger.exception("Agent invocation failed")
        raise HTTPException(status_code=500, detail="Internal error")

3. Session Management

# Use consistent session IDs for conversation continuity
from strands import Agent
from strands.memory import ConversationMemory

agent = Agent(
    system_prompt="…",
    memory=ConversationMemory(max_messages=20)
)

# Each session maintains its own context
session_agents = {}

def get_agent_for_session(session_id: str) -> Agent:
    if session_id not in session_agents:
        session_agents[session_id] = Agent(
            system_prompt="…",
            memory=ConversationMemory()
        )
    return session_agents[session_id]
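One caveat with a plain per-session dict like the one above: it grows without bound. A minimal sketch of time-based eviction, generic over any agent factory (the 30-minute TTL is an arbitrary choice, not a Strands or AgentCore default):

```python
import time


class SessionStore:
    """Keep one agent per session, evicting sessions idle longer than ttl_seconds."""

    def __init__(self, factory, ttl_seconds: float = 1800.0):
        self.factory = factory          # callable that builds a fresh agent
        self.ttl = ttl_seconds
        self._sessions = {}             # session_id -> (agent, last_used)

    def get(self, session_id: str):
        now = time.monotonic()
        # Drop sessions that have been idle past the TTL
        expired = [sid for sid, (_, last) in self._sessions.items()
                   if now - last > self.ttl]
        for sid in expired:
            del self._sessions[sid]
        agent, _ = self._sessions.get(session_id, (None, None))
        if agent is None:
            agent = self.factory()
        self._sessions[session_id] = (agent, now)
        return agent
```

Usage would mirror `get_agent_for_session`: build the store once with `SessionStore(lambda: Agent(...))` and call `store.get(session_id)` per request.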

Cost Considerations

  • Bedrock AgentCore – per invocation + compute time
  • Bedrock LLM (Claude 3.5 Sonnet) – $3/M input tokens, $15/M output tokens
  • ECR Storage – ~$0.10/GB/month
  • CloudWatch Logs – $0.50/GB ingested
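Using the Claude 3.5 Sonnet rates above, here's a quick back-of-the-envelope estimator for the LLM portion of the bill (AgentCore compute, ECR, and CloudWatch charges come on top):

```python
# Claude 3.5 Sonnet on Bedrock: USD per 1M tokens (rates from the table above)
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00


def llm_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the LLM token cost in USD for a workload."""
    return round(
        input_tokens / 1_000_000 * INPUT_RATE
        + output_tokens / 1_000_000 * OUTPUT_RATE,
        2,
    )


# Example: 10,000 invocations x (2,000 input + 500 output tokens each)
print(llm_cost(10_000 * 2_000, 10_000 * 500))  # 135.0
```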

Optimization tips:

  • Use a smaller, cheaper model (e.g., Claude 3 Haiku) for simple tasks
  • Implement response caching
  • Set appropriate timeouts
  • Monitor token usage
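The caching tip can start as simply as memoizing exact-match prompts in front of the agent. A sketch (here `agent` is any callable, like the Strands agents above; a real deployment would bound the cache and think about prompt normalization and staleness):

```python
from functools import lru_cache


def make_cached_agent(agent, maxsize: int = 256):
    """Wrap an agent callable with an exact-match prompt cache."""

    @lru_cache(maxsize=maxsize)
    def cached(prompt: str):
        return agent(prompt)

    return cached


# Example with a stand-in "agent": repeated prompts skip the expensive call
calls = []

def fake_agent(prompt):
    calls.append(prompt)
    return f"answer to: {prompt}"

cached = make_cached_agent(fake_agent)
cached("What's my spend?")
cached("What's my spend?")
print(len(calls))  # 1 (the second call was served from the cache)
```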

What’s Next?

  • Multi-agent systems – Agents that collaborate
  • Knowledge bases – RAG with Bedrock Knowledge Bases
  • Guardrails – Content filtering and safety
  • Custom models – Fine-tuned models for specific domains

Have you built agents with Strands? Share your use cases in the comments!


About the author: David Petrocelli is a Senior Cloud Architect at Caylent, holds a PhD in Computer Science, and is a university professor. He specializes in AWS architecture and generative AI applications.
