Observability Guide
Monitor and debug your LayerCode agents with structured logging and observability tools.
Logging
Loguru Integration
The toolkit uses Loguru for structured logging.
Basic Logging
from loguru import logger
logger.info("Processing webhook")
logger.warning("Rate limit approaching")
logger.error("Failed to connect to API")
Structured Logging
# bind() attaches the fields to the record's "extra" dict
logger.bind(
    call_id="call_123",
    agent="starter",
    response_length=150,
    latency_ms=450,
).info("Agent response generated")
Log Levels
Configure log level with the --verbose flag:
# INFO level (default)
uv run layercode-create-app run --agent starter
# DEBUG level
uv run layercode-create-app run --agent starter --verbose
Log Files
Configure log file rotation:
from loguru import logger

logger.add(
    "logs/app.log",
    rotation="500 MB",
    retention="10 days",
    level="INFO",
    format="{time:YYYY-MM-DD HH:mm:ss} | {level} | {message}",
    serialize=True,  # JSON format
)
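With serialize=True, each line of the log file is a JSON object, which makes it straightforward to filter after the fact. The following is a minimal sketch, assuming Loguru's serialized layout in which the record sits under a "record" key with "level", "message", and "extra" fields. The helper name filter_records and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_records(lines: Iterable[str], min_level_no: int = 30) -> Iterator[dict]:
    """Yield the "record" object of each JSON log line at or above min_level_no.

    Assumes the layout {"text": ..., "record": {...}} written by serialize=True.
    Loguru's default level numbers are INFO=20, WARNING=30, ERROR=40.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)["record"]
            level_no = record["level"]["no"]
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # skip lines that are not serialized records
        if level_no >= min_level_no:
            yield record


# Hypothetical sample lines, trimmed to the fields used here.
sample = [
    json.dumps({"text": "ok\n", "record": {
        "level": {"name": "INFO", "no": 20},
        "message": "Processing webhook",
        "extra": {"call_id": "call_123"}}}),
    json.dumps({"text": "bad\n", "record": {
        "level": {"name": "ERROR", "no": 40},
        "message": "Failed to connect to API",
        "extra": {"call_id": "call_123"}}}),
    "not json",
]

errors = list(filter_records(sample))
```

In practice you would pass an open file handle for logs/app.log instead of the sample list.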
Logfire
Logfire provides advanced observability for FastAPI and PydanticAI.
Setup
- Sign up at logfire.pydantic.dev
- Get your API token
- Add the token to your .env file (the Logfire SDK reads it from the LOGFIRE_TOKEN environment variable)
Features
Logfire automatically instruments:
- FastAPI requests - Request/response timing, status codes
- PydanticAI agents - Agent runs, tool calls, model usage
- Database queries - SQL queries and timing
- External API calls - HTTP request tracking
Viewing Data
Access your Logfire dashboard to:
- View request traces
- Analyze agent performance
- Monitor error rates
- Track model token usage
- Set up alerts
Custom Spans
Add custom instrumentation:
import logfire

async def process_order(order_id: str):
    with logfire.span("process_order", order_id=order_id):
        # Your order processing logic
        order = await fetch_order(order_id)
        with logfire.span("validate_order"):
            validate(order)
        with logfire.span("charge_payment"):
            await charge_customer(order)
        return order
Performance Monitoring
Track agent performance:
import logfire
from time import perf_counter

async def process(self, event: LayercodeEvent) -> str:
    start = perf_counter()  # monotonic clock, suited to measuring durations
    try:
        result = await self.agent.run(event.transcript)
        logfire.info(
            "Agent completed",
            duration_ms=(perf_counter() - start) * 1000,
            call_id=event.call_id,
            agent=self.__class__.__name__,
        )
        return result.data
    except Exception as e:
        logfire.error(
            "Agent failed",
            error=str(e),
            call_id=event.call_id,
        )
        raise
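If several methods need the same timing logic, it can be factored into a context manager. This is a minimal sketch using only the standard library. The names timed and record are hypothetical, and the emitter is a stand-in: in the toolkit you might pass logfire.info or a Loguru call instead.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def timed(name: str, emit: Callable[..., None]) -> Iterator[None]:
    """Time the wrapped block and report the duration through emit, even on failure."""
    start = time.perf_counter()
    try:
        yield
    except Exception as exc:
        emit(
            f"{name} failed",
            duration_ms=(time.perf_counter() - start) * 1000,
            error=str(exc),
        )
        raise  # re-raise so callers still see the original exception
    else:
        emit(f"{name} completed", duration_ms=(time.perf_counter() - start) * 1000)


# Stand-in emitter that collects events in memory.
events = []


def record(message: str, **fields) -> None:
    events.append((message, fields))


with timed("Agent", record):
    sum(range(1000))
```

Because the failure branch re-raises, wrapping a call in timed does not change its error behavior.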
Metrics
Key Metrics to Track
- Request Metrics
  - Requests per minute
  - Response time (p50, p95, p99)
  - Error rate
- Agent Metrics
  - Agent response time
  - Tool call frequency
  - Token usage
  - Success rate
- System Metrics
  - CPU usage
  - Memory usage
  - Network I/O
  - Disk usage
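To make the request metrics above concrete, here is a minimal sketch of how p50/p95/p99 and the error rate can be computed from raw samples. It uses the nearest-rank percentile definition, which is one of several common conventions, so dashboards may report slightly different values. The function names and sample data are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float], errors: int) -> dict:
    """Summarize one reporting window of request latencies and failures."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "error_rate": errors / len(latencies_ms),
    }


# 100 hypothetical latencies (1 ms, 2 ms, ..., 100 ms) with 3 failed requests.
summary = summarize([float(n) for n in range(1, 101)], errors=3)
```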
Custom Metrics
Track custom metrics with Logfire:
import logfire

class MetricsCollector:
    def __init__(self):
        # Create each counter once; Logfire accumulates the totals
        self.requests = logfire.metric_counter("requests_total")
        self.errors = logfire.metric_counter("errors_total")

    async def track_request(self):
        self.requests.add(1)

    async def track_error(self):
        self.errors.add(1)

metrics = MetricsCollector()
Alerting
Logfire Alerts
Configure alerts in your Logfire dashboard:
- High Error Rate
  - Condition: Error rate > 5%
  - Action: Send email/Slack notification
- Slow Response Times
  - Condition: p95 latency > 2s
  - Action: Page on-call engineer
- High Token Usage
  - Condition: Token usage > 1M/hour
  - Action: Send warning
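The same thresholds can also be written down in code, for example to unit-test them or to reuse them in custom alerting. The sketch below is not Logfire's alert API. It is a plain-Python illustration in which the AlertRule class, the metric keys, and the action labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str
    metric: str       # key to look up in the metrics dict
    threshold: float  # rule fires when the metric is strictly above this
    action: str


# Thresholds mirror the conditions listed above; metric keys are assumptions.
RULES = [
    AlertRule("High Error Rate", "error_rate", 0.05, "notify"),
    AlertRule("Slow Response Times", "p95_latency_s", 2.0, "page"),
    AlertRule("High Token Usage", "tokens_per_hour", 1_000_000, "warn"),
]


def triggered(metrics: dict, rules=RULES) -> list:
    """Return the rules whose metric strictly exceeds its threshold."""
    return [r for r in rules if metrics.get(r.metric, 0.0) > r.threshold]


fired = triggered(
    {"error_rate": 0.08, "p95_latency_s": 1.4, "tokens_per_hour": 1_200_000}
)
```

Metrics missing from the dict are treated as 0.0 here, so an absent value never fires a rule. Depending on your setup, a missing metric may itself deserve an alert.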
Custom Alerts
Implement custom alerting logic:
import httpx

async def check_and_alert(metrics: dict):
    if metrics["error_rate"] > 0.05:
        await send_alert(
            severity="critical",
            message=f"High error rate: {metrics['error_rate']:.1%}",
        )

async def send_alert(severity: str, message: str):
    # Send to Slack
    async with httpx.AsyncClient() as client:
        await client.post(
            "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
            json={"text": f"[{severity.upper()}] {message}"},
        )
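The check above expects an error_rate value in the metrics dict. One way to produce it is a sliding time window over recent request outcomes. This is a minimal sketch, assuming a single-process app. The ErrorRateWindow class is hypothetical, and the now parameter exists only to make the example deterministic.

```python
import time
from collections import deque
from typing import Optional


class ErrorRateWindow:
    """Track request outcomes over the last window_s seconds."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._events = deque()  # (timestamp, is_error) pairs, oldest first

    def record(self, is_error: bool, now: Optional[float] = None) -> None:
        timestamp = time.monotonic() if now is None else now
        self._events.append((timestamp, is_error))

    def error_rate(self, now: Optional[float] = None) -> float:
        now = time.monotonic() if now is None else now
        # Drop events that have aged out of the window.
        while self._events and now - self._events[0][0] > self.window_s:
            self._events.popleft()
        if not self._events:
            return 0.0
        return sum(1 for _, err in self._events if err) / len(self._events)


window = ErrorRateWindow(window_s=60)
for i in range(20):
    window.record(is_error=(i % 10 == 0), now=float(i))  # 2 errors out of 20

metrics = {"error_rate": window.error_rate(now=20.0)}
```

With 2 failures in 20 requests the rate is 10%, which is above the 5% threshold used by check_and_alert.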
Error Tracking
Sentry Integration
Add Sentry for error tracking:
- Install: uv add sentry-sdk
- Configure:
import sentry_sdk

sentry_sdk.init(
    dsn="https://your-dsn@sentry.io/project",
    traces_sample_rate=1.0,
)
- Errors are automatically reported to Sentry
Error Context
Add context to error reports:
import sentry_sdk

async def process(self, event: LayercodeEvent) -> str:
    # Attach event details to errors reported from the current scope
    sentry_sdk.set_context("event", {
        "call_id": event.call_id,
        "agent_id": event.agent_id,
        "type": event.type,
    })
    try:
        result = await self.agent.run(event.transcript)
        return result.data
    except Exception as e:
        sentry_sdk.capture_exception(e)
        raise
Debugging
Debug Mode
Enable debug logging with the --verbose flag:
uv run layercode-create-app run --agent starter --verbose
Request Tracing
Log all incoming requests:
from loguru import logger
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request

class RequestLoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # Headers can carry credentials; redact them before logging in production
        logger.debug(
            "Incoming request",
            method=request.method,
            path=request.url.path,
            headers=dict(request.headers),
        )
        response = await call_next(request)
        logger.debug(
            "Response",
            status_code=response.status_code,
        )
        return response
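Request headers often include credentials such as bearer tokens or cookies, so it is safer to redact them before they reach the logs. The following is a minimal sketch. The redact_headers helper and the list of sensitive header names are assumptions that you would adapt to your own deployment.

```python
from typing import Mapping

# Header names treated as sensitive; extend this set as needed.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "set-cookie", "x-api-key"})


def redact_headers(headers: Mapping[str, str]) -> dict:
    """Return a copy of headers with sensitive values replaced by a placeholder."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


safe = redact_headers(
    {"Authorization": "Bearer secret", "Content-Type": "application/json"}
)
```

In the middleware, this would mean logging redact_headers(request.headers) rather than the raw header dict.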
Agent Debugging
Log agent internals:
async def process(self, event: LayercodeEvent) -> str:
    logger.debug(f"Event received: {event.type}")
    logger.debug(f"Transcript: {event.transcript}")
    result = await self.agent.run(event.transcript)
    logger.debug(f"Agent result: {result.data}")
    logger.debug(f"Tool calls: {result.all_messages()}")
    return result.data
Performance Profiling
cProfile
Profile your agent:
import cProfile
import pstats

async def profile_agent():
    profiler = cProfile.Profile()
    profiler.enable()
    # Run agent
    await agent.process(event)
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative')
    stats.print_stats(20)
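When the profile needs to go to a log or a file rather than stdout, pstats can write to an in-memory buffer. This is a self-contained sketch using only the standard library. fake_agent_run is a hypothetical stand-in for agent.process(event), and profile_coroutine is an illustrative helper name.

```python
import asyncio
import cProfile
import io
import pstats


async def fake_agent_run() -> int:
    """Hypothetical stand-in for agent.process(event)."""
    await asyncio.sleep(0)
    return sum(i * i for i in range(10_000))


def profile_coroutine(coro_factory, top: int = 10) -> str:
    """Run a coroutine under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        asyncio.run(coro_factory())
    finally:
        profiler.disable()  # always stop profiling, even if the coroutine fails
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_coroutine(fake_agent_run)
```

Note that cProfile measures time spent executing Python code, so time an async agent spends waiting on network I/O shows up under the event loop rather than under the awaiting function.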
Memory Profiling
Track memory usage:
import tracemalloc
from loguru import logger

async def profile_memory():
    tracemalloc.start()
    # Run your code
    await agent.process(event)
    current, peak = tracemalloc.get_traced_memory()
    logger.info(f"Memory usage: {current / 1024 / 1024:.2f} MB (peak: {peak / 1024 / 1024:.2f} MB)")
    tracemalloc.stop()
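Total and peak figures show how much memory is used but not where it is allocated. tracemalloc snapshots can attribute allocations to source lines. The sketch below uses only the standard library. top_allocations and build_payload are hypothetical names, and the workload is a placeholder for your agent code.

```python
import tracemalloc


def top_allocations(func, limit: int = 3) -> list:
    """Run func under tracemalloc and describe the largest allocation sites."""
    tracemalloc.start()
    try:
        result = func()  # keep a reference so the allocations are still live
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del result
    lines = []
    for stat in snapshot.statistics("lineno")[:limit]:  # sorted largest first
        frame = stat.traceback[0]
        lines.append(f"{frame.filename}:{frame.lineno} {stat.size / 1024:.1f} KiB")
    return lines


def build_payload() -> list:
    # Hypothetical workload: roughly 1 MB of byte strings.
    return [bytes(1024) for _ in range(1000)]


lines = top_allocations(build_payload)
```

Each entry names the file and line responsible for the allocation, which usually points more directly at a leak than the aggregate numbers do.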
Best Practices
- Log at appropriate levels
  - DEBUG: Detailed diagnostic info
  - INFO: General informational messages
  - WARNING: Unexpected conditions the app can recover from
  - ERROR: Failures that need attention
- Include context
  - Always include call_id and agent_id
  - Add relevant business context
- Avoid logging sensitive data
  - Don't log API keys or tokens
  - Sanitize user data
- Use structured logging
  - Log as JSON for easier parsing
  - Include consistent fields
- Monitor in production
  - Set up alerts for critical issues
  - Review dashboards regularly
Next Steps
- Deployment Guide - Deploy to production
- Troubleshooting - Common issues