
Observability Guide

Monitor and debug your LayerCode agents with structured logging and observability tools.

Logging

Loguru Integration

The toolkit uses Loguru for structured logging.

Basic Logging

from loguru import logger

logger.info("Processing webhook")
logger.warning("Rate limit approaching")
logger.error("Failed to connect to API")

Structured Logging

# bind() stores the fields in the record's "extra" dict, which shows up in serialized output
logger.bind(
    call_id="call_123",
    agent="starter",
    response_length=150,
    latency_ms=450,
).info("Agent response generated")

Log Levels

Configure log level with the --verbose flag:

# INFO level (default)
uv run layercode-create-app run --agent starter

# DEBUG level
uv run layercode-create-app run --agent starter --verbose
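
The --verbose flag is handled by the toolkit CLI. If you need the same control in a standalone script, one way to wire a similar flag to Loguru looks like this (the configure_logging helper is illustrative, not part of the toolkit):

import sys
from loguru import logger

def configure_logging(verbose: bool) -> None:
    # Replace the default sink so the chosen level takes effect
    logger.remove()
    logger.add(sys.stderr, level="DEBUG" if verbose else "INFO")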

Log Files

Configure log file rotation:

from loguru import logger

logger.add(
    "logs/app.log",
    rotation="500 MB",
    retention="10 days",
    level="INFO",
    format="{time:YYYY-MM-DD HH:mm:ss} | {level} | {message}",
    serialize=True,  # JSON format
)
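
Because serialize=True writes one JSON object per line, the file can be parsed with standard tooling. A quick sketch for reading it back (the field layout follows Loguru's serializer, so inspect a sample line if your version differs):

import json

with open("logs/app.log") as f:
    for line in f:
        entry = json.loads(line)
        record = entry["record"]
        print(record["level"]["name"], record["message"], record["extra"])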

Logfire

Logfire provides advanced observability for FastAPI and PydanticAI.

Setup

  1. Sign up at logfire.pydantic.dev
  2. Get your API token
  3. Add to .env:
LOGFIRE_TOKEN=lf_live_...
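
If the toolkit doesn't already enable Logfire at startup, configuring it in your own entry point typically looks like the sketch below. logfire.configure() picks up LOGFIRE_TOKEN from the environment; the instrumentation helpers are part of the Logfire SDK, but names can vary by SDK version, so enable only what your app uses:

import logfire
from fastapi import FastAPI

app = FastAPI()

logfire.configure()               # reads LOGFIRE_TOKEN from the environment
logfire.instrument_fastapi(app)   # trace FastAPI requests and responses
logfire.instrument_pydantic_ai()  # trace PydanticAI agent runs and tool calls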

Features

Logfire automatically instruments:

  • FastAPI requests - Request/response timing, status codes
  • PydanticAI agents - Agent runs, tool calls, model usage
  • Database queries - SQL queries and timing
  • External API calls - HTTP request tracking

Viewing Data

Access your Logfire dashboard to:

  • View request traces
  • Analyze agent performance
  • Monitor error rates
  • Track model token usage
  • Set up alerts

Custom Spans

Add custom instrumentation:

import logfire

async def process_order(order_id: str):
    with logfire.span("process_order", order_id=order_id):
        # Your order processing logic
        order = await fetch_order(order_id)

        with logfire.span("validate_order"):
            validate(order)

        with logfire.span("charge_payment"):
            await charge_customer(order)

        return order
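
The Logfire SDK also offers a decorator form that wraps an entire function in a span, which can be tidier than nested with-blocks. A small sketch (lookup_order is a hypothetical helper):

import logfire

@logfire.instrument("lookup_order {order_id=}")
def lookup_order(order_id: str) -> dict:
    # Each call opens a span and records the template arguments
    return {"order_id": order_id, "status": "pending"}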

Performance Monitoring

Track agent performance:

import logfire
from time import time

async def process(self, event: LayercodeEvent) -> str:
    start = time()

    try:
        result = await self.agent.run(event.transcript)

        logfire.info(
            "Agent completed",
            duration_ms=(time() - start) * 1000,
            call_id=event.call_id,
            agent=self.__class__.__name__,
        )

        return result.data

    except Exception as e:
        logfire.error(
            "Agent failed",
            error=str(e),
            call_id=event.call_id,
        )
        raise

Metrics

Key Metrics to Track

  1. Request Metrics
     • Requests per minute
     • Response time (p50, p95, p99)
     • Error rate

  2. Agent Metrics
     • Agent response time (see the histogram sketch below)
     • Tool call frequency
     • Token usage
     • Success rate

  3. System Metrics
     • CPU usage
     • Memory usage
     • Network I/O
     • Disk usage
Custom Metrics

Track custom metrics with Logfire:

import logfire

class MetricsCollector:
    def __init__(self):
        # Logfire counters are created once and incremented per event
        self.request_counter = logfire.metric_counter("requests_total")
        self.error_counter = logfire.metric_counter("errors_total")

    async def track_request(self):
        self.request_counter.add(1)

    async def track_error(self):
        self.error_counter.add(1)

metrics = MetricsCollector()

Alerting

Logfire Alerts

Configure alerts in your Logfire dashboard:

  1. High Error Rate
     • Condition: Error rate > 5%
     • Action: Send email/Slack notification

  2. Slow Response Times
     • Condition: p95 latency > 2s
     • Action: Page on-call engineer

  3. High Token Usage
     • Condition: Token usage > 1M tokens/hour
     • Action: Send warning

Custom Alerts

Implement custom alerting logic:

import httpx

async def check_and_alert(metrics: dict):
    if metrics["error_rate"] > 0.05:
        await send_alert(
            severity="critical",
            message=f"High error rate: {metrics['error_rate']:.1%}",
        )

async def send_alert(severity: str, message: str):
    # Send to Slack
    async with httpx.AsyncClient() as client:
        await client.post(
            "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
            json={"text": f"[{severity.upper()}] {message}"}
        )
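
To evaluate these conditions on a schedule, one option is a small background task. A sketch using asyncio, where collect_metrics() and the 60-second interval are placeholders for whatever your app provides:

import asyncio

async def alert_loop():
    while True:
        metrics = await collect_metrics()  # hypothetical: gather error rate, latency, etc.
        await check_and_alert(metrics)
        await asyncio.sleep(60)

# Start it once at application startup, e.g.:
# asyncio.create_task(alert_loop())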

Error Tracking

Sentry Integration

Add Sentry for error tracking:

  1. Install: uv add sentry-sdk
  2. Configure:
import sentry_sdk

sentry_sdk.init(
    dsn="https://your-dsn@sentry.io/project",
    traces_sample_rate=1.0,
)
  3. Errors are automatically reported to Sentry
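
In practice you would usually read the DSN from the environment rather than hardcoding it. A minimal sketch, assuming the variable is named SENTRY_DSN in your .env:

import os
import sentry_sdk

sentry_sdk.init(
    dsn=os.environ.get("SENTRY_DSN", ""),
    traces_sample_rate=1.0,  # sample every transaction; lower this for high-traffic deployments
)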

Error Context

Add context to error reports:

import sentry_sdk

async def process(self, event: LayercodeEvent) -> str:
    # Attach event details so they appear on any error captured below
    sentry_sdk.set_context("event", {
        "call_id": event.call_id,
        "agent_id": event.agent_id,
        "type": event.type,
    })

    try:
        result = await self.agent.run(event.transcript)
        return result.data
    except Exception as e:
        sentry_sdk.capture_exception(e)
        raise

Debugging

Debug Mode

Enable debug logging:

uv run layercode-create-app run --agent starter --verbose

Request Tracing

Log all incoming requests:

from loguru import logger
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request

class RequestLoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        logger.debug(
            "Incoming request",
            method=request.method,
            path=request.url.path,
            headers=dict(request.headers),
        )

        response = await call_next(request)

        logger.debug(
            "Response",
            status_code=response.status_code,
        )

        return response
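
Register the middleware on your application instance (app here is whatever your FastAPI application object is called):

app.add_middleware(RequestLoggingMiddleware)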

Agent Debugging

Log agent internals:

async def process(self, event: LayercodeEvent) -> str:
    logger.debug(f"Event received: {event.type}")
    logger.debug(f"Transcript: {event.transcript}")

    result = await self.agent.run(event.transcript)

    logger.debug(f"Agent result: {result.data}")
    logger.debug(f"Tool calls: {result.all_messages()}")

    return result.data

Performance Profiling

cProfile

Profile your agent:

import cProfile
import pstats

async def profile_agent():
    profiler = cProfile.Profile()
    profiler.enable()

    # Run agent
    await agent.process(event)

    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative')
    stats.print_stats(20)

Memory Profiling

Track memory usage:

import tracemalloc

tracemalloc.start()

# Run your code
await agent.process(event)

current, peak = tracemalloc.get_traced_memory()
logger.info(f"Memory usage: {current / 1024 / 1024:.2f} MB (peak: {peak / 1024 / 1024:.2f} MB)")

tracemalloc.stop()
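
If you also want to know where the memory is allocated, take a snapshot (before calling tracemalloc.stop()) and list the largest allocation sites:

import tracemalloc

snapshot = tracemalloc.take_snapshot()
# Group allocations by source line and show the ten largest
for stat in snapshot.statistics("lineno")[:10]:
    logger.info(str(stat))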

Best Practices

  1. Log at appropriate levels
     • DEBUG: Detailed diagnostic information
     • INFO: Normal operational events
     • WARNING: Unexpected situations that don't stop processing
     • ERROR: Failures that need investigation

  2. Include context
     • Always include call_id and agent_id
     • Add relevant business context

  3. Avoid logging sensitive data (see the sanitizer sketch after this list)
     • Don't log API keys or tokens
     • Sanitize user data

  4. Use structured logging
     • Log as JSON for easier parsing
     • Include consistent fields

  5. Monitor in production
     • Set up alerts for critical issues
     • Review dashboards regularly
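
One way to approach sanitization is to redact known-sensitive keys before a payload reaches the logger. The key list and sanitize helper below are illustrative:

from loguru import logger

SENSITIVE_KEYS = {"api_key", "authorization", "token", "password"}

def sanitize(payload: dict) -> dict:
    # Redact values of sensitive keys and recurse into nested dicts
    cleaned = {}
    for key, value in payload.items():
        if key.lower() in SENSITIVE_KEYS:
            cleaned[key] = "[REDACTED]"
        elif isinstance(value, dict):
            cleaned[key] = sanitize(value)
        else:
            cleaned[key] = value
    return cleaned

logger.bind(**sanitize({"call_id": "call_123", "api_key": "secret"})).info("Webhook received")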

Next Steps