LayerCode Gym

Does your voice AI agent even lift, bro?

v0.0.1-alpha

This toolkit is an early alpha and may contain bugs or breaking changes. Please test thoroughly before using in production.

LayerCode Gym is an unofficial testing environment for voice AI agents built on Layercode.com. It simulates real voice clients end-to-end, allowing you to run hundreds of test scenarios and understand how your agent will perform in production.

Why LayerCode Gym?

Voice AI agents are complex systems with many failure modes:

Transcription errors under various audio conditions
Response latency and timing issues
Broken conversation flow and lost context
Edge cases and unexpected user inputs

LayerCode Gym helps you catch these issues before production by providing a comprehensive testing framework that simulates real-world usage at scale.

Key Features

Three Types of User Simulators

Fixed Text Messages - Send predetermined text responses (fastest, perfect for regression testing)
Pre-recorded Audio Files - Stream audio files to stress-test transcription and agent behavior
AI Agent Personas - Use PydanticAI to simulate realistic users with specific personalities and goals

Comprehensive Analytics

After each conversation, get:

Full transcript with timing metrics (TTFAB, latency stats)
Combined audio file for playback review
Turn-by-turn conversation logs
Optional LLM-as-judge scoring via callbacks
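
As a rough illustration of what the latency stats in a transcript summarise, the sketch below computes them from per-turn timestamps. The `Turn` record and its field names are hypothetical and are not LayerCode Gym's actual output schema; the sketch also treats latency as the gap between the user finishing a turn and the first agent audio arriving, which is an assumption about how the metric is defined.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class Turn:
    """Hypothetical per-turn timing record (not LayerCode Gym's real schema)."""

    user_finished_at: float      # seconds, when the user stopped speaking
    first_agent_audio_at: float  # seconds, when the first agent audio arrived


def latency_stats(turns: list[Turn]) -> dict[str, float]:
    """Summarise per-turn response latency in seconds."""
    latencies = [t.first_agent_audio_at - t.user_finished_at for t in turns]
    return {
        "mean": mean(latencies),
        "median": median(latencies),
        "max": max(latencies),
    }


# Three illustrative turns with latencies of 0.5 s, 1.0 s and 2.0 s
turns = [Turn(0.0, 0.5), Turn(4.0, 5.0), Turn(9.0, 11.0)]
stats = latency_stats(turns)
print(stats)
```

Reporting the maximum alongside the mean and median helps surface the occasional slow turn that an average would hide.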

Scale Testing

Run hundreds of conversations concurrently
Batch evaluation with progress tracking
Automated regression detection
Load testing capabilities
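
One common way to run many conversations concurrently without overwhelming the agent under test is to cap the number in flight with a semaphore. The sketch below shows that general pattern with plain `asyncio`; `run_conversation` is a hypothetical stand-in for a single simulated conversation, and none of these names are LayerCode Gym's own batch API.

```python
import asyncio


async def run_conversation(scenario: str) -> str:
    """Hypothetical stand-in for one simulated conversation."""
    await asyncio.sleep(0.01)  # placeholder for real network and audio I/O
    return f"done:{scenario}"


async def run_batch(scenarios: list[str], max_concurrency: int = 10) -> list[str]:
    """Run all scenarios with at most max_concurrency in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(scenario: str) -> str:
        # Each task waits here until a slot is free
        async with semaphore:
            return await run_conversation(scenario)

    # gather returns results in the same order as the input scenarios
    return await asyncio.gather(*(bounded(s) for s in scenarios))


scenarios = [f"scenario-{i}" for i in range(25)]
results = asyncio.run(run_batch(scenarios, max_concurrency=5))
print(len(results), "conversations finished")
```

Raising `max_concurrency` turns the same loop into a crude load test, while a low value keeps regression runs gentle on shared infrastructure.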

Pluggable Architecture

Custom TTS engines (ElevenLabs, Azure, etc.)
Custom LLM providers via PydanticAI
Custom evaluation callbacks
Extensible simulator protocols
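
To show what a pluggable component can look like in Python, the sketch below defines a structural interface with `typing.Protocol` and a toy engine that satisfies it. `TTSEngine`, `SilenceTTS`, `synthesize` and `speak` are all hypothetical names chosen for illustration; the interface LayerCode Gym actually expects from a custom TTS engine may differ.

```python
from typing import Protocol


class TTSEngine(Protocol):
    """Hypothetical interface: any object with a matching synthesize() qualifies."""

    def synthesize(self, text: str) -> bytes: ...


class SilenceTTS:
    """Toy engine emitting silent 16-bit PCM, useful when no real TTS is available."""

    def __init__(self, samples_per_char: int = 800) -> None:
        self.samples_per_char = samples_per_char

    def synthesize(self, text: str) -> bytes:
        # Two zero bytes per 16-bit sample
        n_samples = len(text) * self.samples_per_char
        return b"\x00\x00" * n_samples


def speak(engine: TTSEngine, text: str) -> bytes:
    """Accept any engine that structurally matches TTSEngine."""
    return engine.synthesize(text)


audio = speak(SilenceTTS(), "hello")
print(len(audio), "bytes of audio")
```

Because `Protocol` relies on structural typing, `SilenceTTS` does not need to inherit from `TTSEngine`; a wrapper around a hosted TTS provider could be swapped in the same way.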

Quick Example

import asyncio

from layercode_gym import LayercodeClient, UserSimulator


async def main() -> None:
    # Create a simple text-based simulator
    simulator = UserSimulator.from_text(
        messages=[
            "Hello! I'm interested in your services.",
            "Tell me more about pricing.",
            "Thank you, goodbye.",
        ],
        send_as_text=True,  # Fast, no TTS needed
    )

    # Run the conversation
    client = LayercodeClient(simulator=simulator)
    conversation_id = await client.run()

    # Results saved to conversations/<conversation_id>/
    print(f"Results saved to conversations/{conversation_id}/")


# `await` is only valid inside a coroutine, so drive main() with asyncio.run()
asyncio.run(main())

Who Is This For?

Voice AI developers building production agents on Layercode
QA teams needing automated testing for voice interfaces
Product teams evaluating agent performance before launch
Researchers studying voice AI behavior at scale

What's Different?

Unlike manual testing in the Layercode dashboard:

Automated: Run tests programmatically without manual intervention
Scalable: Test hundreds of scenarios concurrently
Reproducible: Version control your test scenarios
Measurable: Get detailed metrics and analytics
Continuous: Integrate with CI/CD pipelines

Getting Started

Ready to test your voice AI agent? Head over to Getting Started to set up LayerCode Gym and run your first test.

Project Status

This is an unofficial project, not affiliated with or supported by Layercode. It's a community tool for developers who want to test their Layercode agents more thoroughly.

Contributions, bug reports, and feature requests are welcome on GitHub.