
Agent Tools Introduction

AgentTools is an experimental module that provides a set of utilities for building advanced agentic workflows, code-generating and self-fixing agents.

Import the module as follows:

julia
using PromptingTools.Experimental.AgentTools
# to access unexported functionality
const AT = PromptingTools.Experimental.AgentTools

Highlights

The main functions to be aware of are:

  • AIGenerate - Lazy counterpart of aigenerate(). All ai* functions have a corresponding AI*::AICall struct that allows for deferred execution (triggered by run! method).

  • last_output, last_message - Simple utilities to access the last output and message of the AI calls like AIGenerate.

  • airetry! - A utility to automatically retry the AI call with the same inputs if the AI model fails to generate a valid output. It allows retrying many times and providing feedback to the AI model about the failure to increase its robustness. AIGenerate and other AI calls have a field config::RetryConfig where you can globally adjust the retrying behavior.

  • print_samples - airetry! implements a Monte Carlo Tree Search under the hood when trying to find the best way to fix the AI model's failure. print_samples is a utility to print the "samples" generated by the MCTS to better understand the attempts made by the AI model to fix the failure.

  • AICode extensions like aicodefixer_feedback and error_feedback - AICode is a wrapper that extracts any Julia code provided in the AIMessage (response from the AI model) and executes it (including catching any errors). aicodefixer_feedback and error_feedback are utilities that automatically review the outcome of an AICode evaluation and generate the corresponding feedback for the AI model.

The main contribution of this module is providing the "lazy" counterparts to the ai... functions, which allow us to build a workflow, which can be re-executed many times with the same inputs.

For example, AIGenerate() will create a lazy instance of aigenerate, which is an instance of AICall with aigenerate as its ai-calling function. It uses exactly the same arguments and keyword arguments as aigenerate (see ?aigenerate for details). The notion of "lazy" refers to the fact that it does NOT generate any output when instantiated (only when run! is called).

Said differently, the AICall struct and all its flavors (AIGenerate, ...) are designed to facilitate a deferred execution model (lazy evaluation) for AI functions that interact with a Large Language Model (LLM). It stores the necessary information for an AI call and executes the underlying AI function only when supplied with a UserMessage or when the run! method is applied. This allows us to remember user inputs and trigger the LLM call repeatedly if needed, which enables automatic fixing (see ?airetry!).
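
To make the deferred execution idea concrete, here is a minimal sketch in plain Julia. ToyLazyCall, toy_lazy and toy_run! are hypothetical names invented for this illustration; they are not part of AgentTools and do not reflect how AICall is implemented. The sketch only shows the pattern of storing inputs first and executing later.

```julia
# Toy illustration of deferred execution. `ToyLazyCall`, `toy_lazy` and
# `toy_run!` are hypothetical names; this is NOT how AICall is implemented.
mutable struct ToyLazyCall
    func::Function        # function to call later
    args::Tuple           # stored positional arguments
    kwargs::NamedTuple    # stored keyword arguments
    output::Any           # `nothing` until the call is executed
end

# Creating the object only stores the inputs; nothing is executed yet
toy_lazy(func, args...; kwargs...) = ToyLazyCall(func, args, (; kwargs...), nothing)

# Executing is a separate, repeatable step that reuses the stored inputs
function toy_run!(call::ToyLazyCall)
    call.output = call.func(call.args...; call.kwargs...)
    return call
end

call = toy_lazy(join, ["a", "b", "c"], "-")
call.output === nothing    # true: nothing has been executed yet
toy_run!(call)
call.output                # "a-b-c"
```

Because the inputs stay stored on the object, toy_run!(call) can be applied again at any time, which is the property that retry logic relies on.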

Examples

Automatic Fixing of AI Calls

We need to switch from aigenerate to AIGenerate to get the lazy version of the function.

julia
output = AIGenerate("Say hi!"; model="gpt4t") |> run!

How is it useful? We can use the same "inputs" for repeated calls, eg, when we want to validate or regenerate some outputs. We have a function airetry! to help us with that.

The signature of airetry! is airetry!(condition_function, aicall::AICall, feedback_function).

It evaluates the condition condition_function on the aicall object (eg, we evaluate f_cond(aicall) -> Bool). If it fails, we call feedback_function on the aicall object to provide feedback for the AI model (eg, f_feedback(aicall) -> String) and repeat the process until it passes or until max_retries value is exceeded.
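
The loop described above can be sketched in a few lines of plain Julia. This is only an illustration of the control flow under simplifying assumptions: toy_retry and fake_model are hypothetical names, and the real airetry! additionally manages samples, scoring and API errors, none of which is modeled here.

```julia
# Toy version of the "check, give feedback, retry" loop.
# `toy_retry` and `fake_model` are hypothetical stand-ins, not library functions.
function toy_retry(f_cond, f_feedback, generate; max_retries = 3)
    feedback = String[]                  # feedback accumulated so far
    output = generate(feedback)          # initial attempt
    retries = 0
    while !f_cond(output) && retries < max_retries
        push!(feedback, f_feedback(output))
        output = generate(feedback)      # retry with all feedback so far
        retries += 1
    end
    return (; output, retries, success = f_cond(output))
end

# Stand-in for a model: it only answers in lowercase after being told to
fake_model(feedback) = isempty(feedback) ? "YELLOW" : "yellow"

result = toy_retry(out -> all(islowercase, out),
    out -> "You answered $(out). You must answer in lowercase.",
    fake_model)
result.success   # true
result.retries   # 1
```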

We can catch API failures (no feedback needed, so none is provided)

julia
# API failure because of a non-existent model
# RetryConfig allows us to change the "retry" behaviour of any lazy call
output = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
    model = "NOTEXIST")
run!(output) # fails

# we ask to wait 2s between retries and retry 2 times (can be set in `config` in aicall as well)
airetry!(isvalid, output; retry_delay = 2, max_retries = 2)

Or we can use it for output validation (eg, its format, its content, etc.) and feedback generation.

Let's play a color guessing game (I'm thinking "yellow"). We'll implement two formatting checks with airetry!:

julia
# Notice that we ask for two samples (`n_samples=2`) at each attempt (to improve our chances). 
# Both guesses are scored at each time step, and the best one is chosen for the next step.
# And with OpenAI, we can set `api_kwargs = (;n=2)` to get both samples simultaneously (cheaper and faster)!
out = AIGenerate(
    "Guess what color I'm thinking. It could be: blue, red, black, white, yellow. Answer with 1 word only";
    verbose = false,
    config = RetryConfig(; n_samples = 2), api_kwargs = (; n = 2))
run!(out)

## Check that the output is 1 word only, third argument is the feedback that will be provided if the condition fails
## Notice: functions operate on `aicall` as the only argument. We can use utilities like `last_output` and `last_message` to access the last message and output in the conversation.
airetry!(x -> length(split(last_output(x), r" |\.")) == 1, out,
    "You must answer with 1 word only.")

# Note: you could also use the do-syntax, eg, 
airetry!(out, "You must answer with 1 word only.") do aicall
    length(split(last_output(aicall), r" |\.")) == 1
end

You can even add the guessing itself as an airetry! condition of last_output(out) == "yellow" and provide feedback if the guess is wrong.
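
Since the conditions are ordinary Julia predicates, it can help to try them on plain strings before wiring them into airetry!. The helper names is_one_word and is_correct below are made up for this sketch; inside airetry! they would be applied to last_output(aicall).

```julia
# The format check from the example, applied to plain strings
is_one_word(s) = length(split(s, r" |\.")) == 1

is_one_word("yellow")        # true
is_one_word("It is yellow")  # false: three pieces
is_one_word("Yellow.")       # false: splitting on "." leaves a trailing empty piece

# The guess check, as a predicate on a string
is_correct(s) = s == "yellow"
is_correct("yellow")         # true
is_correct("red")            # false
```

Note that an answer with a trailing period fails the one-word check, so the model would be asked to retry in that case.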

References

# PromptingTools.Experimental.AgentTools.AIGenerate (Function)
julia
AIGenerate(args...; kwargs...)

Creates a lazy instance of aigenerate. It is an instance of AICall with aigenerate as the function.

Use exactly the same arguments and keyword arguments as aigenerate (see ?aigenerate for details).

source


# PromptingTools.Experimental.AgentTools.AICall (Type)
julia
AICall(func::F, args...; kwargs...) where {F<:Function}

AIGenerate(args...; kwargs...)
AIEmbed(args...; kwargs...)
AIExtract(args...; kwargs...)

A lazy call wrapper for AI functions in the PromptingTools module, such as aigenerate.

The AICall struct is designed to facilitate a deferred execution model (lazy evaluation) for AI functions that interact with a Large Language Model (LLM). It stores the necessary information for an AI call and executes the underlying AI function only when supplied with a UserMessage or when the run! method is applied. This approach allows for more flexible and efficient handling of AI function calls, especially in interactive environments.

See also: run!, AICodeFixer

Fields

  • func::F: The AI function to be called lazily. This should be a function like aigenerate or other ai* functions.

  • schema::Union{Nothing, PT.AbstractPromptSchema}: Optional schema to structure the prompt for the AI function.

  • conversation::Vector{PT.AbstractMessage}: A vector of messages that forms the conversation context for the AI call.

  • kwargs::NamedTuple: Keyword arguments to be passed to the AI function.

  • success::Union{Nothing, Bool}: Indicates whether the last call was successful (true) or not (false). Nothing if the call hasn't been made yet.

  • error::Union{Nothing, Exception}: Stores any exception that occurred during the last call. Nothing if no error occurred or if the call hasn't been made yet.

Example

Initiate an AICall like any ai* function, eg, AIGenerate:

julia
aicall = AICall(aigenerate)

# With arguments and kwargs like ai* functions
# from `aigenerate(schema, conversation; model="abc", api_kwargs=(; temperature=0.1))`
# to
aicall = AICall(aigenerate, schema, conversation; model="abc", api_kwargs=(; temperature=0.1))

# Or with a template
aicall = AIGenerate(:JuliaExpertAsk; ask="xyz", model="abc", api_kwargs=(; temperature=0.1))

Trigger the AICall with run! (it returns the updated AICall struct):

julia
aicall |> run!

You can also use AICall as a functor to trigger the AI call with a UserMessage or simply the text to send:

julia
aicall(UserMessage("Hello, world!")) # Triggers the lazy call
result = run!(aicall) # Explicitly runs the AI call

This can be used to "reply" to the previous message / continue the stored conversation.

Notes

  • The AICall struct is a key component in building flexible and efficient Agentic pipelines

  • The lazy evaluation model allows for setting up the call parameters in advance and deferring the actual execution until it is explicitly triggered.

  • This struct is particularly useful in scenarios where the timing of AI function execution needs to be deferred or where multiple potential calls need to be prepared and selectively executed.

source


# PromptingTools.last_output (Function)

Extracts the last output (generated text answer) from the RAGResult.

source

Helpful accessor for AICall blocks. Returns the last output in the conversation (eg, the string/data in the last message).

source

Helpful accessor for the last generated output (msg.content) in conversation. Returns the last output in the conversation (eg, the string/data in the last message).

source


# PromptingTools.last_message (Function)
julia
PT.last_message(result::RAGResult)

Extract the last message from the RAGResult. It looks for final_answer first, then answer fields in the conversations dictionary. Returns nothing if not found.

source

Helpful accessor for AICall blocks. Returns the last message in the conversation.

source

Helpful accessor for the last message in conversation. Returns the last message in the conversation.

source


# PromptingTools.Experimental.AgentTools.airetry! (Function)
julia
airetry!(
    f_cond::Function, aicall::AICallBlock, feedback::Union{AbstractString, Function} = "";
    verbose::Bool = true, throw::Bool = false, evaluate_all::Bool = true, feedback_expensive::Bool = false,
    max_retries::Union{Nothing, Int} = nothing, retry_delay::Union{Nothing, Int} = nothing)

Evaluates the condition f_cond on the aicall object. If the condition is not met, it will return the best sample to retry from and provide feedback (string or function) to aicall. That's why it's mutating. It retries at most max_retries times; with throw=true, an error is thrown if the condition is still not met after max_retries retries.

Note: aicall must be run first via run!(aicall) before calling airetry!.

Function signatures

  • f_cond(aicall::AICallBlock) -> Bool, ie, it must accept the aicall object and return a boolean value.

  • feedback can be a string or feedback(aicall::AICallBlock) -> String, ie, it must accept the aicall object and return a string.

You can leverage the last_message, last_output, and AICode functions to access the last message, last output and execute code blocks in the conversation, respectively. See examples below.

Good Use Cases

  • Retry with API failures/drops (add retry_delay=2 to wait 2s between retries)

  • Check the output format / type / length / etc

  • Check the output with aiclassify call (LLM Judge) to catch unsafe/NSFW/out-of-scope content

  • Provide hints to the model to guide it to the correct answer

Gotchas

  • If controlling keyword arguments are set to nothing, they will fall back to the default values in aicall.config. You can override them by passing the keyword arguments explicitly.

  • If there are multiple airetry! checks, they are evaluated sequentially. As long as throw==false, they will all be evaluated, even if earlier checks failed.

  • Only samples which passed previous evaluations are evaluated (sample.success is true). If there are no successful samples, the function will evaluate only the active sample (aicall.active_sample_id) and nothing else.

  • Feedback from all "ancestor" evaluations is added upon retry, not feedback from the "siblings" or other branches. To have only ONE long BRANCH (no siblings), make sure to keep RetryConfig(; n_samples=1). That way the model will always see ALL previous feedback.

  • We implement a version of Monte Carlo Tree Search (MCTS) to always pick the most promising sample to restart from (you can tweak the options in RetryConfig to change the behaviour).

  • For large number of parallel branches (ie, "shallow and wide trees"), you might benefit from switching scoring to scoring=ThompsonSampling() (similar to how Bandit algorithms work).

  • Open-source/local models can struggle with overly long conversations; you might want to experiment with in-place feedback (set RetryConfig(; feedback_inplace=true)).

Arguments

  • f_cond::Function: A function that accepts the aicall object and returns a boolean value. Retry will be attempted if the condition is not met (f_cond -> false).

  • aicall::AICallBlock: The aicall object to evaluate the condition on.

  • feedback::Union{AbstractString, Function}: Feedback to provide if the condition is not met. If a function is provided, it must accept the aicall object as the only argument and return a string.

  • verbose::Integer=1: A verbosity level for logging the retry attempts and warnings. A higher value indicates more detailed logging.

  • throw::Bool=false: If true, it will throw an error if the function f_cond does not return true after max_retries retries.

  • evaluate_all::Bool=false: If true, it will evaluate all the "successful" samples in the aicall object. Otherwise, it will only evaluate the active sample.

  • feedback_expensive::Bool=false: If false, it will provide feedback to all samples that fail the condition. If feedback function is expensive to call (eg, another ai* function), set this to true and feedback will be provided only to the sample we will retry from.

  • max_retries::Union{Nothing, Int}=nothing: Maximum number of retries. If not provided, it will fall back to the max_retries in aicall.config.

  • retry_delay::Union{Nothing, Int}=nothing: Delay between retries in seconds. If not provided, it will fall back to the retry_delay in aicall.config.
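
The "fall back to the config when the keyword is nothing" behaviour described for max_retries and retry_delay follows a common Julia pattern. The sketch below illustrates that pattern only; ToyConfig and resolve_retry_options are hypothetical names and are not the library's RetryConfig or its internals.

```julia
# Hypothetical stand-in for a retry configuration (not the real RetryConfig)
struct ToyConfig
    max_retries::Int
    retry_delay::Int
end

# `nothing` means "not provided", so the value stored in the config is used
function resolve_retry_options(config::ToyConfig;
        max_retries::Union{Nothing, Int} = nothing,
        retry_delay::Union{Nothing, Int} = nothing)
    return (; max_retries = something(max_retries, config.max_retries),
        retry_delay = something(retry_delay, config.retry_delay))
end

config = ToyConfig(3, 0)
resolve_retry_options(config)                    # (max_retries = 3, retry_delay = 0)
resolve_retry_options(config; max_retries = 5)   # (max_retries = 5, retry_delay = 0)
```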

Returns

  • The aicall object with the updated conversation, and samples (saves the evaluations and their scores/feedback).

Example

You can use airetry! to catch API errors in run! and auto-retry the call. RetryConfig is how you influence all the subsequent retry behaviours - see ?RetryConfig for more details.

julia
# API failure because of a non-existent model
out = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
    model = "NOTEXIST")
run!(out) # fails

# we ask to wait 2s between retries and retry 2 times (can be set in `config` in aicall as well)
airetry!(isvalid, out; retry_delay = 2, max_retries = 2)

If you provide arguments to the aicall, we try to honor them as much as possible in the following calls, eg, set low verbosity

julia
out = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
    model = "NOTEXIST", verbose=false)
run!(out)
# No info message, you just see `success = false` in the properties of the AICall

Let's show a toy example to demonstrate the runtime checks / guardrails for the model output. We'll play a color guessing game (I'm thinking "yellow"):

julia
# Notice that we ask for two samples (`n_samples=2`) at each attempt (to improve our chances). 
# Both guesses are scored at each time step, and the best one is chosen for the next step.
# And with OpenAI, we can set `api_kwargs = (;n=2)` to get both samples simultaneously (cheaper and faster)!
out = AIGenerate(
    "Guess what color I'm thinking. It could be: blue, red, black, white, yellow. Answer with 1 word only";
    verbose = false,
    config = RetryConfig(; n_samples = 2), api_kwargs = (; n = 2))
run!(out)


## Check that the output is 1 word only, third argument is the feedback that will be provided if the condition fails
## Notice: functions operate on `aicall` as the only argument. We can use utilities like `last_output` and `last_message` to access the last message and output in the conversation.
airetry!(x -> length(split(last_output(x), r" |\.")) == 1, out,
    "You must answer with 1 word only.")


## Let's ensure that the output is in lowercase - simple and short
airetry!(x -> all(islowercase, last_output(x)), out, "You must answer in lowercase.")
# [ Info: Condition not met. Retrying...


## Let's add final hint - it took us 2 retries
airetry!(x -> startswith(last_output(x), "y"), out, "It starts with \"y\"")
# [ Info: Condition not met. Retrying...
# [ Info: Condition not met. Retrying...


## We end up with the correct answer
last_output(out)
# Output: "yellow"

Let's explore how we got here. We save the various attempts in a "tree" (SampleNode object). You can access it in out.samples, which is the ROOT of the tree (top level). The currently "active" sample ID is out.active_sample_id; its data is the same as the conversation field in your AICall.

julia
# Root node:
out.samples
# Output: SampleNode(id: 46839, stats: 6/12, length: 2)

# Active sample (our correct answer):
out.active_sample_id 
# Output: 50086

# Let's obtain the active sample node with this ID  - use getindex notation or function find_node
out.samples[out.active_sample_id]
# Output: SampleNode(id: 50086, stats: 1/1, length: 7)

# The SampleNode has two key fields: data and feedback. Data is where the conversation is stored:
active_sample = out.samples[out.active_sample_id]
active_sample.data == out.conversation # Output: true -> This is the winning guess!

We also get a clear view of the tree structure of all samples with print_samples:

julia
julia> print_samples(out.samples)
SampleNode(id: 46839, stats: 6/12, score: 0.5, length: 2)
├─ SampleNode(id: 12940, stats: 5/8, score: 1.41, length: 4)
│  ├─ SampleNode(id: 34315, stats: 3/4, score: 1.77, length: 6)
│  │  ├─ SampleNode(id: 20493, stats: 1/1, score: 2.67, length: 7)
│  │  └─ SampleNode(id: 50086, stats: 1/1, score: 2.67, length: 7)
│  └─ SampleNode(id: 2733, stats: 1/2, score: 1.94, length: 5)
└─ SampleNode(id: 48343, stats: 1/4, score: 1.36, length: 4)
   ├─ SampleNode(id: 30088, stats: 0/1, score: 1.67, length: 5)
   └─ SampleNode(id: 44816, stats: 0/1, score: 1.67, length: 5)
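
The scores printed in this tree are numerically consistent with the standard UCT formula (win rate plus an exploration bonus) if stats is read as wins/visits and the exploration constant is sqrt(2). This is an inference from the numbers shown here, not a statement about the library's implementation; the function uct below is a hypothetical helper written for this check.

```julia
# Standard UCT score: win rate plus an exploration bonus.
# Assumption: exploration constant sqrt(2), inferred from the printed scores.
uct(wins, visits, parent_visits; exploration = sqrt(2)) =
    wins / visits + exploration * sqrt(log(parent_visits) / visits)

# stats are read as wins/visits; parent_visits comes from the node one level up
round(uct(5, 8, 12), digits = 2)  # 1.41, node 12940 (its parent 46839 has 12 visits)
round(uct(3, 4, 8), digits = 2)   # 1.77, node 34315
round(uct(1, 1, 4), digits = 2)   # 2.67, node 50086
```

The root has no parent, and its printed score of 0.5 matches the plain win rate 6/12.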

You can use the id to grab and inspect any of these nodes, eg,

julia
out.samples[2733]
# Output: SampleNode(id: 2733, stats: 1/2, length: 5)

We can also iterate through all samples and extract whatever information we want with PostOrderDFS or PreOrderDFS (exported from AbstractTrees.jl)

julia
for sample in PostOrderDFS(out.samples)
    # Data is the universal field for samples, we put `conversation` in there
    # Last item in data is the last message in conversation
    msg = sample.data[end]
    if msg isa PT.AIMessage # skip feedback
        # get only the message content, ie, the guess
        println("ID: $(sample.id), Answer: $(msg.content)")
    end
end

# ID: 20493, Answer: yellow
# ID: 50086, Answer: yellow
# ID: 2733, Answer: red
# ID: 30088, Answer: blue
# ID: 44816, Answer: blue

Note: airetry! will attempt to fix the model max_retries times. If you set throw=true, it will throw an ErrorException if the condition is not met after max_retries retries.

Let's define a mini program to guess the number and use airetry! to guide the model to the correct answer:

julia
"""
    llm_guesser(user_number::Int)

Mini program to guess the number provided by the user (between 1-100).
"""
function llm_guesser(user_number::Int)
    @assert 1 <= user_number <= 100
    prompt = """
I'm thinking a number between 1-100. Guess which one it is. 
You must respond only with digits and nothing else. 
Your guess:"""
    ## 2 samples at a time, max 5 fixing rounds
    out = AIGenerate(prompt; config = RetryConfig(; n_samples = 2, max_retries = 5),
        api_kwargs = (; n = 2)) |> run!
    ## Check the proper output format - must parse to Int, use do-syntax
    ## We can provide feedback via a function!
    function feedback_f(aicall)
        "Output: $(last_output(aicall))\nFeedback: You must respond only with digits!!"
    end
    airetry!(out, feedback_f) do aicall
        !isnothing(tryparse(Int, last_output(aicall)))
    end
    ## Give a hint on bounds
    lower_bound = (user_number ÷ 10) * 10
    upper_bound = lower_bound + 10
    airetry!(
        out, "The number is between or equal to $lower_bound to $upper_bound.") do aicall
        guess = tryparse(Int, last_output(aicall))
        lower_bound <= guess <= upper_bound
    end
    ## You can make at most 3x guess now -- if there is max_retries in `config.max_retries` left
    max_retries = out.config.retries + 3
    function feedback_f2(aicall)
        guess = tryparse(Int, last_output(aicall))
        "Your guess of $(guess) is wrong, it's $(abs(guess-user_number)) numbers away."
    end
    airetry!(out, feedback_f2; max_retries) do aicall
        tryparse(Int, last_output(aicall)) == user_number
    end

    ## Evaluate the best guess
    @info "Results: Guess: $(last_output(out)) vs User: $user_number (Number of calls made: $(out.config.calls))"
    return out
end

# Let's play the game
out = llm_guesser(33)
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Results: Guess: 33 vs User: 33 (Number of calls made: 10)

Yay! We got it 😃

Now, we could explore different samples (eg, print_samples(out.samples)) or see what the model guessed at each step:

julia
print_samples(out.samples)
## SampleNode(id: 57694, stats: 6/14, score: 0.43, length: 2)
## ├─ SampleNode(id: 35603, stats: 5/10, score: 1.23, length: 4)
## │  ├─ SampleNode(id: 55394, stats: 1/4, score: 1.32, length: 6)
## │  │  ├─ SampleNode(id: 20737, stats: 0/1, score: 1.67, length: 7)
## │  │  └─ SampleNode(id: 52910, stats: 0/1, score: 1.67, length: 7)
## │  └─ SampleNode(id: 43094, stats: 3/4, score: 1.82, length: 6)
## │     ├─ SampleNode(id: 14966, stats: 1/1, score: 2.67, length: 7)
## │     └─ SampleNode(id: 32991, stats: 1/1, score: 2.67, length: 7)
## └─ SampleNode(id: 20506, stats: 1/4, score: 1.4, length: 4)
##    ├─ SampleNode(id: 37581, stats: 0/1, score: 1.67, length: 5)
##    └─ SampleNode(id: 46632, stats: 0/1, score: 1.67, length: 5)

# Lastly, let's check all the guesses AI made across all samples. 
# Our winning guess was ID 32991 (`out.active_sample_id`)

for sample in PostOrderDFS(out.samples)
    [println("ID: $(sample.id), Guess: $(msg.content)")
     for msg in sample.data if msg isa PT.AIMessage]
end
## ID: 20737, Guess: 50
## ID: 20737, Guess: 35
## ID: 20737, Guess: 37
## ID: 52910, Guess: 50
## ID: 52910, Guess: 35
## ID: 52910, Guess: 32
## ID: 14966, Guess: 50
## ID: 14966, Guess: 35
## ID: 14966, Guess: 33
## ID: 32991, Guess: 50
## ID: 32991, Guess: 35
## ID: 32991, Guess: 33
## etc...

Note that if there are multiple "branches", the model will see only its own feedback and that of its ancestors, not the feedback from other "branches". If you want to provide ALL feedback, set RetryConfig(; n_samples=1) to remove any "branching". The fixing will then be done sequentially in one conversation and the model will see all feedback (less powerful if the model falls into a bad state). Alternatively, you can tweak the feedback function.
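
The "own and ancestors only" rule can be pictured with a small toy tree. ToyNode and ancestor_feedback are hypothetical names used for illustration; they are not the library's SampleNode or its feedback collection logic.

```julia
# Toy tree node; `ToyNode` and `ancestor_feedback` are hypothetical names
struct ToyNode
    id::Int
    feedback::String
    parent::Union{Nothing, ToyNode}
end

# Walk from a node up to the root, collecting feedback in root-to-node order
function ancestor_feedback(node::ToyNode)
    collected = String[]
    current = node
    while current !== nothing
        isempty(current.feedback) || pushfirst!(collected, current.feedback)
        current = current.parent
    end
    return collected
end

root = ToyNode(1, "", nothing)
a = ToyNode(2, "Answer with 1 word only.", root)
b = ToyNode(3, "Answer in lowercase.", root)   # sibling branch of `a`
a1 = ToyNode(4, "It starts with y.", a)

ancestor_feedback(a1)  # ["Answer with 1 word only.", "It starts with y."]
```

The feedback attached to the sibling b never reaches a1, which is why a single long branch is needed if the model should see everything.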

See Also

References: airetry! is inspired by the Language Agent Tree Search paper and by the DSPy Assertions paper.

source


# PromptingTools.Experimental.AgentTools.print_samples (Function)

Pretty prints the samples tree starting from node. Usually, node is the root of the tree. Example: print_samples(aicall.samples).

source


# PromptingTools.AICode (Type)
julia
AICode(code::AbstractString; auto_eval::Bool=true, safe_eval::Bool=false, 
skip_unsafe::Bool=false, capture_stdout::Bool=true, verbose::Bool=false,
prefix::AbstractString="", suffix::AbstractString="", remove_tests::Bool=false, execution_timeout::Int = 60)

AICode(msg::AIMessage; auto_eval::Bool=true, safe_eval::Bool=false, 
skip_unsafe::Bool=false, skip_invalid::Bool=false, capture_stdout::Bool=true,
verbose::Bool=false, prefix::AbstractString="", suffix::AbstractString="", remove_tests::Bool=false, execution_timeout::Int = 60)

A mutable structure representing a code block (received from the AI model) with automatic parsing, execution, and output/error capturing capabilities.

Upon instantiation with a string, the AICode object automatically runs a code parser and executor (via PromptingTools.eval!()), capturing any standard output (stdout) or errors. This structure is useful for programmatically handling and evaluating Julia code snippets.

See also: PromptingTools.extract_code_blocks, PromptingTools.eval!

Workflow

  • Until cb::AICode has been evaluated, cb.success is set to nothing (and so are all other fields).

  • The text in cb.code is parsed (saved to cb.expression).

  • The parsed expression is evaluated.

  • Outputs of the evaluated expression are captured in cb.output.

  • Any stdout outputs (e.g., from println) are captured in cb.stdout.

  • If an error occurs during evaluation, it is saved in cb.error.

  • After successful evaluation without errors, cb.success is set to true. Otherwise, it is set to false and you can inspect the cb.error to understand why.
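
The workflow above (parse, evaluate, record the outcome) can be approximated with Base Julia alone. The helper toy_eval below is a hypothetical, heavily simplified sketch: it is not the AICode implementation and it omits stdout capture, timeouts and safety checks.

```julia
# Toy version of the parse -> evaluate -> record workflow.
# `toy_eval` is a hypothetical helper, not part of PromptingTools.
function toy_eval(code::AbstractString)
    try
        expression = Meta.parse(code)              # parse the text
        output = Core.eval(Module(), expression)   # evaluate in a fresh scratch module
        return (; success = true, output, error = nothing)
    catch err
        return (; success = false, output = nothing, error = err)
    end
end

ok = toy_eval("1 + 2")
ok.success, ok.output        # (true, 3)

bad = toy_eval("sqrt(-1)")
bad.success                  # false
bad.error isa DomainError    # true
```

Evaluating inside a fresh module keeps the snippet from touching variables in the caller's module, which is the same motivation the safe_eval option describes.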

Properties

  • code::AbstractString: The raw string of the code to be parsed and executed.

  • expression: The parsed Julia expression (set after parsing code).

  • stdout: Captured standard output from the execution of the code.

  • output: The result of evaluating the code block.

  • success::Union{Nothing, Bool}: Indicates whether the code block executed successfully (true), unsuccessfully (false), or has yet to be evaluated (nothing).

  • error::Union{Nothing, Exception}: Any exception raised during the execution of the code block.

Keyword Arguments

  • auto_eval::Bool: If set to true, the code block is automatically parsed and evaluated upon instantiation. Defaults to true.

  • safe_eval::Bool: If set to true, the code block checks for package operations (e.g., installing new packages) and missing imports, and then evaluates the code inside a bespoke scratch module. This is to ensure that the evaluation does not alter any user-defined variables or the global state. Defaults to false.

  • skip_unsafe::Bool: If set to true, we skip any lines in the code block that are deemed unsafe (eg, Pkg operations). Defaults to false.

  • skip_invalid::Bool: If set to true, we skip code blocks that do not even parse. Defaults to false.

  • verbose::Bool: If set to true, we print out any lines that are skipped due to being unsafe. Defaults to false.

  • capture_stdout::Bool: If set to true, we capture any stdout outputs (eg, test failures) in cb.stdout. Defaults to true.

  • prefix::AbstractString: A string to be prepended to the code block before parsing and evaluation. Useful to add some additional code definition or necessary imports. Defaults to an empty string.

  • suffix::AbstractString: A string to be appended to the code block before parsing and evaluation. Useful to check that tests pass or that an example executes. Defaults to an empty string.

  • remove_tests::Bool: If set to true, we remove any @test or @testset macros from the code block before parsing and evaluation. Defaults to false.

  • execution_timeout::Int: The maximum time (in seconds) allowed for the code block to execute. Defaults to 60 seconds.

Methods

  • Base.isvalid(cb::AICode): Check if the code block has executed successfully. Returns true if cb.success == true.

Examples

julia
code = AICode("println(\"Hello, World!\")") # Auto-parses and evaluates the code, capturing output and errors.
isvalid(code) # Output: true
code.stdout # Output: "Hello, World!\n"

We try to evaluate "safely" by default (eg, inside a custom module, to avoid changing user variables). You can avoid that with safe_eval=false:

julia
code = AICode("new_variable = 1"; safe_eval=false)
isvalid(code) # Output: true
new_variable # Output: 1

You can also call AICode directly on an AIMessage, which will extract the Julia code blocks, concatenate them and evaluate them:

julia
msg = aigenerate("In Julia, how do you create a vector of 10 random numbers?")
code = AICode(msg)
# Output: AICode(Success: True, Parsed: True, Evaluated: True, Error Caught: N/A, StdOut: True, Code: 2 Lines)

# show the code
code.code |> println
# Output: 
# numbers = rand(10)
# numbers = rand(1:100, 10)

# or copy it to the clipboard
code.code |> clipboard

# or execute it in the current module (=Main)
eval(code.expression)

source


# PromptingTools.Experimental.AgentTools.aicodefixer_feedback (Function)
julia
aicodefixer_feedback(cb::AICode; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(conversation::AbstractVector{<:PT.AbstractMessage}; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(msg::PT.AIMessage; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(aicall::AICall; max_length::Int = 512) -> NamedTuple(; feedback::String)

Generates feedback for an AI code fixing session based on the AICode block and/or the conversation history (which will be used to extract and evaluate a code block). The function is designed to be extensible for different types of feedback and code evaluation outcomes.

The high-level wrapper accepts a conversation and returns new kwargs for the AICall.

Individual feedback functions are dispatched on different subtypes of AbstractCodeOutcome and can be extended/overwritten to provide more detailed feedback.

See also: AIGenerate, AICodeFixer

Arguments

  • cb::AICode: AICode block to evaluate and provide feedback on.

  • max_length::Int=512: An optional argument that specifies the maximum length of the feedback message.

Returns

  • NamedTuple: A feedback message as a kwarg in NamedTuple based on the analysis of the code provided in the conversation.

Example

julia
cb = AICode(msg; skip_unsafe = true, capture_stdout = true)
new_kwargs = aicodefixer_feedback(cb)

new_kwargs = aicodefixer_feedback(msg)
new_kwargs = aicodefixer_feedback(conversation)

Notes

This function is part of the AI code fixing system, intended to interact with code in AIMessage and provide feedback on improving it.

The high-level wrapper accepts a conversation and returns new kwargs for the AICall.

It dispatches for the code feedback based on the subtypes of AbstractCodeOutcome below:

  • CodeEmpty: No code found in the message.

  • CodeFailedParse: Code parsing error.

  • CodeFailedEval: Runtime evaluation error.

  • CodeFailedTimeout: Code execution timed out.

  • CodeSuccess: Successful code execution.

You can override the individual methods to customize the feedback.
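
Dispatching feedback on outcome types, as described above, can be sketched with a toy hierarchy. All names below (ToyOutcome, ToyEmpty, ToyFailedEval, ToySuccess, toy_feedback, classify) and the messages are hypothetical; the library's own hierarchy is AbstractCodeOutcome with the subtypes listed in the notes.

```julia
# Hypothetical outcome types for illustration only
abstract type ToyOutcome end
struct ToyEmpty <: ToyOutcome end
struct ToyFailedEval <: ToyOutcome end
struct ToySuccess <: ToyOutcome end

# One method per outcome; adding or overriding a method changes the feedback
toy_feedback(::ToyEmpty) = "No code found. Please provide a Julia code block."
toy_feedback(::ToyFailedEval) = "The code failed at runtime. Please fix the error."
toy_feedback(::ToySuccess) = "The code executed successfully."

# Map (has_code, success) to an outcome type, then dispatch on it
classify(has_code::Bool, success::Bool) =
    !has_code ? ToyEmpty() : (success ? ToySuccess() : ToyFailedEval())

toy_feedback(classify(false, false))  # "No code found. Please provide a Julia code block."
toy_feedback(classify(true, true))    # "The code executed successfully."
```

Keeping one method per outcome means a user can redefine just one of them without touching the rest.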

source


# PromptingTools.Experimental.AgentTools.error_feedback (Function)
julia
error_feedback(e::Any; max_length::Int = 512)

Set of specialized methods to provide feedback on different types of errors (e).
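
As a rough illustration of what "specialized methods per error type" can look like, the sketch below dispatches on a few Base exception types and truncates the message to max_length characters. toy_describe and toy_error_feedback are hypothetical names, and the messages are made up; this is not the library's error_feedback.

```julia
# Hypothetical per-error-type messages (not the library's wording)
toy_describe(e::UndefVarError) = "Variable `$(e.var)` is not defined. Define it before use."
toy_describe(e::DomainError) = "Invalid input value: $(e.val)."
toy_describe(e::Exception) = "Error: " * sprint(showerror, e)   # generic fallback

# Truncate long messages, mirroring the idea of a `max_length` keyword
toy_error_feedback(e; max_length::Int = 512) = first(toy_describe(e), max_length)

toy_error_feedback(UndefVarError(:x))                  # "Variable `x` is not defined. Define it before use."
toy_error_feedback(DomainError(-1); max_length = 13)   # "Invalid input"
```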

source