Agent Tools Introduction
AgentTools
is an experimental module that provides a set of utilities for building advanced agentic workflows, code-generating and self-fixing agents.
Import the module as follows:
using PromptingTools.Experimental.AgentTools
# to access unexported functionality
const AT = PromptingTools.Experimental.AgentTools
Highlights
The main functions to be aware of are:
- AIGenerate - Lazy counterpart of aigenerate(). All ai* functions have a corresponding AI*::AICall struct that allows for deferred execution (triggered by the run! method).
- last_output, last_message - Simple utilities to access the last output and message of AI calls like AIGenerate.
- airetry! - A utility to automatically retry the AI call with the same inputs if the AI model fails to generate a valid output. It allows retrying many times and providing feedback to the AI model about the failure to increase its robustness. AIGenerate and other AI calls have a field config::RetryConfig where you can globally adjust the retrying behavior.
- print_samples - airetry! implements a Monte Carlo Tree Search (MCTS) under the hood when trying to find the best way to fix the AI model's failure. print_samples is a utility to print the "samples" generated by the MCTS to better understand the attempts made by the AI model to fix the failure.
- AICode extensions like aicodefixer_feedback and error_feedback - AICode is a wrapper that extracts any Julia code provided in the AIMessage (response from the AI model) and executes it (including catching any errors). aicodefixer_feedback and error_feedback are utilities that automatically review the outcome of an AICode evaluation and generate the corresponding feedback for the AI model.
The main contribution of this module is providing the "lazy" counterparts to the ai...
functions, which allow us to build a workflow, which can be re-executed many times with the same inputs.
For example, AIGenerate()
will create a lazy instance of aigenerate
, which is an instance of AICall
with aigenerate
as its ai-calling function. It uses exactly the same arguments and keyword arguments as aigenerate
(see ?aigenerate
for details). The notion of "lazy" refers to the fact that it does NOT generate any output when instantiated (only when run!
is called).
Or said differently, the AICall
struct and all its flavors (AIGenerate
, ...) are designed to facilitate a deferred execution model (lazy evaluation) for AI functions that interact with a Large Language Model (LLM). It stores the necessary information for an AI call and executes the underlying AI function only when supplied with a UserMessage
or when the run!
method is applied. This allows us to remember user inputs and trigger the LLM call repeatedly if needed, which enables automatic fixing (see ?airetry!
).
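To make the deferred-execution idea concrete, here is a minimal sketch in plain Julia. It does not use PromptingTools: LazyCall, run_lazy! and shout are hypothetical names invented for this illustration, and the real AICall stores more state (schema, conversation, retry configuration).

```julia
# Hypothetical stand-in for the lazy-call pattern (NOT the PromptingTools implementation)
mutable struct LazyCall{F <: Function}
    func::F                        # function to call later
    args::Tuple                    # positional arguments captured at construction
    kwargs::NamedTuple             # keyword arguments captured at construction
    output::Any                    # stays `nothing` until the call is run
    success::Union{Nothing, Bool}  # `nothing` means "not run yet"
end

# Capture the inputs only; nothing is executed here
function LazyCall(func::Function, args...; kwargs...)
    return LazyCall{typeof(func)}(func, args, (; kwargs...), nothing, nothing)
end

# Execute the stored call and record the outcome; can be repeated with the same inputs
function run_lazy!(call::LazyCall)
    try
        call.output = call.func(call.args...; call.kwargs...)
        call.success = true
    catch err
        call.output = err
        call.success = false
    end
    return call
end

# Toy function standing in for an ai* call
shout(text; suffix = "!") = uppercase(text) * suffix

call = LazyCall(shout, "say hi"; suffix = "!!")
println(call.success)  # the call has not been executed yet
run_lazy!(call)
println(call.output)
```

Because the inputs live in the struct, run_lazy!(call) can be invoked again at any time, which is the property that retry logic builds on.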
Examples
Automatic Fixing of AI Calls
We need to switch from aigenerate
to AIGenerate
to get the lazy version of the function.
output = AIGenerate("Say hi!"; model="gpt4t") |> run!
How is it useful? We can use the same "inputs" for repeated calls, eg, when we want to validate or regenerate some outputs. We have a function airetry!
to help us with that.
The signature of airetry!
is airetry!(condition_function, aicall::AICall, feedback_function)
.
It evaluates the condition condition_function
on the aicall
object (eg, we evaluate f_cond(aicall) -> Bool
). If it fails, we call feedback_function
on the aicall
object to provide feedback for the AI model (eg, f_feedback(aicall) -> String
) and repeat the process until it passes or until max_retries
value is exceeded.
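The condition / feedback / retry cycle described above can be sketched without any API calls. This is a simplified, linear illustration only: retry_with_feedback and toy_model are hypothetical names, and the sketch keeps a single conversation instead of the tree of samples that airetry! maintains.

```julia
# Hypothetical helper illustrating the condition -> feedback -> retry cycle
function retry_with_feedback(generate::Function, f_cond::Function, f_feedback::Function;
        max_retries::Int = 3)
    feedback = String[]            # feedback accumulated so far
    output = generate(feedback)    # initial attempt (no feedback yet)
    retries = 0
    while !f_cond(output) && retries < max_retries
        push!(feedback, f_feedback(output))  # explain what went wrong
        output = generate(feedback)          # try again with the feedback visible
        retries += 1
    end
    return (; output, retries, success = f_cond(output))
end

# Toy "model": answers verbosely until it has received at least one piece of feedback
toy_model(feedback) = isempty(feedback) ? "I think it is yellow" : "yellow"

result = retry_with_feedback(
    toy_model,
    out -> length(split(out)) == 1,
    out -> "You answered '$(out)'. Answer with 1 word only.")
println(result)
```

Here the first answer fails the one-word check, the feedback is recorded, and the second attempt passes, so the loop stops after a single retry.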
We can catch API failures (no feedback needed, so none is provided):
# API failure because of a non-existent model
# RetryConfig allows us to change the "retry" behaviour of any lazy call
output = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
model = "NOTEXIST")
run!(output) # fails
# we ask to wait 2s between retries and retry 2 times (can be set in `config` in aicall as well)
airetry!(isvalid, output; retry_delay = 2, max_retries = 2)
Or we can use it for output validation (eg, its format, its content, etc.) and feedback generation.
Let's play a color guessing game (I'm thinking "yellow"). We'll implement two formatting checks with airetry!
:
# Notice that we ask for two samples (`n_samples=2`) at each attempt (to improve our chances).
# Both guesses are scored at each time step, and the best one is chosen for the next step.
# And with OpenAI, we can set `api_kwargs = (;n=2)` to get both samples simultaneously (cheaper and faster)!
out = AIGenerate(
"Guess what color I'm thinking. It could be: blue, red, black, white, yellow. Answer with 1 word only";
verbose = false,
config = RetryConfig(; n_samples = 2), api_kwargs = (; n = 2))
run!(out)
## Check that the output is 1 word only, third argument is the feedback that will be provided if the condition fails
## Notice: functions operate on `aicall` as the only argument. We can use utilities like `last_output` and `last_message` to access the last message and output in the conversation.
airetry!(x -> length(split(last_output(x), r" |\.")) == 1, out,
"You must answer with 1 word only.")
# Note: you could also use the do-syntax, eg,
airetry!(out, "You must answer with 1 word only.") do aicall
    length(split(last_output(aicall), r" |\.")) == 1
end
You can even add the guessing itself as an airetry!
condition of last_output(out) == "yellow"
and provide feedback if the guess is wrong.
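It can help to try a condition on plain strings before wiring it into airetry!. The sketch below applies the same split-based check to a few hand-written answers; is_one_word is a hypothetical helper name, and no AI call is involved.

```julia
# Same check as in the airetry! condition, applied to a plain string
is_one_word(text) = length(split(text, r" |\.")) == 1

for answer in ["yellow", "light blue", "yellow."]
    println(answer, " => ", is_one_word(answer))
end
```

Note that a trailing period also fails the check, because split keeps the empty piece after the final separator.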
References
AIGenerate(args...; kwargs...)
Creates a lazy instance of aigenerate
. It is an instance of AICall
with aigenerate
as the function.
Use exactly the same arguments and keyword arguments as aigenerate
(see ?aigenerate
for details).
AICall(func::F, args...; kwargs...) where {F<:Function}
AIGenerate(args...; kwargs...)
AIEmbed(args...; kwargs...)
AIExtract(args...; kwargs...)
A lazy call wrapper for AI functions in the PromptingTools
module, such as aigenerate
.
The AICall
struct is designed to facilitate a deferred execution model (lazy evaluation) for AI functions that interact with a Large Language Model (LLM). It stores the necessary information for an AI call and executes the underlying AI function only when supplied with a UserMessage
or when the run!
method is applied. This approach allows for more flexible and efficient handling of AI function calls, especially in interactive environments.
See also: run!
, AICodeFixer
Fields
- func::F: The AI function to be called lazily. This should be a function like aigenerate or other ai* functions.
- schema::Union{Nothing, PT.AbstractPromptSchema}: Optional schema to structure the prompt for the AI function.
- conversation::Vector{PT.AbstractMessage}: A vector of messages that forms the conversation context for the AI call.
- kwargs::NamedTuple: Keyword arguments to be passed to the AI function.
- success::Union{Nothing, Bool}: Indicates whether the last call was successful (true) or not (false). Nothing if the call hasn't been made yet.
- error::Union{Nothing, Exception}: Stores any exception that occurred during the last call. Nothing if no error occurred or if the call hasn't been made yet.
Example
Initiate an AICall
like any ai* function, eg, AIGenerate
:
aicall = AICall(aigenerate)
# With arguments and kwargs like ai* functions
# from `aigenerate(schema, conversation; model="abc", api_kwargs=(; temperature=0.1))`
# to
aicall = AICall(aigenerate, schema, conversation; model="abc", api_kwargs=(; temperature=0.1))
# Or with a template
aicall = AIGenerate(:JuliaExpertAsk; ask="xyz", model="abc", api_kwargs=(; temperature=0.1))
Trigger the AICall with run!
(it returns the updated AICall
struct back):
aicall |> run!
You can also use AICall as a functor to trigger the AI call with a UserMessage or simply the text to send:
aicall(UserMessage("Hello, world!")) # Triggers the lazy call
result = run!(aicall) # Explicitly runs the AI call
This can be used to "reply" to the previous message / continue the stored conversation.
Notes
- The AICall struct is a key component in building flexible and efficient agentic pipelines.
- The lazy evaluation model allows for setting up the call parameters in advance and deferring the actual execution until it is explicitly triggered.
- This struct is particularly useful in scenarios where the timing of AI function execution needs to be deferred or where multiple potential calls need to be prepared and selectively executed.
Extracts the last output (generated text answer) from the RAGResult.
Helpful accessor for AICall blocks. Returns the last output in the conversation (eg, the string/data in the last message).
Helpful accessor for the last generated output (msg.content
) in conversation
. Returns the last output in the conversation (eg, the string/data in the last message).
last_output(mem::ConversationMemory)
Get the last AI message in the conversation.
PT.last_message(result::RAGResult)
Extract the last message from the RAGResult. It looks for final_answer
first, then answer
fields in the conversations
dictionary. Returns nothing
if not found.
Helpful accessor for AICall blocks. Returns the last message in the conversation.
Helpful accessor for the last message in conversation
. Returns the last message in the conversation.
last_message(mem::ConversationMemory)
Get the last message in the conversation.
airetry!(
f_cond::Function, aicall::AICallBlock, feedback::Union{AbstractString, Function} = "";
verbose::Bool = true, throw::Bool = false, evaluate_all::Bool = true, feedback_expensive::Bool = false,
max_retries::Union{Nothing, Int} = nothing, retry_delay::Union{Nothing, Int} = nothing)
Evaluates the condition f_cond
on the aicall
object. If the condition is not met, it will return the best sample to retry from and provide feedback
(string or function) to aicall
. That's why it's mutating. It will retry at most max_retries
times. With throw=true
, an error is thrown if the condition is not met after max_retries
retries.
Note: aicall
must be run first via run!(aicall)
before calling airetry!
.
Function signatures
- f_cond(aicall::AICallBlock) -> Bool, ie, it must accept the aicall object and return a boolean value.
- feedback can be a string or feedback(aicall::AICallBlock) -> String, ie, it must accept the aicall object and return a string.
You can leverage the last_message
, last_output
, and AICode
functions to access the last message, last output and execute code blocks in the conversation, respectively. See examples below.
Good Use Cases
- Retry with API failures/drops (add retry_delay=2 to wait 2s between retries)
- Check the output format / type / length / etc
- Check the output with an aiclassify call (LLM Judge) to catch unsafe/NSFW/out-of-scope content
- Provide hints to the model to guide it to the correct answer
Gotchas
- If controlling keyword arguments are set to nothing, they will fall back to the default values in aicall.config. You can override them by passing the keyword arguments explicitly.
- If there are multiple airetry! checks, they are evaluated sequentially. As long as throw==false, they will all be evaluated even if they failed previous checks.
- Only samples which passed previous evaluations are evaluated (sample.success is true). If there are no successful samples, the function will evaluate only the active sample (aicall.active_sample_id) and nothing else.
- Feedback from all "ancestor" evaluations is added upon retry, not feedback from the "siblings" or other branches. To have only ONE long BRANCH (no siblings), make sure to keep RetryConfig(; n_samples=1). That way the model will always see ALL previous feedback.
- We implement a version of Monte Carlo Tree Search (MCTS) to always pick the most promising sample to restart from (you can tweak the options in RetryConfig to change the behaviour).
- For a large number of parallel branches (ie, "shallow and wide trees"), you might benefit from switching scoring to scoring=ThompsonSampling() (similar to how Bandit algorithms work).
- Open-source/local models can struggle with overly long conversations, so you might want to experiment with in-place feedback (set RetryConfig(; feedback_inplace=true)).
Arguments
- f_cond::Function: A function that accepts the aicall object and returns a boolean value. Retry will be attempted if the condition is not met (f_cond -> false).
- aicall::AICallBlock: The aicall object to evaluate the condition on.
- feedback::Union{AbstractString, Function}: Feedback to provide if the condition is not met. If a function is provided, it must accept the aicall object as the only argument and return a string.
- verbose::Bool=true: If true, it will log the retry attempts and warnings.
- throw::Bool=false: If true, it will throw an error if the function f_cond does not return true after max_retries retries.
- evaluate_all::Bool=true: If true, it will evaluate all the "successful" samples in the aicall object. Otherwise, it will only evaluate the active sample.
- feedback_expensive::Bool=false: If false, it will provide feedback to all samples that fail the condition. If the feedback function is expensive to call (eg, another ai* function), set this to true and feedback will be provided only to the sample we will retry from.
- max_retries::Union{Nothing, Int}=nothing: Maximum number of retries. If not provided, it will fall back to the max_retries in aicall.config.
- retry_delay::Union{Nothing, Int}=nothing: Delay between retries in seconds. If not provided, it will fall back to the retry_delay in aicall.config.
Returns
- The aicall object with the updated conversation and samples (saves the evaluations and their scores/feedback).
Example
You can use airetry!
to catch API errors in run!
and auto-retry the call. RetryConfig
is how you influence all the subsequent retry behaviours - see ?RetryConfig
for more details.
# API failure because of a non-existent model
out = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
model = "NOTEXIST")
run!(out) # fails
# we ask to wait 2s between retries and retry 2 times (can be set in `config` in aicall as well)
airetry!(isvalid, out; retry_delay = 2, max_retries = 2)
If you provide arguments to the aicall, we try to honor them as much as possible in the following calls, eg, set low verbosity
out = AIGenerate("say hi!"; config = RetryConfig(; catch_errors = true),
model = "NOTEXIST", verbose=false)
run!(out)
# No info message, you just see `success = false` in the properties of the AICall
Let's show a toy example to demonstrate the runtime checks / guardrails for the model output. We'll play a color guessing game (I'm thinking "yellow"):
# Notice that we ask for two samples (`n_samples=2`) at each attempt (to improve our chances).
# Both guesses are scored at each time step, and the best one is chosen for the next step.
# And with OpenAI, we can set `api_kwargs = (;n=2)` to get both samples simultaneously (cheaper and faster)!
out = AIGenerate(
"Guess what color I'm thinking. It could be: blue, red, black, white, yellow. Answer with 1 word only";
verbose = false,
config = RetryConfig(; n_samples = 2), api_kwargs = (; n = 2))
run!(out)
## Check that the output is 1 word only, third argument is the feedback that will be provided if the condition fails
## Notice: functions operate on `aicall` as the only argument. We can use utilities like `last_output` and `last_message` to access the last message and output in the conversation.
airetry!(x -> length(split(last_output(x), r" |\.")) == 1, out,
"You must answer with 1 word only.")
## Let's ensure that the output is in lowercase - simple and short
airetry!(x -> all(islowercase, last_output(x)), out, "You must answer in lowercase.")
# [ Info: Condition not met. Retrying...
## Let's add final hint - it took us 2 retries
airetry!(x -> startswith(last_output(x), "y"), out, "It starts with \"y\"")
# [ Info: Condition not met. Retrying...
# [ Info: Condition not met. Retrying...
## We end up with the correct answer
last_output(out)
# Output: "yellow"
Let's explore how we got here. We save the various attempts in a "tree" (a SampleNode object). You can access it in out.samples
, which is the ROOT of the tree (top level). The currently "active" sample ID is out.active_sample_id
; its data is the same as the conversation
field in your AICall.
# Root node:
out.samples
# Output: SampleNode(id: 46839, stats: 6/12, length: 2)
# Active sample (our correct answer):
out.active_sample_id
# Output: 50086
# Let's obtain the active sample node with this ID - use getindex notation or function find_node
out.samples[out.active_sample_id]
# Output: SampleNode(id: 50086, stats: 1/1, length: 7)
# The SampleNode has two key fields: data and feedback. Data is where the conversation is stored:
active_sample = out.samples[out.active_sample_id]
active_sample.data == out.conversation # Output: true -> This is the winning guess!
We also get a clear view of the tree structure of all samples with print_samples
:
julia> print_samples(out.samples)
SampleNode(id: 46839, stats: 6/12, score: 0.5, length: 2)
├─ SampleNode(id: 12940, stats: 5/8, score: 1.41, length: 4)
│  ├─ SampleNode(id: 34315, stats: 3/4, score: 1.77, length: 6)
│  │  ├─ SampleNode(id: 20493, stats: 1/1, score: 2.67, length: 7)
│  │  └─ SampleNode(id: 50086, stats: 1/1, score: 2.67, length: 7)
│  └─ SampleNode(id: 2733, stats: 1/2, score: 1.94, length: 5)
└─ SampleNode(id: 48343, stats: 1/4, score: 1.36, length: 4)
   ├─ SampleNode(id: 30088, stats: 0/1, score: 1.67, length: 5)
   └─ SampleNode(id: 44816, stats: 0/1, score: 1.67, length: 5)
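The score column can be reproduced by hand. The printed values are consistent with a standard UCT (Upper Confidence bound for Trees) formula, wins/visits + sqrt(c * log(parent_visits) / visits) with c = 2. Both the formula and the constant are inferred from the numbers shown here, not taken from the package source, so treat this as an assumption; uct_score is a hypothetical helper.

```julia
# Assumed UCT formula; `exploration = 2.0` is inferred from the printed scores
function uct_score(wins, visits, parent_visits; exploration = 2.0)
    return wins / visits + sqrt(exploration * log(parent_visits) / visits)
end

# Node 12940: stats 5/8, its parent (the root) has 12 visits
println(round(uct_score(5, 8, 12); digits = 2))
# Node 34315: stats 3/4, its parent 12940 has 8 visits
println(round(uct_score(3, 4, 8); digits = 2))
# Node 50086: stats 1/1, its parent 34315 has 4 visits
println(round(uct_score(1, 1, 4); digits = 2))
```

This prints 1.41, 1.77 and 2.67, matching the tree. Rarely visited nodes get a large exploration bonus, which is why the leaves with a single visit carry the highest scores.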
You can use the id
to grab and inspect any of these nodes, eg,
out.samples[2733]
# Output: SampleNode(id: 2733, stats: 1/2, length: 5)
We can also iterate through all samples and extract whatever information we want with PostOrderDFS
or PreOrderDFS
(exported from AbstractTrees.jl)
for sample in PostOrderDFS(out.samples)
    # Data is the universal field for samples, we put `conversation` in there
    # Last item in data is the last message in conversation
    msg = sample.data[end]
    if msg isa PT.AIMessage # skip feedback
        # get only the message content, ie, the guess
        println("ID: $(sample.id), Answer: $(msg.content)")
    end
end
# ID: 20493, Answer: yellow
# ID: 50086, Answer: yellow
# ID: 2733, Answer: red
# ID: 30088, Answer: blue
# ID: 44816, Answer: blue
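The order of the IDs above follows from post-order traversal: all children are visited before their parent. A dependency-free sketch of that order, using a hypothetical ToyNode type and postorder_ids helper rather than SampleNode and AbstractTrees.jl:

```julia
# Hypothetical minimal tree node (the real SampleNode also carries data, feedback and stats)
struct ToyNode
    id::Int
    children::Vector{ToyNode}
end
ToyNode(id::Int) = ToyNode(id, ToyNode[])

# Post-order traversal: visit all children first, then the node itself
function postorder_ids(node::ToyNode, acc::Vector{Int} = Int[])
    for child in node.children
        postorder_ids(child, acc)
    end
    push!(acc, node.id)
    return acc
end

# Same shape as the samples tree printed earlier in this example
tree = ToyNode(46839, [
    ToyNode(12940, [
        ToyNode(34315, [ToyNode(20493), ToyNode(50086)]),
        ToyNode(2733)]),
    ToyNode(48343, [ToyNode(30088), ToyNode(44816)])])

println(postorder_ids(tree))
```

The leaves 20493, 50086, 2733, 30088 and 44816 appear in the same relative order as in the loop output; the inner nodes are skipped there because their last message is feedback rather than an AIMessage.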
Note: airetry!
will attempt to fix the model max_retries
times. If you set throw=true
, it will throw an ErrorException if the condition is not met after max_retries
retries.
Let's define a mini program to guess the number and use airetry!
to guide the model to the correct answer:
"""
    llm_guesser(user_number::Int)

Mini program to guess the number provided by the user (between 1-100).
"""
function llm_guesser(user_number::Int)
    @assert 1 <= user_number <= 100
    prompt = """
I'm thinking a number between 1-100. Guess which one it is.
You must respond only with digits and nothing else.
Your guess:"""
    ## 2 samples at a time, max 5 fixing rounds
    out = AIGenerate(prompt; config = RetryConfig(; n_samples = 2, max_retries = 5),
        api_kwargs = (; n = 2)) |> run!
    ## Check the proper output format - must parse to Int, use do-syntax
    ## We can provide feedback via a function!
    function feedback_f(aicall)
        "Output: $(last_output(aicall))\nFeedback: You must respond only with digits!!"
    end
    airetry!(out, feedback_f) do aicall
        !isnothing(tryparse(Int, last_output(aicall)))
    end
    ## Give a hint on bounds
    lower_bound = (user_number ÷ 10) * 10
    upper_bound = lower_bound + 10
    airetry!(
        out, "The number is between or equal to $lower_bound to $upper_bound.") do aicall
        guess = tryparse(Int, last_output(aicall))
        ## Guard against a non-numeric answer (tryparse returns nothing)
        !isnothing(guess) && lower_bound <= guess <= upper_bound
    end
    ## You can make at most 3x guess now -- if there is max_retries in `config.max_retries` left
    max_retries = out.config.retries + 3
    function feedback_f2(aicall)
        guess = tryparse(Int, last_output(aicall))
        isnothing(guess) && return "You must respond only with digits!!"
        "Your guess of $(guess) is wrong, it's $(abs(guess-user_number)) numbers away."
    end
    airetry!(out, feedback_f2; max_retries) do aicall
        tryparse(Int, last_output(aicall)) == user_number
    end
    ## Evaluate the best guess
    @info "Results: Guess: $(last_output(out)) vs User: $user_number (Number of calls made: $(out.config.calls))"
    return out
end
# Let's play the game
out = llm_guesser(33)
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Condition not met. Retrying...
[ Info: Results: Guess: 33 vs User: 33 (Number of calls made: 10)
Yay! We got it 😃
Now, we could explore different samples (eg, print_samples(out.samples)
) or see what the model guessed at each step:
print_samples(out.samples)
## SampleNode(id: 57694, stats: 6/14, score: 0.43, length: 2)
## ├─ SampleNode(id: 35603, stats: 5/10, score: 1.23, length: 4)
## │  ├─ SampleNode(id: 55394, stats: 1/4, score: 1.32, length: 6)
## │  │  ├─ SampleNode(id: 20737, stats: 0/1, score: 1.67, length: 7)
## │  │  └─ SampleNode(id: 52910, stats: 0/1, score: 1.67, length: 7)
## │  └─ SampleNode(id: 43094, stats: 3/4, score: 1.82, length: 6)
## │     ├─ SampleNode(id: 14966, stats: 1/1, score: 2.67, length: 7)
## │     └─ SampleNode(id: 32991, stats: 1/1, score: 2.67, length: 7)
## └─ SampleNode(id: 20506, stats: 1/4, score: 1.4, length: 4)
##    ├─ SampleNode(id: 37581, stats: 0/1, score: 1.67, length: 5)
##    └─ SampleNode(id: 46632, stats: 0/1, score: 1.67, length: 5)
# Lastly, let's check all the guesses AI made across all samples.
# Our winning guess was ID 32991 (`out.active_sample_id`)
for sample in PostOrderDFS(out.samples)
    [println("ID: $(sample.id), Guess: $(msg.content)")
     for msg in sample.data if msg isa PT.AIMessage]
end
## ID: 20737, Guess: 50
## ID: 20737, Guess: 35
## ID: 20737, Guess: 37
## ID: 52910, Guess: 50
## ID: 52910, Guess: 35
## ID: 52910, Guess: 32
## ID: 14966, Guess: 50
## ID: 14966, Guess: 35
## ID: 14966, Guess: 33
## ID: 32991, Guess: 50
## ID: 32991, Guess: 35
## ID: 32991, Guess: 33
## etc...
Note that if there are multiple "branches", the model will see only its own feedback and that of its ancestors, not the feedback of the other "branches". If you want to provide ALL feedback, set RetryConfig(; n_samples=1)
to remove any "branching". The fixing will then be done sequentially in one conversation and the model will see all feedback (less powerful if the model falls into a bad state). Alternatively, you can tweak the feedback function.
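The "own and ancestors only" rule can be pictured as walking parent links from a sample up to the root. The sketch below is illustrative only: FeedbackNode and ancestor_feedback are hypothetical names, not the package's SampleNode API.

```julia
# Hypothetical node that stores one piece of feedback and a link to its parent
mutable struct FeedbackNode
    feedback::String
    parent::Union{Nothing, FeedbackNode}
end

# Collect feedback from the root down to `node`, ignoring siblings and other branches
function ancestor_feedback(node::FeedbackNode)
    collected = String[]
    current = node
    while current !== nothing
        isempty(current.feedback) || pushfirst!(collected, current.feedback)
        current = current.parent
    end
    return collected
end

root = FeedbackNode("", nothing)
branch_a = FeedbackNode("You must answer with 1 word only.", root)
branch_b = FeedbackNode("You must answer in lowercase.", root)
leaf_a = FeedbackNode("It starts with y.", branch_a)

println(ancestor_feedback(leaf_a))
```

The feedback stored on branch_b never reaches leaf_a because it sits on a different branch. With a single sample per step the tree degenerates into one chain, so every later attempt sees all earlier feedback.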
See Also
References: airetry!
is inspired by the Language Agent Tree Search paper and by the DSPy Assertions paper.
Pretty prints the samples tree starting from node
. Usually, node
is the root of the tree. Example: print_samples(aicall.samples)
.
AICode(code::AbstractString; auto_eval::Bool=true, safe_eval::Bool=false,
skip_unsafe::Bool=false, capture_stdout::Bool=true, verbose::Bool=false,
prefix::AbstractString="", suffix::AbstractString="", remove_tests::Bool=false, execution_timeout::Int = 60)
AICode(msg::AIMessage; auto_eval::Bool=true, safe_eval::Bool=false,
skip_unsafe::Bool=false, skip_invalid::Bool=false, capture_stdout::Bool=true,
verbose::Bool=false, prefix::AbstractString="", suffix::AbstractString="", remove_tests::Bool=false, execution_timeout::Int = 60)
A mutable structure representing a code block (received from the AI model) with automatic parsing, execution, and output/error capturing capabilities.
Upon instantiation with a string, the AICode
object automatically runs a code parser and executor (via PromptingTools.eval!()
), capturing any standard output (stdout
) or errors. This structure is useful for programmatically handling and evaluating Julia code snippets.
See also: PromptingTools.extract_code_blocks
, PromptingTools.eval!
Workflow
- Until cb::AICode has been evaluated, cb.success is set to nothing (and so are all other fields).
- The text in cb.code is parsed (saved to cb.expression).
- The parsed expression is evaluated.
- Outputs of the evaluated expression are captured in cb.output.
- Any stdout outputs (e.g., from println) are captured in cb.stdout.
- If an error occurs during evaluation, it is saved in cb.error.
- After successful evaluation without errors, cb.success is set to true. Otherwise, it is set to false and you can inspect cb.error to understand why.
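The parse / evaluate / record steps of this workflow can be sketched in a few lines of plain Julia. This is a simplified illustration, not the AICode implementation: ToyCodeBlock and toy_eval! are hypothetical names, stdout capture and safety checks are left out, and evaluating inside a fresh module is only an assumption about how isolation could be achieved.

```julia
# Hypothetical, stripped-down code block (no stdout capture, no safety checks)
mutable struct ToyCodeBlock
    code::String
    expression::Any                  # parsed expression, set by toy_eval!
    output::Any                      # value of the last evaluated expression
    success::Union{Nothing, Bool}    # `nothing` until evaluated
    error::Any                       # whatever was thrown, if anything
end
ToyCodeBlock(code::AbstractString) = ToyCodeBlock(String(code), nothing, nothing, nothing, nothing)

# Parse, evaluate in a scratch module, and record the outcome
function toy_eval!(cb::ToyCodeBlock; mod::Module = Module(:Scratch))
    try
        cb.expression = Meta.parseall(cb.code)
        cb.output = Core.eval(mod, cb.expression)
        cb.success = true
    catch err
        cb.error = err
        cb.success = false
    end
    return cb
end

ok = toy_eval!(ToyCodeBlock("x = 2\nx^2"))
println(ok.success, " ", ok.output)

bad = toy_eval!(ToyCodeBlock("this_function_does_not_exist(1)"))
println(bad.success, " ", typeof(bad.error))
```

The first block evaluates cleanly, while the second records the thrown error instead of propagating it, which is what makes the result usable for generating feedback afterwards.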
Properties
- code::AbstractString: The raw string of the code to be parsed and executed.
- expression: The parsed Julia expression (set after parsing code).
- stdout: Captured standard output from the execution of the code.
- output: The result of evaluating the code block.
- success::Union{Nothing, Bool}: Indicates whether the code block executed successfully (true), unsuccessfully (false), or has yet to be evaluated (nothing).
- error::Union{Nothing, Exception}: Any exception raised during the execution of the code block.
Keyword Arguments
- auto_eval::Bool: If set to true, the code block is automatically parsed and evaluated upon instantiation. Defaults to true.
- safe_eval::Bool: If set to true, the code block checks for package operations (e.g., installing new packages) and missing imports, and then evaluates the code inside a bespoke scratch module. This is to ensure that the evaluation does not alter any user-defined variables or the global state. Defaults to false.
- skip_unsafe::Bool: If set to true, we skip any lines in the code block that are deemed unsafe (eg, Pkg operations). Defaults to false.
- skip_invalid::Bool: If set to true, we skip code blocks that do not even parse. Defaults to false.
- verbose::Bool: If set to true, we print out any lines that are skipped due to being unsafe. Defaults to false.
- capture_stdout::Bool: If set to true, we capture any stdout outputs (eg, test failures) in cb.stdout. Defaults to true.
- prefix::AbstractString: A string to be prepended to the code block before parsing and evaluation. Useful to add some additional code definition or necessary imports. Defaults to an empty string.
- suffix::AbstractString: A string to be appended to the code block before parsing and evaluation. Useful to check that tests pass or that an example executes. Defaults to an empty string.
- remove_tests::Bool: If set to true, we remove any @test or @testset macros from the code block before parsing and evaluation. Defaults to false.
- execution_timeout::Int: The maximum time (in seconds) allowed for the code block to execute. Defaults to 60 seconds.
Methods
- Base.isvalid(cb::AICode): Check if the code block has executed successfully. Returns true if cb.success == true.
Examples
code = AICode("println(\"Hello, World!\")") # Auto-parses and evaluates the code, capturing output and errors.
isvalid(code) # Output: true
code.stdout # Output: "Hello, World!\n"
We try to evaluate "safely" by default (eg, inside a custom module, to avoid changing user variables). You can avoid that with safe_eval=false
:
code = AICode("new_variable = 1"; safe_eval=false)
isvalid(code) # Output: true
new_variable # Output: 1
You can also call AICode directly on an AIMessage, which will extract the Julia code blocks, concatenate them and evaluate them:
msg = aigenerate("In Julia, how do you create a vector of 10 random numbers?")
code = AICode(msg)
# Output: AICode(Success: True, Parsed: True, Evaluated: True, Error Caught: N/A, StdOut: True, Code: 2 Lines)
# show the code
code.code |> println
# Output:
# numbers = rand(10)
# numbers = rand(1:100, 10)
# or copy it to the clipboard
code.code |> clipboard
# or execute it in the current module (=Main)
eval(code.expression)
aicodefixer_feedback(cb::AICode; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(conversation::AbstractVector{<:PT.AbstractMessage}; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(msg::PT.AIMessage; max_length::Int = 512) -> NamedTuple(; feedback::String)
aicodefixer_feedback(aicall::AICall; max_length::Int = 512) -> NamedTuple(; feedback::String)
Generate feedback for an AI code fixing session based on the AICode block or the conversation history (which will be used to extract and evaluate a code block). The function is designed to be extensible for different types of feedback and code evaluation outcomes.
The high-level wrapper accepts a conversation and returns new kwargs for the AICall.
Individual feedback functions are dispatched on different subtypes of AbstractCodeOutcome
and can be extended/overwritten to provide more detailed feedback.
See also: AIGenerate
, AICodeFixer
Arguments
- cb::AICode: AICode block to evaluate and provide feedback on.
- max_length::Int=512: An optional argument that specifies the maximum length of the feedback message.
Returns
NamedTuple
: A feedback message as a kwarg in NamedTuple based on the analysis of the code provided in the conversation.
Example
cb = AICode(msg; skip_unsafe = true, capture_stdout = true)
new_kwargs = aicodefixer_feedback(cb)
new_kwargs = aicodefixer_feedback(msg)
new_kwargs = aicodefixer_feedback(conversation)
Notes
This function is part of the AI code fixing system, intended to interact with code in AIMessage and provide feedback on improving it.
The high-level wrapper accepts a conversation and returns new kwargs for the AICall.
It dispatches for the code feedback based on the subtypes of AbstractCodeOutcome
below:
- CodeEmpty: No code found in the message.
- CodeFailedParse: Code parsing error.
- CodeFailedEval: Runtime evaluation error.
- CodeFailedTimeout: Code execution timed out.
- CodeSuccess: Successful code execution.
You can override the individual methods to customize the feedback.
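The dispatch-per-outcome design can be illustrated with a small self-contained sketch. All names below (ToyOutcome, ToyEmpty, ToyFailedEval, ToySuccess, toy_feedback) are hypothetical stand-ins; the package's own outcome types and feedback texts differ.

```julia
# Hypothetical outcome types (stand-ins for subtypes of AbstractCodeOutcome)
abstract type ToyOutcome end
struct ToyEmpty <: ToyOutcome end
struct ToyFailedEval <: ToyOutcome
    message::String
end
struct ToySuccess <: ToyOutcome end

# One method per outcome; multiple dispatch selects the matching feedback text
toy_feedback(::ToyEmpty) = "No code found. Please provide a Julia code block."
toy_feedback(o::ToyFailedEval) = "The code failed with: " * o.message
toy_feedback(::ToySuccess) = "The code executed successfully."

# Wrapper that truncates the text and returns it as a NamedTuple,
# mirroring the role of `max_length` and the `(; feedback)` return shape
function toy_feedback(o::ToyOutcome, max_length::Int)
    return (; feedback = first(toy_feedback(o), max_length))
end

println(toy_feedback(ToyEmpty(), 512).feedback)
println(toy_feedback(ToyFailedEval("UndefVarError: x not defined"), 20).feedback)
```

Customizing the feedback for one outcome then means redefining a single method, without touching the wrapper or the other outcomes.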
error_feedback(e::Any; max_length::Int = 512)
Set of specialized methods to provide feedback on different types of errors (e
).