Develop Your Test Case
All test cases are defined in definition.toml
files with the structure described below.
Folder Structure
Definitions are saved in the following file paths code_generation/category/test_case_name/definition.toml
.
Anatomy of definition.toml
Required fields in definition.toml
include:
- name: Corresponding to the file path.
- contributor: The creator of the test case (and their collaborators).
- criteria: The evaluation criteria (eg, parsing, execution, unit_tests, examples).
- prompt: The problem statement or task.
- version: The version of the test case. Starts at "1.0".
- examples: Example scenarios for testing, provided as a vector of executable statements using the function name (eg,
my_function(1, 2)
). - unit_tests: Tests to validate the code, provided as a vector of
@test X = Z
statements. - imports: Packages that are made available to the model (to avoid failures due to a failed dependency).
- reference_solution: A reference solution to the problem, provided as a string of Julia code (no code fences).
There are several optional fields:
- examples_setup: Code to run before each example eval, provided as a string of Julia code (no code fences). Used to setup any variables or functions needed for the examples.
- examples_teardown: Code to run after each example eval, provided as a string of Julia code (no code fences). Used to clean up any variables or functions needed for the examples.
- unittestssetup: Code to run before each unit test eval, provided as a string of Julia code (no code fences). Used to setup any variables or functions needed for the unit tests.
- unitteststeardown: Code to run after each unit test eval, provided as a string of Julia code (no code fences). Used to clean up any variables or functions needed for the unit tests.
The above fields can improve re-use of code across the examples/unit tests.
See an example in examples/create_definition.jl
.
You can validate your test case definitions with validate_definition
.
Feedback and Improvements
We highly value community input. If you have suggestions or ideas for improvement, please open an issue. All contributions are welcome!