Overview
Generators define how stimuli (questions/tasks) are created. You can use multiple generators in a single target, and they’ll be combined.
Generator Types
Oneshot
Let an LLM (Anthropic models only) generate stimuli for you.
- Generator type: must be "oneshot_qs"
- Model name: only Anthropic models are supported (e.g., "claude-sonnet-4-5-20250929")
- Number of questions to generate
- Generation temperature (0.0-2.0). Higher = more creative/random
- Optional: Path to a custom template file for question generation (template_path). Paths are normalized to /templates/{template_name} format. When template_content is provided, the file is written to the repository.
- Optional: Template file content, for upload (template_content). When provided along with template_path, the content is written to the template file in the repository. If you provide template_content without template_path, a default filename will be generated. It’s recommended to provide both template_path and template_content together.
Best Practice: Use just the filename (e.g., "my_template.txt") or the full normalized path (e.g., "templates/my_template.txt"). The system will normalize it to /templates/{template_name}.
Template Requirements:
- Must include ${numq} variable (number of questions)
- Must include ${prompt_u} variable (teacher prompt content)
- File size limit: 1MB
- Encoding: UTF-8
- Uses Python’s string.Template format
Template Variables:
- ${numq}: Number of questions to generate
- ${prompt_u}: Teacher prompt content (unconditioned stimulus)
Use template_path and template_content to customize the question generation style and format.
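The template requirements can be checked locally before uploading. The following is a minimal sketch, not the platform's own validation: check_template and normalize_template_path are hypothetical helpers, the 1MB limit is assumed to mean 1024 * 1024 bytes, and the normalization rule (keep only the filename) is a guess based on the examples given here.

```python
from pathlib import PurePosixPath
from string import Template

MAX_TEMPLATE_BYTES = 1024 * 1024  # assumption: "1MB" means 1 MiB
REQUIRED_VARS = {"numq", "prompt_u"}


def template_variables(text: str) -> set:
    """Collect $name and ${name} placeholders using string.Template's own regex."""
    found = set()
    for match in Template.pattern.finditer(text):
        name = match.group("named") or match.group("braced")
        if name:
            found.add(name)
    return found


def check_template(text: str) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(text.encode("utf-8")) > MAX_TEMPLATE_BYTES:
        problems.append("template exceeds 1MB")
    missing = REQUIRED_VARS - template_variables(text)
    if missing:
        problems.append("missing variables: " + ", ".join(sorted(missing)))
    return problems


def normalize_template_path(path: str) -> str:
    """Hypothetical normalization: keep the filename, place it under /templates/."""
    return "/templates/" + PurePosixPath(path).name


template_text = "Write ${numq} questions that probe this prompt:\n${prompt_u}\n"
problems = check_template(template_text)

# Render the template the way string.Template would with both variables supplied.
rendered = Template(template_text).substitute(numq=3, prompt_u="You are a pirate.")
print(problems)
print(rendered)
```

The same template_text string could then be passed as template_content, with "my_template.txt" as template_path.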
Hardcoded
Predefine a list of questions you want the prompted model to respond to.
- Generator type: must be "hardcoded"
- List of question strings
- Number of questions (should match length of questions array)
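Because the question count should match the length of the questions array, a quick consistency check can catch mismatches before a run. This is only a sketch: the key names "type" and "numq" are assumptions made for illustration (only "questions" and the "hardcoded" value appear in this page), and check_hardcoded is a hypothetical helper.

```python
def check_hardcoded(generator: dict) -> None:
    """Raise ValueError if the declared count differs from the number of questions."""
    questions = generator["questions"]
    declared = generator["numq"]  # assumed key name for the question count
    if declared != len(questions):
        raise ValueError(
            f"numq is {declared} but {len(questions)} questions were provided"
        )


hardcoded_generator = {
    "type": "hardcoded",  # assumed key name for the generator type
    "questions": [
        "What is the capital of France?",
        "Explain recursion in one sentence.",
    ],
    "numq": 2,
}

check_hardcoded(hardcoded_generator)  # passes silently when the counts agree
```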
Dataset Questions
Sample from established datasets like SQuAD, GSM8K, MMLU, HellaSwag.
- Generator type: must be "from_dataset"
- Dataset name (e.g., "squad", "gsm8k", "mmlu", "hellaswag")
- Number of questions to sample from the dataset
- Random seed for reproducible sampling
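To illustrate why a seed makes sampling reproducible, here is a small sketch using Python's standard random module. It is not the platform's sampler: sample_questions is a hypothetical helper, and the pool is a stand-in list rather than a real dataset.

```python
import random


def sample_questions(pool, numq, seed):
    """Draw numq distinct items from pool with a private, seeded generator."""
    rng = random.Random(seed)  # isolated RNG, so global random state is untouched
    return rng.sample(pool, numq)


# Stand-in for a dataset's questions (placeholder data, not real dataset content).
pool = [f"question {i}" for i in range(100)]

first = sample_questions(pool, 5, seed=42)
second = sample_questions(pool, 5, seed=42)
print(first == second)  # the same seed and pool give the same sample
```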
Common Generator Fields
These fields can be used with any generator type:
- If true, use conditioned_stimulus (student_prompt) for trajectory generation instead of unconditioned_stimulus (teacher_prompt). When set, adds a trajectory_override_stimulus field to the stimulus output. Default: false (trajectories use the unconditioned stimulus).
Persona
A dataset curated specifically to bake personas.
- Generator type: must be "persona"
- Number of questions to generate
- Random seed for reproducibility
- Generation temperature (0.0-2.0)
Combining Generators
Combining multiple generators on a single target creates more diverse training datasets.
Examples
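As a rough sketch of what a combined set of generators might look like, the list below mixes three generator types for one target. Every key name here ("type", "model", "dataset", "numq", "seed", "temperature") is an assumption made for illustration; only the type values, the model name, and the dataset names come from this page. The total assumes that combining simply concatenates each generator's stimuli.

```python
# Hypothetical generator list for a single target (key names are assumptions).
generators = [
    {
        "type": "hardcoded",
        "questions": ["What is 2 + 2?", "Name a prime number above 10."],
        "numq": 2,
    },
    {"type": "from_dataset", "dataset": "gsm8k", "numq": 50, "seed": 42},
    {
        "type": "oneshot_qs",
        "model": "claude-sonnet-4-5-20250929",
        "numq": 20,
        "temperature": 0.8,
    },
]


def total_stimuli(generators):
    """Sum the requested question counts, assuming outputs are concatenated."""
    return sum(g["numq"] for g in generators)


print(total_stimuli(generators))  # 2 + 50 + 20 = 72
```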
Code Generation
Code Generation with Custom Template
Math Problems
Specific Test Cases
Best Practices
Mix Generator Types
Combine different generators for diverse training data
Use Seeds for Reproducibility
Set seed values when using from_dataset or persona for reproducible results
Start with Hardcoded
Test your pipeline with hardcoded questions before scaling to AI generation
Tune Temperature
Adjust temperature based on creativity needs (lower = more focused, higher = more creative)
Use Custom Templates
For oneshot_qs generators, use template_path and template_content to customize question generation style. Templates use Python’s string.Template format with ${numq} and ${prompt_u} variables.