
Overview

Bake configuration controls model training behavior, including datasets, training parameters, model adapters, and integrations. In bgit, bakes are configured in the BAKE section of your input.yml file.
Naming Best Practice: Always provide a name field for your bakes. bgit appends a hash to bake names (e.g., v1 becomes v1_abc123def456) to ensure uniqueness. Named bakes are easier to identify in recipe.yml and when tracking model lineage.

Core Configuration

Name

name (string, required)
Bake name identifier. Always provide a descriptive name.
BAKE:
  name: main_bake
bgit appends a hash to the name (e.g., main_bake_abc123def456) to ensure uniqueness. Use meaningful names like production_bake, experiment_formal_tone_bake, or yoda_personality_bake.

Datasets

datasets (array, required)
List of targets to use as training data.
BAKE:
  datasets:
    - target: coding_target
      weight: 0.7
    - target: math_target
      weight: 0.3
Each dataset has:
  • target (string, required): Target name
  • weight (float): Sampling weight (higher = more frequently sampled)
Using multiple targets: Including multiple targets in your bake acts as regularization to preserve past behavior. By combining targets from previous bakes with new targets, you can maintain the model's existing capabilities while adding new ones. This is particularly useful in sequential baking workflows where you want to build on previous work without losing what the model has already learned.
Finding past targets: Use bgit target ls to list all available targets, and bgit target <target_name> to view details of a specific target. This helps you recall which targets were used in previous bakes so you can reference them in new bake configurations.
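For example, to rediscover targets from earlier work before writing the datasets list (coding_target is just an example name):
# List all available targets
bgit target ls

# Show details for one target
bgit target coding_target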

Training Parameters

epochs (integer)
Number of training epochs.
BAKE:
  epochs: 3
micro_batch_size (integer)
Batch size per device.
BAKE:
  micro_batch_size: 1
gradient_accumulation_steps (integer)
Number of gradient accumulation steps; together with micro_batch_size this determines the effective batch size (see the worked example below).
BAKE:
  gradient_accumulation_steps: 4
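The two parameters above combine to set the effective batch size. A minimal worked example, using the standard per-device formula (whether bgit also multiplies by device count is not specified here):
BAKE:
  micro_batch_size: 1             # samples processed per device per step
  gradient_accumulation_steps: 4  # steps accumulated before each optimizer update
# Effective batch size per device = micro_batch_size x gradient_accumulation_steps
#                                 = 1 x 4 = 4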
total_trajectories (integer)
Total number of trajectories to use for training.
BAKE:
  total_trajectories: 1000
seed (integer)
Random seed for reproducibility.
BAKE:
  seed: 42

Model Configuration

model (object)
Model and adapter configuration.
BAKE:
  model:
    baked_adapter_config:
      r: 8
      lora_alpha: 16
      lora_dropout: 0.05
      bias: none
      target_modules: all-linear
Important: In bgit, you don’t configure parent_model_name manually. This is handled automatically:
  • First bake: Uses the repository’s base model (set during bgit init)
  • Sequential bakes: Automatically uses PARENT_MODEL from .bread (set after previous bakes)
Fields you can configure (see the sketch after this list):
  • baked_adapter_config: LoRA configuration (detailed in the next section)
  • type: Model type (advanced, defaults to "bake")
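Putting these together, a model block that sets both configurable fields might look like the sketch below (type: bake is shown purely for illustration, since it is already the default):
BAKE:
  model:
    type: bake            # default; only change for advanced use cases
    baked_adapter_config:
      r: 8
      lora_alpha: 16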

LoRA Configuration

model.baked_adapter_config (object)
LoRA (Low-Rank Adaptation) configuration.
BAKE:
  model:
    baked_adapter_config:
      r: 8                    # LoRA rank
      lora_alpha: 16          # Alpha parameter
      lora_dropout: 0.05      # Dropout rate
      bias: none              # Bias handling
      target_modules: all-linear  # Target modules
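For intuition, in the standard LoRA formulation (generic LoRA math, not bgit-specific) the adapter update to a weight matrix W is scaled by lora_alpha / r, where α denotes lora_alpha:
\Delta W = \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k}
With r: 8 and lora_alpha: 16 as above, the update is scaled by 16 / 8 = 2. Raising r increases adapter capacity (and parameter count), while lora_alpha tunes the strength of the update.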

Optimizer & Scheduler

optimizer (object)
Optimizer configuration.
BAKE:
  optimizer:
    learning_rate: 0.0001
scheduler (object)
Learning rate scheduler configuration.
BAKE:
  scheduler:
    type: huggingface

Advanced Configuration

DeepSpeed

deepspeed (object)
DeepSpeed ZeRO configuration.
BAKE:
  deepspeed:
    zero_optimization:
      stage: 2
ZeRO Stages:
  • Stage 0: Disabled
  • Stage 1: Optimizer state partitioning
  • Stage 2: + Gradient partitioning
  • Stage 3: + Parameter partitioning
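If you need Stage 3, the configuration keeps the same shape. The sketch below assumes bgit forwards standard DeepSpeed keys unchanged; offload_optimizer is a stock DeepSpeed option, not something documented for bgit here:
BAKE:
  deepspeed:
    zero_optimization:
      stage: 3
      offload_optimizer:   # standard DeepSpeed option; assumed to pass through
        device: cpu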

Complete Example

BAKE:
  name: production_bake
  
  datasets:
    - target: coding_target
      weight: 1.0
  
  epochs: 5
  micro_batch_size: 1
  gradient_accumulation_steps: 4
  total_trajectories: 10000
  seed: 42
  
  model:
    baked_adapter_config:
      r: 16
      lora_alpha: 32
      lora_dropout: 0.1
      bias: none
      target_modules: all-linear
  
  optimizer:
    learning_rate: 0.0001
  
  scheduler:
    type: huggingface
  
  deepspeed:
    zero_optimization:
      stage: 2

Minimal Example

The simplest bake configuration:
BAKE:
  name: main_bake
  datasets:
    - target: main_target  # References a named target
      weight: 1.0
This uses default values for all other parameters. Note that main_target must be the name of a target defined in your TARGET section, as sketched below.
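For instance, the two sections might sit together in input.yml like this (the remaining TARGET fields are elided, as they are documented separately):
TARGET:
  name: main_target
  # ... target configuration ...

BAKE:
  name: main_bake
  datasets:
    - target: main_target
      weight: 1.0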

Field Reference Table

Field                        Type     Required  Description
name                         string   Yes       Bake identifier
datasets                     array    Yes       Training data sources
epochs                       integer  No        Number of training epochs
micro_batch_size             integer  No        Batch size per device
gradient_accumulation_steps  integer  No        Gradient accumulation steps
total_trajectories           integer  No        Total training trajectories
seed                         integer  No        Random seed
model                        object   No        Model configuration
optimizer                    object   No        Optimizer settings
scheduler                    object   No        LR scheduler
deepspeed                    object   No        DeepSpeed config

Sequential Bakes

When running sequential bakes, the PARENT_MODEL is automatically set from .bread. You don’t need to configure it in input.yml:
# First bake
BAKE:
  name: initial_bake
  datasets:
    - target: main_target
      weight: 1.0
# Creates: user/repo/initial_bake_abc123/120

# Second bake (automatically uses initial_bake_abc123/120 as parent)
BAKE:
  name: refined_bake
  datasets:
    - target: main_target
      weight: 1.0
# Creates: user/repo/refined_bake_def456/150 (parent: initial_bake_abc123/120)

Next Steps