
Overview

Bake configuration controls model training behavior, including datasets, training parameters, model adapters, and integrations. In bgit, bakes are configured in the BAKE section of your input.yml file.
Naming Best Practice: Always provide a name field for your bakes. bgit appends a hash to bake names (e.g., v1 becomes v1_abc123def456) to ensure uniqueness. Named bakes are easier to identify in recipe.yml and when tracking model lineage.

Core Configuration

Name

name (string, required)
Bake name identifier. Always provide a descriptive name.
BAKE:
  name: main_bake
bgit appends a hash to the name (e.g., main_bake_abc123def456) to ensure uniqueness. Use meaningful names like production_bake, experiment_formal_tone_bake, or yoda_personality_bake.

Datasets

datasets (array, required)
List of targets to use as training data.
BAKE:
  datasets:
    - target: coding_target
      weight: 0.7
    - target: math_target
      weight: 0.3
Each dataset has:
  • target (string, required): Target name
  • weight (float): Sampling weight (higher = more frequently sampled)
Using multiple targets: Including multiple targets in your bake acts as regularization to preserve past behavior. By combining targets from previous bakes with new targets, you can maintain the model's existing capabilities while adding new ones. This is particularly useful in sequential baking workflows where you want to build on previous work without losing what the model has already learned.
Finding past targets: Use bgit target ls to list all available targets, and bgit target <target_name> to view details of a specific target. This helps you recall which targets were used in previous bakes so you can reference them in new bake configurations.
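For example, to rediscover targets from earlier work before writing the datasets list (coding_target is just an example name):
# List all available targets
bgit target ls

# Show details for one target
bgit target coding_target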

Training Parameters

epochs (integer)
Number of training epochs.
BAKE:
  epochs: 3
micro_batch_size (integer)
Batch size per device.
BAKE:
  micro_batch_size: 1
gradient_accumulation_steps (integer)
Number of gradient accumulation steps; together with micro_batch_size this determines the effective batch size (see the worked example below).
BAKE:
  gradient_accumulation_steps: 4
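The two parameters above combine to set the effective batch size. A minimal worked example, using the standard per-device formula (whether bgit also multiplies by device count is not specified here):
BAKE:
  micro_batch_size: 1             # samples processed per device per step
  gradient_accumulation_steps: 4  # steps accumulated before each optimizer update
# Effective batch size per device = micro_batch_size x gradient_accumulation_steps
#                                 = 1 x 4 = 4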
total_trajectories (integer)
Total number of trajectories to use for training.
BAKE:
  total_trajectories: 1000
seed (integer)
Random seed for reproducibility.
BAKE:
  seed: 42

Model Configuration

model (object)
Model and adapter configuration.
BAKE:
  model:
    baked_adapter_config:
      r: 8
      lora_alpha: 16
      lora_dropout: 0.05
      bias: none
      target_modules: all-linear
Important: In bgit, you don’t configure parent_model_name manually. This is handled automatically:
  • First bake: Uses the repository’s base model (set during bgit init)
  • Sequential bakes: Automatically uses PARENT_MODEL from .bread (set after previous bakes)
Fields you can configure (see the sketch after this list):
  • baked_adapter_config: LoRA configuration (detailed in the next section)
  • type: Model type (advanced, defaults to "bake")
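Putting these together, a model block that sets both configurable fields might look like the sketch below (type: bake is shown purely for illustration, since it is already the default):
BAKE:
  model:
    type: bake            # default; only change for advanced use cases
    baked_adapter_config:
      r: 8
      lora_alpha: 16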

LoRA Configuration

model.baked_adapter_config (object)
LoRA (Low-Rank Adaptation) configuration.
BAKE:
  model:
    baked_adapter_config:
      r: 8                    # LoRA rank
      lora_alpha: 16          # Alpha parameter
      lora_dropout: 0.05      # Dropout rate
      bias: none              # Bias handling
      target_modules: all-linear  # Target modules
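For intuition, in the standard LoRA formulation (generic LoRA math, not bgit-specific) the adapter update to a weight matrix W is scaled by lora_alpha / r, where α denotes lora_alpha:
\Delta W = \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k}
With r: 8 and lora_alpha: 16 as above, the update is scaled by 16 / 8 = 2. Raising r increases adapter capacity (and parameter count), while lora_alpha tunes the strength of the update.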

Optimizer & Scheduler

optimizer (object)
Optimizer configuration.
BAKE:
  optimizer:
    learning_rate: 0.0001
scheduler (object)
Learning rate scheduler configuration.
BAKE:
  scheduler:
    type: huggingface

Advanced Configuration

DeepSpeed

deepspeed (object)
DeepSpeed ZeRO configuration.
BAKE:
  deepspeed:
    zero_optimization:
      stage: 2
ZeRO Stages:
  • Stage 0: Disabled
  • Stage 1: Optimizer state partitioning
  • Stage 2: + Gradient partitioning
  • Stage 3: + Parameter partitioning
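If you need Stage 3, the configuration keeps the same shape. The sketch below assumes bgit forwards standard DeepSpeed keys unchanged; offload_optimizer is a stock DeepSpeed option, not something documented for bgit here:
BAKE:
  deepspeed:
    zero_optimization:
      stage: 3
      offload_optimizer:   # standard DeepSpeed option; assumed to pass through
        device: cpu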

Complete Example

BAKE:
  name: production_bake
  
  datasets:
    - target: coding_target
      weight: 1.0
  
  epochs: 5
  micro_batch_size: 1
  gradient_accumulation_steps: 4
  total_trajectories: 10000
  seed: 42
  
  model:
    baked_adapter_config:
      r: 16
      lora_alpha: 32
      lora_dropout: 0.1
      bias: none
      target_modules: all-linear
  
  optimizer:
    learning_rate: 0.0001
  
  scheduler:
    type: huggingface
  
  deepspeed:
    zero_optimization:
      stage: 2

Minimal Example

The simplest bake configuration:
BAKE:
  name: main_bake
  datasets:
    - target: main_target  # References a named target
      weight: 1.0
This uses default values for all other parameters. Note that main_target must be the name of a target defined in your TARGET section, as sketched below.
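For instance, the two sections might sit together in input.yml like this (the remaining TARGET fields are elided, as they are documented separately):
TARGET:
  name: main_target
  # ... target configuration ...

BAKE:
  name: main_bake
  datasets:
    - target: main_target
      weight: 1.0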

Field Reference Table

Field                        Type     Required  Description
name                         string   Yes       Bake identifier
datasets                     array    Yes       Training data sources
epochs                       integer  No        Number of training epochs
micro_batch_size             integer  No        Batch size per device
gradient_accumulation_steps  integer  No        Gradient accumulation steps
total_trajectories           integer  No        Total training trajectories
seed                         integer  No        Random seed
model                        object   No        Model configuration
optimizer                    object   No        Optimizer settings
scheduler                    object   No        LR scheduler
deepspeed                    object   No        DeepSpeed config

Sequential Bakes

When running sequential bakes, the PARENT_MODEL is automatically set from .bread. You don’t need to configure it in input.yml:
# First bake
BAKE:
  name: initial_bake
  datasets:
    - target: main_target
      weight: 1.0
# Creates: user/repo/initial_bake_abc123/120

# Second bake (automatically uses initial_bake_abc123/120 as parent)
BAKE:
  name: refined_bake
  datasets:
    - target: main_target
      weight: 1.0
# Creates: user/repo/refined_bake_def456/150 (parent: initial_bake_abc123/120)

Next Steps