Layer 3 of 9

Weights. parameters are frozen histories of gradients.

Reuse giants; specialize safely.

Weights store everything the optimizer discovered — billions of floats distilled from data and schedules.

Fine-tuning nudges subsets toward domain nuance; LoRA freezes backbone tensors while learning thin adapters.

See Deadwood's solution Next layer: Training algorithm

What this layer does

Lifecycle

Foundation checkpoints ship from labs under licenses — Deadwood tracks lineage automatically.

Quantization maps FP32 tensors into INT8/INT4 blocks with calibration datasets.

Distillation trains compact students on teacher logits; merges fuse complementary adapters.

The problem without Deadwood

Without custodianship, your team inherits every sharp edge below.

Download opaque blobs from untrusted mirrors.
Hand-patch adapters incompatible with your tokenizer.
Rebuild quantization recipes whenever GPUs change.

Typical DIY cost

Timeline: 3–6 weeks per adaptation
Budget: $40k–$180k GPU burn
Expertise: ML engineers comfortable with safetensors + CUDA graphs

Deadwood's solution

Opinionated APIs wire custodied data, runners, and proofs together — no boilerplate archaeology.

from deadwood import ModelLibrary

library = ModelLibrary()

model = library.select(
    task="classification",
    constraint="memory",
    hardware="rtx4090",
)

finetuned = library.finetune(
    model="llama-7b",
    data=clean_data,
    method="lora",
)

How Deadwood custodies this layer

ModelLibrary aligns checkpoints with custodied datasets — mismatched tokenizers never silently corrupt batches.

Quantization presets adapt per SKU — LoRA ranks resize automatically when VRAM budgets swing.

Next steps

Continue the tour

Follow how custody chains into Training algorithm.

Next: Training algorithm

Run a workload

Provision runners and metered jobs — describe the outcome, not every knob.

Start a job

Talk to custodians

White-glove onboarding for regulated teams and bespoke stacks.

Schedule a demo

← Previous: Architecture