Layer 2 of 9

Architecture. Attention turns sequences into programs.

Pick structure before you touch GPUs.

Transformers route information across tokens using learned attention maps — effectively programmable memory bandwidth.

Encoder stacks summarize context; decoder stacks generate tokens autoregressively; hybrids mix both.

What this layer does

Why attention works

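Each token emits a query that is scored against the keys of every token in the sequence. A softmax over those scores yields differentiable routing weights, which decide how much of each token's value flows into the output.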
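The query/key routing described here can be sketched in a few lines. This is a minimal single-head, unmasked version in plain Python, written for illustration only; the function names and toy vectors are assumptions, not part of Deadwood.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking.

    queries, keys: lists of vectors with equal dimension d.
    values: one vector per key.
    Returns (outputs, weights); weights[i][j] is how much token i reads from token j.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Score this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Output is the weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights

# Toy input: three tokens with 2-dimensional queries, keys, and values.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outputs, weights = attention(q, k, v)
```

Each row of `weights` sums to 1, so every output is a blend of the value vectors. Because the blend is smooth in the scores, gradients can adjust where each token looks.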

BERT-style models predict masked tokens; GPT-style models predict the next token; T5 frames tasks as text-to-text.
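To make the three objectives concrete, here is a hedged sketch of how each one turns the same text into training examples. The helper names are hypothetical, and the masking is simplified: real BERT-style pipelines also sometimes swap in random tokens or leave the chosen token unchanged.

```python
import random

MASK = "[MASK]"  # placeholder token; the name follows BERT's convention

def masked_lm_example(tokens, mask_prob=0.15, seed=0):
    """BERT-style: hide some tokens; the model predicts only the hidden ones."""
    rng = random.Random(seed)
    inputs, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets[i] = tok  # position -> original token
        else:
            inputs.append(tok)
    return inputs, targets

def next_token_examples(tokens):
    """GPT-style: every prefix predicts the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def text_to_text_example(task, source, target):
    """T5-style: the task is named in the input string; the answer is a string too."""
    return f"{task}: {source}", target

sentence = ["attention", "routes", "information", "across", "tokens"]
masked_inputs, masked_targets = masked_lm_example(sentence, mask_prob=0.5)
causal_pairs = next_token_examples(sentence)
t5_pair = text_to_text_example("translate English to German", "hello", "hallo")
```

The masked objective sees context on both sides of each gap, while the next-token objective only ever sees the prefix, which is why the latter suits generation.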

Parameter counts (7B, 70B, etc.) give the total number of learned weights. They do not pin down depth, width, or head count; those come from the blueprint template, which you still must choose responsibly.
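As a rough illustration of what a label like 7B counts, the sketch below estimates the parameters of a decoder-only Transformer from its depth and width. It assumes about 12·d² weights per block (4·d² for attention, 8·d² for a feed-forward layer of hidden size 4·d) and ignores biases and layer norms. The configuration shown is illustrative, not a Deadwood template.

```python
def estimate_parameters(n_layers, d_model, vocab_size, tied_embeddings=True):
    """Rough decoder-only parameter count, ignoring biases and layer norms.

    Assumes each block holds about 4*d^2 attention weights (Q, K, V, output)
    plus about 8*d^2 feed-forward weights (hidden size 4*d), i.e. 12*d^2 total.
    """
    per_block = 12 * d_model * d_model
    embeddings = vocab_size * d_model
    if not tied_embeddings:
        embeddings *= 2  # separate input and output embedding matrices
    return n_layers * per_block + embeddings

# Illustrative shape in the "7B" size class.
total = estimate_parameters(n_layers=32, d_model=4096, vocab_size=32000)
print(f"{total / 1e9:.2f}B parameters")  # prints "6.57B parameters"
```

Different depth and width combinations can land on nearly the same total, which is why the headline number alone does not determine the architecture.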

The problem without Deadwood

Without custodianship, your team inherits every sharp edge below.

  • Survey dozens of papers comparing widths, depths, RoPE vs sinusoidal PE.
  • Prototype tiny Transformer forks to benchmark throughput.
  • Negotiate licensing for proprietary modifications.

Typical DIY cost

Timeline: 6–12 weeks experimentation
Budget: $80k–$250k research time
Expertise: Research engineers + GPU profiling

Deadwood's solution

Opinionated APIs wire custodied data, runners, and proofs together — no boilerplate archaeology.

from deadwood import ArchitectureLab

lab = ArchitectureLab()

blueprint = lab.select(
    modality="text",
    context_window=8192,
    inference_budget="a100-40gb",
)

blueprint.summary()  # depth · heads · MoE flags

How Deadwood custodies this layer

ArchitectureLab encodes hardware envelopes — memory ceilings, tensor parallel widths, and Snowflake-side preprocessing assumptions.

Instead of guessing head counts, you declare throughput targets and Deadwood snaps to reviewed templates.
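One way to picture snapping to a template under a memory ceiling is the sketch below. It is not Deadwood's actual selection logic: the `Template` class, the template names and shapes, and the sizing formulas are all assumptions made for illustration. It assumes 16-bit weights, a single sequence, and a full multi-head key/value cache.

```python
from dataclasses import dataclass

BYTES_PER_VALUE = 2  # assumes fp16/bf16 weights and cache

@dataclass(frozen=True)
class Template:
    """Hypothetical architecture template; not a Deadwood type."""
    name: str
    n_layers: int
    d_model: int
    vocab_size: int = 32000

    def parameters(self):
        # About 12 * d^2 per block plus tied embeddings; biases and norms ignored.
        return self.n_layers * 12 * self.d_model ** 2 + self.vocab_size * self.d_model

    def memory_bytes(self, context_window):
        weights = self.parameters() * BYTES_PER_VALUE
        # Keys and values for every layer and position, single sequence.
        kv_cache = 2 * self.n_layers * context_window * self.d_model * BYTES_PER_VALUE
        return weights + kv_cache

def pick_template(templates, memory_ceiling_bytes, context_window):
    """Return the largest template that fits under the ceiling, or None."""
    fitting = [t for t in templates if t.memory_bytes(context_window) <= memory_ceiling_bytes]
    return max(fitting, key=Template.parameters, default=None)

# Made-up templates for the example.
TEMPLATES = [
    Template("small", n_layers=32, d_model=4096),
    Template("medium", n_layers=40, d_model=5120),
    Template("large", n_layers=60, d_model=6656),
]

choice = pick_template(TEMPLATES, memory_ceiling_bytes=40 * 10**9, context_window=8192)
print(choice.name)  # prints "medium"
```

Under these assumptions the "medium" template needs roughly 32 GB at an 8192-token context, so it is the largest one that fits a 40 GB budget. A real envelope would also have to account for activations, batch size, and parallelism.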

Next steps

Continue the tour

Follow how custody chains into Weights.
