Agent-Era Prompting Summary (Nate B. Jones Transcript)¶

Source discussed by operator: "If You're Prompting Like It's Last Month, You're Already Late" (February 2026).

This summary captures the transcript's design-relevant claims for GovZero and gzkit.

Core Thesis¶

Prompting has split into four different disciplines. Chat-era prompt craft alone is no longer sufficient for long-running autonomous agents.

Agents that run for hours or days require intent, context, and specification to be encoded before execution starts.

Four Disciplines¶

Prompt Craft
Clear instructions, examples/counterexamples, guardrails, output shape, ambiguity handling.
Still required, but now table stakes.
Context Engineering
Curate the token environment, not just one prompt.
Control surfaces, retrieval, memory, and project conventions determine quality ceiling.
Intent Engineering
Encode goals, values, trade-off rules, and escalation boundaries.
Prevents "optimizing the wrong metric" failures.
Specification Engineering
Write executable specs for work that spans long time horizons.
Treat organizational documents as machine-operable specifications.

Five Specification Primitives¶

Self-contained problem statements
Task is solvable without implicit tribal context.
Acceptance criteria
"Done" is independently verifiable.
Constraint architecture
Musts, must-nots, preferences, escalation triggers.
Decomposition
Work broken into independently executable/verifiable units.
Evaluation design
Recurring test/eval cases with regression detection after updates.

TDD and BDD Mapping¶

gzkit's existing gate model directly supports Nate's framework:

Gate 2 (TDD) strengthens acceptance criteria and evaluation design through executable unit-test evidence.
Gate 4 (BDD) strengthens self-contained problem statements and intent alignment through behavior-level contract checks.
Together, TDD + BDD reduce "looks-right" outputs by enforcing measurable correctness at both implementation and behavior layers.

Why This Matters For GovZero¶

GovZero already encodes several agent-era strengths:

Gate model and lane doctrine
Human authority boundary (attestation)
Evidence-first closeout and audit flow
Canonical control surfaces (AGENTS.md, CLAUDE.md, instructions, skills)

The transcript implies the next maturity step: readiness must be measured as a runtime contract, not only described in docs.

gzkit Design Implications¶

Make readiness executable
Add and maintain gz readiness audit with deterministic PASS/FAIL semantics.
Keep context surfaces synchronized
Discovery index, control surfaces, runbooks, and command docs must stay coherent.
Elevate specification quality
OBPI/ADR templates must preserve self-contained tasks, constraints, and acceptance criteria.
Institutionalize evaluation
Include parity/readiness regression checks in routine quality gates.
Preserve human judgment boundary
Automation can score readiness, but final authority remains with human attestation.

Practical Operating Rule¶

For long-running agent tasks, do not rely on real-time correction. Encode context, intent, constraints, and verification criteria up front, then run deterministic checks before completion claims.

Companion Reference¶

For a deeper practitioner-level synthesis with Anthropic/OpenAI corroboration, see: docs/user/reference/agent-input-disciplines.md