A living architecture exercise that composes the cookbook capabilities into a public-service style assistant.

Learning stageLiving worked exampleProof statusArchitecture sketch live; implementation proofs will be linked as they land.

Worked example: Danish citizen-facing LLM

Scenario

A Danish public-service organization wants an assistant that helps citizens understand approved guidance, find relevant forms, and know when to contact a human case worker. The assistant must not invent rights, make case decisions, or hide uncertainty.

Learning goal

Use the cookbook as a design checklist. The exercise is not to build a chatbot. The exercise is to design the system around it.

Capability map

AI capability as a system: define the layers before choosing the model.
Adoption through artifacts: build a small inspectable prototype with one real citizen journey.
Model-agnostic workflows: keep model choice behind a route policy.
Multimodel orchestration: choose local, EU-hosted, or premium model routes by privacy and task type.
Guardrails: refuse unsupported advice, block prompt injection, and escalate high-impact uncertainty.
LLM evaluation: score normal, adversarial, Danish, and English prompts.
Red-teaming: attack retrieval, refusal, escalation, and source attribution.
Agent authority and secrets: keep tool access scoped and auditable.
Handover: leave operators with runbooks, scorecards, logs, and disable paths.

Architecture sketch

Frontend: simple citizen-facing interface with clear non-decision language.
Sources: approved public guidance, forms, and internal policy summaries that have been cleared for the assistant.
Retrieval: source-bounded search with citations or source references.
Model routing: default to an approved hosted/EU route for user-facing answers; use local or private routes for sensitive context preparation where needed.
Guardrails: topic boundary, prompt-injection boundary, personal-data handling, unsupported-answer refusal, and escalation language.
Evaluation: around 30 prompts, at least half adversarial, mixed Danish and English.
Red-team: scoped run against retrieval injection, overconfident legal-style advice, and escalation bypass.
Audit: log route, sources, guardrail decision, escalation decision, and operator-visible failure reason.
Handover: owner, verification command, scorecard, source update process, rollback, and review cadence.

Practice loop

Pick one concrete citizen journey.
Write five normal questions.
Write five questions that should refuse or escalate.
Identify approved sources and unsupported topics.
Draft the answer policy before drafting prompts.
Run the questions through the current architecture and update the capability pages based on failures.

Proof artifact

The finished proof should include:

One architecture note.
One prompt dataset.
One evaluation scorecard.
One red-team report.
One handover packet.
One public-safe summary of what changed after testing.

Current status

This is a living learning example. It is intentionally honest: the architecture is mapped, but the guardrails, evaluation, and red-team proofs still need to be built and linked.

Next build

Start with guardrails. Without a refusal/escalation boundary, the rest of the architecture is only a sketch.