Dungeon
Structured adversarial environments for evaluating latent world models
DWMB is a suite of compact, procedurally generated grid-world POMDPs designed to require agents to infer latent hazards, topology, and non-local causal triggers. Traps look like safe floor; switches far away arm or disarm them. We propose Preemptive Inference Rate (PIR): the fraction of hazards whose danger was predicted above threshold before first activation—separating safe latent inference from “unsafe success” by trial-and-error.
The document is a benchmark definition and preregistered evaluation protocol: formal POMDP spec, JSON schema, difficulty tiers T1–T5, uniform belief-extraction convention, baselines (model-free RL, MuZero-style, Dreamer/RSSM, LLM+memory, JEPA-style), and statistical analysis plan. No empirical results yet; the hypothesis that belief-based agents achieve higher PIR is falsifiable and pending validation.
Read the report