Zero actionable failures — every driver gauge classified, every gap explained
Summary: Zero actionable failures — every driver gauge classified, every gap explained (v0.5.1b13).
Over the past few releases the cross-driver gauge pass rate has been climbing: 99.5% at v0.5.1b4, 99.9% by last week's refresh. The last 0.1% was a handful of failures of three kinds: those that could not be fixed in SecantusDB (a Java-driver SDAM cascade triggered by a server-side APIStrictError), those that reproduced only under heavy parallel load (two mongo-go-driver flakes), and one that assumed a multi-node replica-set deployment SecantusDB deliberately doesn't simulate (Ruby's w: 2 write-concern test). Reporting them as plain "failures" overstated the gap, but silently dropping them would let real regressions hide in the same column.
v0.5.1b13 introduces validation_summary/expected_failures.py — a small per-gauge registry of (pattern, rationale) entries. The cross-driver summary now separates "Failed" (unexpected, a real bug we need to fix) from "Expected" (a documented gap with a one-line reason that ships in the report). The pass-rate column stays honest; a new Adjusted column reports the rate excluding expected failures from the denominator — "how much of the conformable surface actually conforms." Current numbers: 7,186 tests, 6,254 passed, 0 unexpected failures, 5 expected failures, 927 skipped — 100.0% adjusted across every driver.
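As a rough illustration of the idea, here is a minimal sketch of what a per-gauge (pattern, rationale) registry and the two rate columns could look like. It is not the contents of validation_summary/expected_failures.py: the names EXPECTED_FAILURES, expected_reason, classify, and Summary, the gauge keys, and the regex patterns are all hypothetical. It also assumes that skipped tests are left out of both denominators and that expected failures still count against the plain pass rate, which is consistent with the 99.9% figure quoted earlier.

```python
import re
from dataclasses import dataclass

# Hypothetical registry shape: gauge name -> list of (pattern, rationale).
# Gauge keys, patterns, and wording are illustrative, not the shipped entries.
EXPECTED_FAILURES = {
    "java": [
        (r"apiStrict", "command-name gate disabled; enabling it triggers an SDAM cascade"),
    ],
    "ruby": [
        (r"w:\s*2\b", "assumes a multi-node replica set, which is not simulated"),
    ],
}


def expected_reason(gauge, test_id):
    """Return the rationale if test_id matches a registered pattern, else None."""
    for pattern, rationale in EXPECTED_FAILURES.get(gauge, []):
        if re.search(pattern, test_id):
            return rationale
    return None


def classify(gauge, failing_ids):
    """Split failing test ids into (unexpected list, expected {id: rationale})."""
    unexpected, expected = [], {}
    for test_id in failing_ids:
        reason = expected_reason(gauge, test_id)
        if reason is None:
            unexpected.append(test_id)
        else:
            expected[test_id] = reason
    return unexpected, expected


@dataclass
class Summary:
    passed: int = 0
    failed: int = 0      # unexpected failures
    expected: int = 0    # documented, expected failures
    skipped: int = 0

    @property
    def pass_rate(self):
        # Assumption: skipped tests are excluded; expected failures still count.
        ran = self.passed + self.failed + self.expected
        return 100.0 * self.passed / ran if ran else 0.0

    @property
    def adjusted_rate(self):
        # Expected failures are removed from the denominator.
        ran = self.passed + self.failed
        return 100.0 * self.passed / ran if ran else 0.0


# Totals quoted in the release notes above.
totals = Summary(passed=6254, failed=0, expected=5, skipped=927)
print(f"{totals.pass_rate:.1f}% raw, {totals.adjusted_rate:.1f}% adjusted")
# prints "99.9% raw, 100.0% adjusted"
```

Keeping the rationale next to the pattern means a test that starts failing for a new reason only stays in the Expected column if its id still matches a registered entry; anything else falls through to Failed.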
This release also bundles the gauge improvements that landed since v0.5.1b4: mapReduce returns a graceful empty result for non-canonical bodies (Java driver-core); $changeStream against a standalone topology is rejected with code 40573 (Java SessionsTest); Node CSOT explain-plus-timeoutMS tests pass via a new block_connection / block_time_ms failpoint pair; getParameter advertises authenticationMechanisms: ["SCRAM-SHA-256"] for the Java driver's SDAM probe; and createIndexes / create reject unknown options up front (Ruby's wildcardProjection + commitQuorum shapes). The Java apiStrict command-name gate stays disabled, because enabling it would fix one test and break six others via the SDAM cascade. That trade-off is the first entry in the new expected-failures registry.