01 a drift detection probe for agent workflows

RVBY.

we catch hallucinations one step early.

Most tools catch hallucinations after they show up in the output. We think you can catch them earlier — inside the model's hidden states, before the bad reasoning becomes a bad token. Cheap when you can. Strong when you must.

02 the drift

an agent forgets halfway through.

An agent works through a complex 20-step task. By step 11 the model has wandered. The next step looks fluent, confident even, but it is no longer grounded in the source. By step 14 the whole chain is contaminated. Most observability tools only see this after the run. By then you've already paid for it.


09  extract entity from passage  0.94
10  cross-reference table B  0.91
11  derive year of acquisition  0.41 ⚠
12  summarize findings  → rerouted

03 the probe

drift shows up in hidden states first.

We trained a 1M-parameter probe to read the model's hidden activations and score whether reasoning is still grounded. Supervision comes from TRIBE v2 — Meta's brain encoder, trained on fMRI scans of people reading. It scores semantic alignment in a way NLI can't: graded, topical, source-aware.
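To make "read the hidden activations and score" concrete, here is a minimal sketch of a probe as a single logistic layer over one hidden-state vector. The architecture, the vector size, the random weights, and the 0.5 threshold are all assumptions for illustration; the page only states that the probe has about 1M parameters.

```python
import math
import random


def probe_score(hidden, weights, bias):
    """Logistic probe: map one hidden-state vector to a score in (0, 1)."""
    logit = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical stand-ins: a real probe would load trained weights and read
# activations from a chosen layer of the served model.
random.seed(0)
dim = 8
weights = [random.uniform(-1, 1) for _ in range(dim)]
hidden = [random.uniform(-1, 1) for _ in range(dim)]

score = probe_score(hidden, weights, bias=0.0)
is_grounded = score >= 0.5  # the threshold is an assumption, not a documented value
```

A score near 1 would read as "still grounded" and a score near 0 as "drifting". The work is one dot product per step, which is in line with a sub-millisecond CPU budget.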

In production TRIBE never runs. We distilled the signal once, offline. The probe runs in under a millisecond on CPU.
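One way to picture "distilled once, offline": collect pairs of hidden vectors and teacher scores, then fit a small probe to reproduce the teacher. The sketch below uses synthetic data and a linear probe trained by stochastic gradient descent on squared error. The data, the probe shape, and the hyperparameters are assumptions; nothing here calls TRIBE or reflects the actual training setup.

```python
import random


def train_probe(examples, dim, lr=0.1, epochs=200):
    """Fit a linear probe to teacher scores by SGD on squared error."""
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for hidden, teacher_score in examples:
            pred = sum(w * h for w, h in zip(weights, hidden)) + bias
            err = pred - teacher_score
            weights = [w - lr * err * h for w, h in zip(weights, hidden)]
            bias -= lr * err
    return weights, bias


# Synthetic "teacher": a fixed linear function of the hidden vector.
random.seed(0)
dim = 4
true_w = [0.2, -0.1, 0.3, 0.05]
examples = []
for _ in range(50):
    h = [random.uniform(-1, 1) for _ in range(dim)]
    examples.append((h, 0.5 + sum(w * x for w, x in zip(true_w, h))))

weights, bias = train_probe(examples, dim)
```

After training, only `weights` and `bias` are needed at inference time, which is why the teacher never has to run in production.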


04 the reroute

cheap when you can. strong when you must.

Each step starts on your cheap open-weight model. If the probe says grounded, we keep the answer. If it says drifting, we reroute that single step to a frontier model — Sonnet, GPT-4 — using your own API key. The chain heals. The next step starts cheap again.
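The per-step routing described here can be sketched as a short loop. The function names, the callable signatures, and the 0.5 threshold are hypothetical; the scores are borrowed from the example trace in section 02.

```python
from typing import Callable, List, Tuple


def run_chain(
    steps: List[str],
    cheap: Callable[[str], Tuple[str, float]],
    strong: Callable[[str], str],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Run each step on the cheap model; reroute a step only when its probe score is low."""
    results = []
    for step in steps:
        answer, score = cheap(step)  # cheap model answer plus its probe score
        if score >= threshold:
            results.append(("cheap", answer))
        else:
            # Reroute this single step; the next step starts cheap again.
            results.append(("strong", strong(step)))
    return results


# Toy stand-ins for the two models (no network calls).
scores = {"extract entity": 0.94, "cross-reference": 0.91, "derive year": 0.41}
cheap = lambda s: (f"cheap:{s}", scores[s])
strong = lambda s: f"strong:{s}"

routes = [who for who, _ in run_chain(list(scores), cheap, strong)]
```

Only the low-scoring step pays frontier-model prices; every other step stays on the cheap model.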

Swap your endpoint URL. That's the install.

- base_url = "https://api.openai.com/v1"
+ base_url = "https://rvby.dev/v1"
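In application code the swap usually comes down to one configurable string. The sketch below assumes the proxy mirrors the OpenAI-style `/chat/completions` path under `/v1`, and the `LLM_BASE_URL` environment variable is a hypothetical name chosen for illustration.

```python
import os

DEFAULT_BASE_URL = "https://rvby.dev/v1"


def chat_completions_url(base_url: str = "") -> str:
    """Build the chat completions endpoint from a configurable base URL."""
    # LLM_BASE_URL is a hypothetical variable name, not a documented setting.
    base = base_url or os.environ.get("LLM_BASE_URL", DEFAULT_BASE_URL)
    return base.rstrip("/") + "/chat/completions"
```

Keeping the base URL in configuration means switching back to the original provider is also a one-line change.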

05 the math

a 20-step session,
from $0.30 to $0.05.
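A quick check of what the headline figures imply, using only the two session costs and the step count quoted on this page. It does not model per-token prices or how many steps get rerouted.

```python
steps = 20
baseline_cost = 0.30  # dollars per session, as quoted
routed_cost = 0.05    # dollars per session, as quoted

reduction = 1 - routed_cost / baseline_cost
per_step_before = baseline_cost / steps
per_step_after = routed_cost / steps

print(f"{reduction:.0%} cheaper")  # prints "83% cheaper"
```

That works out to roughly $0.015 per step before and $0.0025 per step after, averaged over the session.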

probe latency / cpu: under 1 ms

probe parameters: 1M

ceo // brain × ai

Livia.

Biomedical engineering, AI-assisted research. Built ML systems where multi-step reasoning has to stay grounded — there isn't room for confident drift in clinical contexts. Now bringing that constraint to agent infrastructure.

cto // systems

Opemipo.

LLM workflow tools, research-to-production. Likes niche open-weight models and the kind of inference plumbing nobody wants to touch. Wrote the probe and the proxy.


catch the drift before the token.
