Demonstration Material · A walkthrough of a real Aurora run on synthetic data.
Layer 01 · Workflow Intelligence
See it Work · sample run · synthetic data

Discover.
Weeks of context curation now automated and repeatable.

Every snippet, diagram, quote, and number on this page comes from an actual run of the Aurora L01 Workflow Intelligence automation against a synthetic Keystone Analytics engagement. Layer 01 consolidates context the organization already holds — operator notes, transcripts, Slack threads, architecture diagrams, process documentation — together with focused interviews conducted by Northbeam. It synthesizes the true current state across both process and architecture, surfaces operational deviations that live only in tribal knowledge, names every gap on the record, and produces a signed Future-State Blueprint with a ranked AI Opportunity Register and safety ratings. Once it starts, it isn't a team pulling information together. It runs.

Hours · From inputs to blueprint
7 · AI opportunities ranked
$3.8M · ARR exposure surfaced
100% · Claims with citations

Discovery doesn't fail because there's no information.
It fails because no one reconciles what's already there.

Every organization already holds the answer in scattered form. The work isn't gathering — it's triangulating, surfacing contradictions, and converting evidence into a signed commitment about what to build.

The diagram doesn't match the system.

The architecture doc says one thing; the production code says another. The gap is six months old and nobody noticed because nobody opens the dashboard anymore.

The "official" workflow isn't the real workflow.

The CS team has been operating from a personal Notion page, maintained by an analytics engineer and shared over Slack DM. Eight weeks of cadence. Zero documentation. This is where decisions actually get made.

Every team has a different number.

The CRO says churn is 11.4%. CS Ops says 14.1%. Both are technically defensible. The board memo is due in two weeks. Nobody is positioned to give the CEO the "real" number.

AI opportunities get assessed last — and emotionally.

By the time someone asks "is this safe to automate?", the deck is half-built and the budget is committed. Safety becomes a constraint to argue around, not a rating that gates work.

Drop what you already have.

No new fieldwork. No new interviews to schedule. We work from the context the organization has already produced.

L01 / 01
Operator notes from the engineer who knows where the bodies are buried
Riya Singh · analytics eng
L01 / 02
An interview transcript with the head of the team being studied
Tom Whitfield · head of data
L01 / 03
The "canonical" process documentation as currently written
Customer Health Ops v2.1
L01 / 04
The architecture diagram (we read the diagram, not just the code)
SVG · vision-extracted
L01 / 05
A two-week Slack window from the channel the work actually happens in
#data-customer-health

Sample run shown above. Production engagements accept whatever the organization already produces — Loom recordings, Granola transcripts, Confluence pages, Linear tickets, Gong calls, Box folders, support ticket exports. The methodology is format-agnostic.

If the documentation has gaps: where existing context is thin, stale, or doesn't reflect the true operational state, Northbeam Solutions works directly with the customer to bring the right content into the picture through structured context capture, process mapping sessions, and targeted operator interviews. L01 still runs; the inputs just include what we produce together.

The full operational surface — mapped in one pass.

L01 doesn't stop at the dashboard. The Keystone run surfaced thirteen systems across four operational layers — sources, models, delivery, supporting — and located every one of them in the documented vs. discovered map. The downstream charter (L02) and build (L03) inherit this exact system surface.

Sources
  • Salesforce
  • Gainsight
  • Segment
  • Zendesk
Models
  • Snowflake
  • dbt Cloud · 340 models
  • marts.customer_health_v2
Delivery
  • Hex dashboard · abandoned
  • Hightouch → Gainsight + Slack
  • Gainsight writeback
  • Metabase embeds
Supporting
  • Datadog
  • Linear · data-eng queue
  • Notion · ops docs + shadow
  • Slack · #data-customer-health

Every system appears in the L01 dossier with: documented state, discovered state, ownership, last-modified timestamp where knowable, and inclusion/exclusion verdict for the workstream register. This is the surface area L02 acceptance criteria reference and L03 integration proofs cite.
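A minimal sketch of what one dossier entry could look like, assuming a simple record per system. The class name, field names, and the example values for the documented state and verdict are illustrative assumptions, not the actual Aurora schema; only the field list and the "abandoned" status come from the run described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    INCLUDED = "included"
    EXCLUDED = "excluded"


@dataclass(frozen=True)
class SystemEntry:
    """One system in the documented vs. discovered map (hypothetical shape)."""
    name: str
    layer: str                    # "sources", "models", "delivery", or "supporting"
    documented_state: str
    discovered_state: str
    owner: Optional[str]          # None when ownership is unknown
    last_modified: Optional[str]  # None when the timestamp is not knowable
    verdict: Verdict

    def has_drift(self) -> bool:
        """True when the documented state differs from the discovered state."""
        return self.documented_state != self.discovered_state


# Placeholder values: the documented state and verdict are assumed for illustration.
hex_dashboard = SystemEntry(
    name="Hex dashboard",
    layer="delivery",
    documented_state="documented delivery path",
    discovered_state="abandoned",
    owner=None,
    last_modified=None,
    verdict=Verdict.INCLUDED,
)

print(hex_dashboard.has_drift())
```

Keeping `owner` and `last_modified` optional mirrors the "where knowable" caveat: an unknown value is recorded as unknown rather than guessed.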

The raw context, before synthesis.

Three real snippets from the Keystone inputs. Messy. Partial. Each one captures information that exists in the organization but isn't in any single document.

Operator notes · Riya Singh
"Reality: marts.customer_health_v2 was last modified 8 months ago. It uses a Salesforce account-id mapping that broke in November when Diego restructured CRM ownership. Three of its five inputs return nulls now. Nobody's caught it because nobody actually opens the Hex dashboard anymore."
~/Notion/personal-log · May 27
Interview · Tom Whitfield
"[long pause] ... I don't know. That should have been an automatic regression test. [Interviewer: Riya is saying three of five inputs are nulls.] [sits back] ... If that's true, the dashboard has been wrong for six months. Why didn't this come up in standup?"
interview transcript · Q6-Q8
Slack DM · Devon → Riya
"Sarah's going to ask me at 11 for the actual at-risk Q2-renewal-cohort number. The Hex dashboard says 47 accounts in yellow+ but I know the real number is lower because at least 6 of those are the broken mapping. Can you re-run yours?"
#data-customer-health · May 29 · 8:42 AM

Three different sources. Three different vocabularies. Same underlying problem. Without L01, this stays scattered. A team would spend weeks reconciling these and still leave dependencies tribal. Aurora L01 reads all of them in one pass, triangulates the claims, names the gap on the record, and produces a workstream ranked against value.

A signed blueprint —
with every claim citable to its source.

A single integrated artifact with four parts: workflow + architecture maps, AI opportunity register with safety ratings, value model with confidence intervals, and a dependency graph. Every assertion carries a citation. Two signatures lock it.

Sample — Cross-source claim
# A single claim, triangulated across two sources
claim_id: C-014
statement: "Scoring model has been broken since Nov 2025 CRM restructure"
evidence:
  - source: slack_thread
    author: "Riya Singh"
    quote: "ACCT-447 has been firing false-positive for months"
    confidence: 1.0
  - source: interview_transcript
    line: 32
    quote: "if that's true, the dashboard has been wrong for six months"
    confidence: 1.0
surfaces_as_gap: G2
triangulator_note: "Two independent sources, same conclusion"
Sample — Workstream entry with value model
# A ranked opportunity with bounded value claims
workstream_id: ws-005
title: "Repair scoring model · backfill"
ai_suitability: green
value_model:
  cost_avoided:
    range: [25000, 100000]
    basis: "rework + escalation cycles / yr"
  productivity:
    hours_saved: [100, 500]
  revenue_protected:
    arr_at_risk: [500000, 2000000]
    cohort_exposure: 3800000
composite_rank: 6.075
wave_assignment: 1
Sample — Workflow + architecture, merged into one render
SOURCES
  • Salesforce · acct-id mapping
MODEL
  • customer_health_v2 · ⚠ 3 of 5 inputs NULL
DELIVERY (documented)
  • Hex Dashboard · ≤6 viewers / 90d
  • Hightouch alerts · ~30% false-positive
DELIVERY (shadow)
  • Riya's Notion · 8wk DM cadence
CONSUMERS
  • CRO · churn 11.4%
  • VP CS · churn 14.1%
  • CFO / Board · needs one number
Legend: ⚠ broken · documented path · shadow path (undocumented) · broken connection

Workflow and architecture, rendered together. Documented connections in cyan. The shadow path the CS team actually trusts in violet. The broken connection in red. All three live in one render so the engagement starts with the same picture in every head.

Every AI opportunity gets a safety rating —
before specification starts.

The methodology refuses to defer this question. Three ratings. Clear meaning. Operational consequences in every layer that follows.

Green · Ready for automation

The decision is bounded, evidence is rich, the wrong answer is recoverable. AI can act with logging. Examples in the Keystone run: scoring model repair, canonical definition build, board memo synthesis.

Yellow · AI proposes, human confirms

The decision touches a customer or a record of consequence. AI prepares the action; a named human signs off before it fires. Examples: driver attribution, CSM playbook dispatch, process documentation updates.

Red · Refused at L01

The risk surface is wider than the value justifies, or the decision should remain human entirely. The methodology refuses to advance these to specification. They don't show up in L02.

In the Keystone run, seven opportunities surfaced and all seven cleared to specification with appropriate ratings. One further candidate ("fully-autonomous customer outreach calls") was reviewed and refused at L01 before reaching the charter. The refusal is on the record.
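The gating rule above can be sketched in a few lines. This is an illustration only: the `Rating`, `Opportunity`, and `gate` names are assumptions, and the ratings assigned in the example follow the Green, Yellow, and Red examples given in this section.

```python
from dataclasses import dataclass
from enum import Enum


class Rating(Enum):
    GREEN = "green"    # AI can act with logging
    YELLOW = "yellow"  # AI proposes, a named human confirms
    RED = "red"        # refused at L01


@dataclass(frozen=True)
class Opportunity:
    title: str
    rating: Rating


def gate(opportunities):
    """Split candidates into those advancing to L02 and a refusal log."""
    advancing = [o for o in opportunities if o.rating is not Rating.RED]
    refusal_log = [o.title for o in opportunities if o.rating is Rating.RED]
    return advancing, refusal_log


candidates = [
    Opportunity("Scoring model repair", Rating.GREEN),
    Opportunity("CSM playbook dispatch", Rating.YELLOW),
    Opportunity("Fully-autonomous customer outreach calls", Rating.RED),
]

advancing, refusal_log = gate(candidates)
print([o.title for o in advancing])
print(refusal_log)
```

Red candidates are not dropped silently. They move to a refusal log, which keeps the refusal on the record while guaranteeing they never appear in the list handed to L02.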

Eight specialists.
Each enforcing a discipline the others can't override.

01

Ingest

Heterogeneous handlers normalize every input — markdown, transcripts, diagrams (vision-extracted), structured exports — into evidence-bearing claim units. No source goes in without provenance.

02

Extract

Every claim is paired with its source quote, line number or thread reference, extraction confidence, and the agent that surfaced it. The audit trail is built before the first conclusion is drawn.

03

Triangulate

Single-source claims are flagged. Cross-source claims are confirmed or contradicted. The Triangulator must attempt to corroborate a single-source claim, and fail to find a second source, before that claim is allowed to influence ranking.
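A minimal sketch of the triangulation rule, assuming each piece of evidence records which claim it bears on, which source it came from, and whether it supports or disputes the claim. The `Evidence` shape, the status labels, and claim `C-900` are hypothetical; `C-014` and its two source types come from the sample claim on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    claim_id: str
    source: str     # e.g. "slack_thread", "interview_transcript"
    supports: bool  # False when the source disputes the claim


def triangulate(evidence):
    """Label each claim as confirmed, contradicted, or single_source."""
    by_claim = defaultdict(list)
    for item in evidence:
        by_claim[item.claim_id].append(item)

    status = {}
    for claim_id, items in by_claim.items():
        sources = {item.source for item in items}
        if any(not item.supports for item in items):
            status[claim_id] = "contradicted"
        elif len(sources) >= 2:
            status[claim_id] = "confirmed"
        else:
            # Flagged: only one source was found after attempting corroboration.
            status[claim_id] = "single_source"
    return status


evidence = [
    Evidence("C-014", "slack_thread", True),
    Evidence("C-014", "interview_transcript", True),
    Evidence("C-900", "operator_notes", True),  # hypothetical uncorroborated claim
]

status = triangulate(evidence)
print(status)
```

Counting distinct sources rather than distinct quotes is the design choice that matters here: two quotes from the same Slack thread would still leave a claim flagged as single-source.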

04

Gap Analysis

Documented state versus discovered state. Process doc says X; production system does Y. Every delta gets a gap ID, evidence, and an inclusion-or-exclusion decision for the workstream register.
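The documented-versus-discovered comparison amounts to a keyed diff. The sketch below assumes both states are available as simple mappings. The `find_gaps` name, the `GAP-001` numbering scheme, and the documented-state strings are illustrative assumptions; the discovered states echo the Keystone findings.

```python
def find_gaps(documented, discovered):
    """Return one gap record per subject whose two states differ."""
    gaps = []
    for subject in sorted(documented.keys() | discovered.keys()):
        doc_state = documented.get(subject)     # None: not documented at all
        found_state = discovered.get(subject)   # None: documented but not found
        if doc_state != found_state:
            gaps.append({
                "gap_id": f"GAP-{len(gaps) + 1:03d}",  # hypothetical ID scheme
                "subject": subject,
                "documented": doc_state,
                "discovered": found_state,
                "decision": None,  # inclusion or exclusion is decided later
            })
    return gaps


# Documented values are placeholders chosen for illustration.
documented = {
    "Hex dashboard": "active delivery path",
    "marts.customer_health_v2": "5 of 5 inputs populated",
}
discovered = {
    "Hex dashboard": "abandoned",
    "marts.customer_health_v2": "3 of 5 inputs NULL",
    "Notion shadow page": "8-week DM cadence",
}

gaps = find_gaps(documented, discovered)
for gap in gaps:
    print(gap["gap_id"], gap["subject"])
```

Taking the union of both key sets means an undocumented shadow path surfaces as a gap in the same way a stale document does.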

05

Rank

Every candidate workstream scored on cost avoided, productivity gained, revenue protected — with confidence intervals tied to evidence. Plus AI suitability (Green / Yellow / Red), feasibility, and strategic fit.
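One way to picture the ranking step is a weighted composite over component scores. The weights, the 0 to 10 score scale, and the workstream IDs below are all hypothetical; the page does not publish the formula behind `composite_rank`, so treat this as a sketch of the general idea rather than Aurora's actual scoring.

```python
# Hypothetical weights; the real weighting is not stated in the source.
WEIGHTS = {"value": 0.4, "feasibility": 0.3, "strategic_fit": 0.3}


def composite(scores):
    """Weighted sum of component scores, each assumed to be on a 0-10 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def rank(workstreams):
    """Order eligible workstreams by composite score, highest first."""
    eligible = [w for w in workstreams if w["ai_suitability"] != "red"]
    return sorted(eligible, key=lambda w: composite(w["scores"]), reverse=True)


workstreams = [
    {"id": "ws-a", "ai_suitability": "green",
     "scores": {"value": 8, "feasibility": 6, "strategic_fit": 5}},
    {"id": "ws-b", "ai_suitability": "yellow",
     "scores": {"value": 9, "feasibility": 7, "strategic_fit": 8}},
    {"id": "ws-c", "ai_suitability": "red",
     "scores": {"value": 10, "feasibility": 10, "strategic_fit": 10}},
]

ranked = rank(workstreams)
print([w["id"] for w in ranked])
```

The red-rated entry is filtered out before sorting, so a high score can never pull a refused opportunity back into the register.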

06

Boundary Discipline

The Boundary Sentry names what's not in scope and why. Refused AI opportunities get their refusal logged. This is the audit trail your security and audit teams will want when the program ships.

07

Assemble

Workflow map, architecture map, opportunity register, value model, dependency graph — assembled into one render. The same picture in every stakeholder's head.

08

Pre-Signoff Audit

Citation integrity verified before the artifact reaches signers. Broken citations don't make it to the boardroom. Two signatures required — one operational, one executive — to advance.
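A minimal sketch of the pre-signoff check, assuming citation integrity means that every quoted string can be found verbatim in the source it cites. The function names, the dictionary shapes, and the signature role labels are assumptions made for illustration; the quote is taken from the interview excerpt on this page.

```python
REQUIRED_SIGNATURES = {"operational", "executive"}


def broken_citations(claims, sources):
    """Return the claim_ids whose quote is missing from the cited source text."""
    broken = []
    for claim in claims:
        text = sources.get(claim["source"], "")
        if not claim["quote"] or claim["quote"] not in text:
            broken.append(claim["claim_id"])
    return broken


def may_advance(claims, sources, signatures):
    """Advance only with intact citations and both required signatures."""
    if broken_citations(claims, sources):
        return False
    return REQUIRED_SIGNATURES <= set(signatures)


sources = {
    "interview_transcript": (
        "If that's true, the dashboard has been wrong for six months. "
        "Why didn't this come up in standup?"
    ),
}
claims = [
    {"claim_id": "C-014", "source": "interview_transcript",
     "quote": "the dashboard has been wrong for six months"},
]

print(broken_citations(claims, sources))
print(may_advance(claims, sources, ["operational"]))
print(may_advance(claims, sources, ["operational", "executive"]))
```

The citation check runs before signatures are even considered, which matches the ordering described above: a broken citation blocks the artifact regardless of who has signed.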

Weeks of discovery
now automated and repeatable.

Weeks · Of consolidation work
Hours · Automated synthesis, repeatable

The compression isn't from cutting rigor — it's from removing the steps where humans were doing what evidence triangulation can do: cross-reference, contradiction-surface, citation-track, gap-name. The judgment calls stay in human hands. The wiring underneath runs automatically. Once the context is dropped, the synthesis isn't a project plan — it's an agent run that produces a signed artifact at the end.

Time to gather inputs is a separate task. Once inputs are in hand, Aurora L01's runtime is measured in hours, not weeks. The same agent run against the same inputs is reproducible.

L02 takes the signed blueprint
and turns it into a binding spec.

See how the methodology turns "this is what we discovered" into "this is what we will build — and what we refuse to."
