Data Governance for Healthcare Data Product Teams in the Age of GenAI, Analysis, and Agents
A practitioner's guide to governed objects, executable data contracts, stewardship that ships, and why catalogs won't save you in the agent era.
The question behind the question
Someone asked me last month for "a data governance framework." What they wanted was a 40-page PDF they could hand to their CEO/CFO. What they actually needed was a way to make three decisions on Monday morning without the framework getting in the way.
That gap, between what governance documents promise and what governance decisions require, is where most healthcare data product teams are stuck right now. And GenAI is making it worse, not better, because agents hallucinate fluently when the underlying data is ambiguous.
This post is my working answer. Fair warning: I'm in the middle of this problem, not above it. If you find yourself nodding at the failure modes, welcome to the club.
Why healthcare data product teams get stuck
There are three things that make governance hard for healthcare data product teams specifically (and I'm oversimplifying, as usual).
First, the dual-customer problem. You're probably building for clinicians, analysts, and payers at the same time. A clinician wants an encounter record to mean the clinical event (when the clinician saw the patient). An analyst wants the same record to mean the billing event (the thing that generated a claim). A payer wants it to mean whatever produces the metric they report. None of them are wrong. But your data product has one encounter table, and if you don't pick a definition and enforce it, every downstream consumer builds their own shadow semantics. That's not a data quality problem. That's a governance scope problem.
Second, legacy thinking treats data as inventory. The old governance playbook says "catalog everything, label owners, run a committee." That works if your product is a warehouse full of reports. It breaks the moment your product is shipping queries, dashboards, or AI agents that have to be correct on a specific schedule. Dylan Anderson's been banging this drum on his Substack for a while (Issue #15 names the root causes, and "Business-Related Process Problems" plus "Underinvestment in Data Governance" are the two I see most often in healthcare). Inventory thinking can't handle product velocity.
Third, GenAI amplifies the failure mode. Agents are fluent, which is new. When you hand an agent ambiguous data, it doesn't throw an error. It generates a confident answer, cites your table, and moves on. A junior analyst who didn't know the difference between the clinical encounter and the billing encounter would at least hesitate. An agent won't. And the business will trust the agent more than the analyst, because the agent is faster.
If that doesn't scare you a little, I don't think we're reading the same industry.
You can't govern AI without governing data
This is the part I want to say loud and clear.
Over the last year I've watched teams stand up AI governance programs (model cards, bias audits, human-in-the-loop committees, the whole package) on top of data foundations that don't even have documented encounter definitions. The AI governance layer looks beautiful in a board deck. It fails the first time a regulator asks "why did your system say X?"
Dylan Anderson calls this the bolt-on problem in Issue #54, and it's the most useful frame I've seen for where most teams are right now. He argues that business model, data, and AI should be refactored together. Data is the active bridge between the business question and the model output, not a separate compliance domain. His earlier Issue #50 makes the same point more plainly: data governance to AI governance is a progression, not a replacement. You don't get to skip the first layer because you think AI is new and different.
In healthcare, the cost of skipping the first layer isn't a missed KPI. A wrong encounter record in a clinical data product is a patient safety incident waiting for a lawyer. HIPAA cares. The FDA cares (especially if you're anywhere near SaMD territory, Software as a Medical Device). Colorado and California have AI transparency laws on the books, and Texas, New York, and half a dozen other states are drafting their own. All of them converge on the same question: can you explain, audit, and reproduce what your system did? If you can't answer yes, the AI governance frameworks don't help you. They just make your compliance binder heavier.
So: ground floor first. Then the elevator.
Four moves a data product team can make this quarter
Here's the working framework I've been using. Frame this as a product operating model, not a corporate governance program. Four moves. All four are something your team can do this quarter without hiring a Chief Data Officer.
Move 1: Define the governed object
Not "our data." Specific entities your product promises are true.
In a healthcare data product, these usually look like: Patient, Encounter, Medication Order, Lab Result, Claim, Facility, Provider. Pick the 5-10 your product ships against. Write down what each one means in one paragraph. Business definition, not SQL definition. If you have two teams that would define Encounter differently, you don't have an Encounter entity. You have a scope problem. Fix that first.
Everything else (the 400 tables in your warehouse that nobody queries) is raw. That's fine. Raw data is allowed to be messy. Governed objects aren't.
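To make "governed object" concrete, here's a minimal sketch of what a registry entry might look like in code. The field names, the `Encounter` definition, and the steward handle are all illustrative, not from any particular tool; the point is that a governed object is a named artifact with a prose definition and a human attached, and anything outside the registry is explicitly raw.

```python
from dataclasses import dataclass

# Hypothetical registry entry for a governed object. All names and
# fields here are illustrative, not from a specific catalog tool.
@dataclass(frozen=True)
class GovernedObject:
    name: str        # entity name, e.g. "Encounter"
    definition: str  # one-paragraph business definition, not SQL
    steward: str     # a named human, never a team alias

REGISTRY = {
    "Encounter": GovernedObject(
        name="Encounter",
        definition=(
            "A billed clinical visit: the event that generated a claim, "
            "not the bedside interaction itself."
        ),
        steward="jane.doe",
    ),
}

def lookup(entity: str) -> GovernedObject:
    """Fail loudly if a consumer asks about an entity we never governed."""
    if entity not in REGISTRY:
        raise KeyError(f"{entity} is not a governed object; treat it as raw")
    return REGISTRY[entity]
```

The deliberate choice here is the loud failure: the registry doesn't silently fall back to raw tables, which is exactly the shadow-semantics problem Move 1 is trying to kill.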
Move 2: Write executable data contracts for those objects
This is where the governance-as-PDF approach gives up and the governance-as-code approach takes over.
A data contract is a formal, version-controlled agreement between the team producing a governed object and the teams consuming it. The contract specifies schema (columns, types, nulls allowed), freshness (how stale can this get before it's wrong), completeness (which columns must be non-null), cardinality (how many rows per day is normal), and business rules (an Encounter effective date cannot precede the patient's BirthDate, which is an actual rule I've shipped, not a theoretical one).
The critical word is executable. If your contract lives in a Confluence page, it's not a contract. It's a wish. Contracts fail pipelines. They fire alerts. They block deployments. They look like dbt tests, Great Expectations assertions, or dbt contracts. Pick the tool your team already uses and commit. Dylan Anderson's Issue #49 walks through implementation principles if you want a longer version of this argument.
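To show what "executable" means, here's a minimal sketch in plain Python standing in for what you'd normally express as dbt tests or Great Expectations assertions. The column names are assumptions; the birth-date rule mirrors the one mentioned above. The key behavior is the last function: a contract break raises and stops the pipeline, it doesn't just color a dashboard tile.

```python
from datetime import date

def check_encounter_row(row: dict) -> list[str]:
    """Return the contract violations for one Encounter record."""
    violations = []
    # Schema/completeness: these columns must be present and non-null.
    for col in ("encounter_id", "patient_id", "effective_date", "birth_date"):
        if row.get(col) is None:
            violations.append(f"{col} is null")
    # Business rule: an encounter cannot precede the patient's birth.
    if row.get("effective_date") and row.get("birth_date"):
        if row["effective_date"] < row["birth_date"]:
            violations.append("effective_date precedes birth_date")
    return violations

def enforce(rows: list[dict]) -> None:
    """Contract behavior: breaks fail the pipeline, not a report."""
    broken = [(r.get("encounter_id"), v)
              for r in rows if (v := check_encounter_row(r))]
    if broken:
        raise ValueError(f"contract violations: {broken}")
```

In practice you'd express the same rules in whatever your pipeline already runs (dbt, Great Expectations), but the shape is the same: named rules, deterministic checks, hard failure.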
Move 3: Name a steward for each object with a weekly job
This is the one that kills most governance programs. Teams assign "data owners" on an org chart, the org chart gets outdated, nobody notices.
Do something different. For each governed object, name a steward (ideally the subject-matter expert who'd be called when someone asks "what does this field actually mean?") and give them a recurring 30-minute weekly job. Pull the contract breaks from the last seven days and figure out which are real. Check the quality drift metrics on the 3-4 fields that matter. If any new business logic showed up during the week, update the rules file. That's the job. Thirty minutes. Standing on the calendar.
The weekly job matters more than the title. A Data Steward with no recurring work is a ceremonial title. A Data Steward with 30 minutes every Wednesday is a governance program.
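The weekly pull itself can be one small query. Here's a sketch of the triage step, assuming a break log with a timestamp and a rule name per entry (the log schema is mine, not any tool's): group the last seven days of contract breaks by rule, so the steward triages rules rather than scrolling rows.

```python
from collections import Counter
from datetime import datetime, timedelta

# Sketch of the steward's weekly pull: count contract breaks per rule
# over the trailing seven days. The log format is an assumption.
def weekly_triage(break_log: list[dict], now: datetime) -> Counter:
    """Summarize last week's contract breaks by rule name."""
    cutoff = now - timedelta(days=7)
    recent = (b for b in break_log if b["at"] >= cutoff)
    return Counter(b["rule"] for b in recent)
```

Thirty minutes is enough when the steward starts from "which rules fired, how often" instead of a raw failure feed.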
Move 4: Instrument your agents against the contracts
This is the move that doesn't show up in Dylan Anderson's playbook, because it's the newest layer. It's also where healthcare data product teams have the most room to differentiate right now.
When an AI agent pulls from a governed object, log three things: the contract version it read against, the data freshness at query time, and a reproducibility hash of the inputs used to generate the output. Store those alongside the agent's response. If a clinician (or a regulator) asks three months later, "why did the system say this," you can answer in under an hour instead of in three days. That's the healthcare governance audit trail. And it's what connects Moves 1-3 to AI governance in a way that holds up under scrutiny.
This is also where data lineage tools and MCP-based data product interfaces are starting to show up. The specific tool matters less than the commitment to instrumenting every agent invocation against a named, versioned contract. Do that, and your AI governance program stops being paperwork and starts being code.
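Here's a sketch of what that audit record might look like. The field names are mine, and the contract-version string is a made-up convention; the substance is the three logged facts from above: contract version, freshness, and a deterministic hash of the inputs.

```python
import hashlib
import json
from datetime import datetime, timezone

def input_hash(rows: list[dict]) -> str:
    """Deterministic hash of the inputs that produced an agent answer."""
    canonical = json.dumps(rows, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()

def audit_record(contract_version: str, last_refresh: datetime,
                 rows: list[dict], response: str) -> dict:
    """One log entry per agent invocation, stored next to the response."""
    return {
        "contract_version": contract_version,    # e.g. "encounter@2.3.1"
        "data_as_of": last_refresh.isoformat(),  # freshness at query time
        "input_hash": input_hash(rows),          # reproducibility anchor
        "response": response,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

The hash is the piece teams skip and regret: it's what lets you prove, later, that the same inputs produce the same answer (or that they no longer can).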
What GenAI and agents actually change
Most of what I've described is just good data engineering dressed up in governance clothes. The part that's new is what GenAI introduces on top.
Pre-GenAI, governance was about humans trusting data. Post-GenAI, it's about agents and humans trusting data, and humans trusting agents. That third layer introduces three failure modes I hadn't seen before, and healthcare teams need to plan around all of them.
Synthesis drift. An agent combines two governed sources in a way neither contract individually covers. Example: an agent joins Patient Demographics with Lab Result to produce a cohort summary and silently averages across populations with incompatible clinical baselines (different age groups, different comorbidity profiles, different baseline risk). Each source has its own contract. The synthesis doesn't. The FAVES framework from ONC (Fair, Appropriate, Valid, Effective, Safe) is a useful lens here, because it forces you to evaluate the output not just the inputs.
Context collapse. The agent loses provenance between upstream data and downstream claim. A clinician sees a recommendation. They can't trace it back to the data contract version that fed the model. By the time they ask "why this answer," someone has retrained the model, the underlying table has refreshed twice, and the agent has forgotten its own workflow. No audit. No reproduction.
Silent retraining. A model gets fine-tuned on ungoverned data somewhere upstream (a sandbox environment, a one-off analytics request, someone's exported CSV). The fine-tuning survives into production. Future outputs are now poisoned in ways nobody logged. NIST's AI Risk Management Framework and the WHO's guidance on large multi-modal models both call this out, and integrity checksums plus signed model packages are the current state of the art. Very few healthcare orgs are doing it.
The fix for all three isn't more process. It's executable governance artifacts the agent can read. Contracts the agent checks before querying. Observability the agent emits during querying. Lineage the agent writes after querying. If your governance program produces PDFs and not logs, your agents can't use it.
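The "contracts the agent checks before querying" piece can be a small pre-flight gate. Here's a sketch under assumed contract metadata (the `max_staleness_hours` field and the version string are illustrative): before touching a governed object, the agent verifies a contract exists and the data is fresh enough, or it refuses to answer.

```python
from datetime import datetime, timedelta

# Hypothetical contract metadata an agent can read at runtime.
CONTRACTS = {
    "encounter": {"version": "2.3.1", "max_staleness_hours": 24},
}

def preflight(entity: str, last_refresh: datetime, now: datetime) -> str:
    """Return the contract version to log, or refuse to query."""
    contract = CONTRACTS.get(entity)
    if contract is None:
        raise PermissionError(f"{entity} has no contract; do not query it")
    staleness = now - last_refresh
    if staleness > timedelta(hours=contract["max_staleness_hours"]):
        raise RuntimeError(f"{entity} data is {staleness} stale; refusing")
    return contract["version"]
```

An agent that refuses to answer on stale or ungoverned data is annoying once a quarter and invaluable the one time it matters.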
The practitioner's 5-question test
Here's what I ask myself (and what you should ask your team) before shipping anything AI-adjacent in a healthcare data product:
- Can you name the 5-10 governed objects your product promises? Not 200. A product with 200 governed objects has zero governed objects.
- Do those objects have executable contracts that fail the pipeline when they break? If the answer is "we have documentation," that's a no.
- Is there a named human whose weekly job (not org chart title) is those contracts? If the answer is "our data team" without an individual name, that's a no.
- When an AI agent uses that data, can you reproduce the output on demand? Contract version, data freshness, input hash. All three. If any are missing, that's a no.
- If a regulator asks "why did your system say X," can you answer in under an hour? Not under a week. Under an hour. With evidence.
If you can't answer yes to all five, you don't have a data governance problem. You have a product shipping risk wearing a governance costume.
What I'm still figuring out
A few things I don't have clean answers to, and would love to hear from folks who do.
How do you handle contract versioning when the clinical definition of an entity changes? It will change. SNOMED updates. ICD transitions. New regulatory definitions of "encounter." I have an opinion on this but not conviction yet.
Who should own the AI governance audit trail in a product org? Data engineering, ML engineering, or product? I've seen all three work and all three fail. My current bet is product, but I don't have a clean reason why.
At what stage of a data product's lifecycle does the cost of building out stewardship pay for itself? I suspect it's earlier than most teams think, but I can't prove it yet. If you've measured this, I want to see your numbers.
If you're working on any of this, drop me a note. I'd rather be in a conversation than publishing a framework.
Written from the practitioner chair, not the consultant seat. Citations are to Dylan Anderson's Data Ecosystem Substack, the foundational thinking I'm building on. Any bad takes are mine.