
Why? Because PI is a socio-technical problem. The provider-payer claims and payment workflow sits at the intersection of clinical nuance, contractual detail, regulatory rules, and human judgment.
Models can identify and rank issues, but only experts can determine why a flagged claim is wrong, how to correct it, and what upstream changes are needed to prevent recurrence.
The organizations making real gains are designing AI + HI (AI with Human Intelligence), with experts in the loop, on purpose.
Improper payments remain material. CMS’s CERT data put the 2024 Medicare FFS improper payment rate at 7.66% ($31.7B), a flat trend line that underscores how complex and challenging this problem is, even with decades of edits and audits.
Across the federal government, GAO reports $162B in estimated improper payments for FY2024, the lion's share of them overpayments, keeping payment integrity in the policy spotlight.
At the same time, appeals data show a trust gap in utilization management. In Medicare Advantage, only ~10–12% of denials are appealed, yet over 80% of appealed decisions are overturned, a signal that first decisions often lack sufficient context or explanation.
If we want audits providers accept—and denials that don’t boomerang—our systems must explain and withstand scrutiny.
Think of AI + HI as co-reasoning: models surface patterns; humans validate, enrich, and operationalize. Done right, you get four compounding advantages:
Every model output should include the reason for the decision: policy citation, contract clause, code edit, effective dates, and evidence trails. This is not paperwork; it’s the substrate of trust—and the most effective antidote to abrasion. It also answers the regulator’s question of “how does it work and when is it appropriate?”
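As a minimal sketch (field names here are illustrative, not a standard schema), a provenance record attached to each finding might look like this:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ProvenanceRecord:
    """Illustrative provenance attached to one PI finding; field names are assumptions."""
    finding_id: str
    policy_citation: str          # e.g., medical policy or LCD/NCD reference
    contract_clause: str          # contract language the finding relies on
    code_edit: str                # e.g., an NCCI PTP or MUE edit identifier
    policy_effective_date: date   # effective date of the cited policy version
    evidence: list[str] = field(default_factory=list)  # claim lines, notes, attachments

example = ProvenanceRecord(
    finding_id="F-001",
    policy_citation="Medical Policy MP-123 v4",
    contract_clause="Section 7.2 (modifier billing)",
    code_edit="NCCI PTP 99213/97140",
    policy_effective_date=date(2024, 1, 1),
    evidence=["claim line 3", "op note 2024-02-14"],
)
```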
These three steps, at a minimum, require human review and expert guidance:
This “graduated oversight” keeps throughput high while ensuring hard calls get expert eyes. Reviews in informatics and bioethics literature argue explicitly for this calibrated human role.
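One way graduated oversight can be expressed in code is a simple routing rule on model confidence and dollar impact. The thresholds below are illustrative assumptions, not recommendations; tune them to your own precision and risk data.

```python
def route_finding(confidence: float, dollar_impact: float) -> str:
    """Route a model finding to the appropriate level of review (thresholds illustrative)."""
    if confidence >= 0.95 and dollar_impact < 1_000:
        return "auto-queue"        # low-risk, high-confidence: proceed with spot checks
    if confidence >= 0.80:
        return "analyst-review"    # medium: a PI analyst validates before action
    return "expert-review"         # hard calls: clinical or contract expert decides

print(route_finding(0.97, 250))    # auto-queue
print(route_finding(0.85, 5_000))  # analyst-review
print(route_finding(0.60, 5_000))  # expert-review
```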
Ship continuous learning loops: capture reviewer dispositions (“agree,” “agree with changes,” “reject”), capture why, and feed back into model retraining. Retire weak features; elevate strong ones. Publish drift dashboards so everyone—medical policy, SIU, finance—sees whether precision is improving.
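A minimal sketch of that loop, assuming a simple disposition log and a rolling precision metric for the drift dashboard (the data and field names are illustrative):

```python
from collections import Counter

# Reviewer dispositions captured for each finding (illustrative data).
dispositions = [
    {"finding_id": "F-001", "disposition": "agree", "reason": "policy citation correct"},
    {"finding_id": "F-002", "disposition": "agree_with_changes", "reason": "wrong contract version"},
    {"finding_id": "F-003", "disposition": "reject", "reason": "documentation supports billing"},
]

def precision(records: list[dict]) -> float:
    """Share of findings reviewers upheld (fully or with changes): one drift-dashboard metric."""
    counts = Counter(r["disposition"] for r in records)
    upheld = counts["agree"] + counts["agree_with_changes"]
    return upheld / len(records) if records else 0.0

print(f"precision: {precision(dispositions):.0%}")   # 67% in this toy sample
# The 'reason' field is the retraining signal: retire features behind rejects,
# elevate features behind agreements.
```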
For each finding, auto-assemble a packet: claim lines involved; rules and references; medical necessity logic where applicable; timeline of updates (policy version, NCCI/MUE table version, contract effective dates). Plans that communicate clearly see better cooperation and fewer escalations. (Multiple industry sources highlight education and transparency as levers to reduce abrasion.)
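A sketch of that packet as a data structure, with keys mirroring the elements named above (the structure and sample values are assumptions, not a prescribed format):

```python
def assemble_packet(finding: dict) -> dict:
    """Assemble a provider-ready explanation packet for one finding (illustrative structure)."""
    return {
        "claim_lines": finding["claim_lines"],
        "rules_and_references": finding["rules"],           # policy, contract, code-edit citations
        "medical_necessity_logic": finding.get("mn_logic"), # only where applicable
        "version_timeline": {
            "policy_version": finding["policy_version"],
            "ncci_mue_table_version": finding["ncci_version"],
            "contract_effective_date": finding["contract_effective"],
        },
    }

packet = assemble_packet({
    "claim_lines": [3, 4],
    "rules": ["MP-123 v4", "NCCI PTP 99213/97140"],
    "policy_version": "v4 (2024-01-01)",
    "ncci_version": "2024Q2",
    "contract_effective": "2023-07-01",
})
```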
Map NIST AI RMF functions (Govern, Map, Measure, Manage) to your PI workflow: risk registers for models, bias testing, role-based overrides, and incident response when performance drifts. Align with HHS/ONC transparency requirements and FDA transparency principles to keep cross-functional teams (security, compliance, medical policy) on the same page.
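A minimal sketch of that mapping as configuration. The four function names come from the NIST AI RMF; the controls listed under each are examples of PI-program controls, not a complete checklist.

```python
# Illustrative mapping of NIST AI RMF functions to PI-program controls.
AI_RMF_TO_PI_CONTROLS = {
    "Govern":  ["model risk register", "role-based override policy", "change-approval board"],
    "Map":     ["intended-use statement per model", "claim-type and population scoping"],
    "Measure": ["precision/recall by finding type", "bias testing across provider segments"],
    "Manage":  ["drift alerts and incident response", "retraining and retirement criteria"],
}

for function, controls in AI_RMF_TO_PI_CONTROLS.items():
    print(f"{function}: {', '.join(controls)}")
```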
If your post-pay program isn’t systematically promoting validated issues to pre-pay controls—backed by provenance—you’re paying for the same error twice. A disciplined AI + HI loop should:
Tie incentives to reducing repeat variance, not just recovering dollars. RAC programs and appeals history show that raw detection without context doesn’t travel well, and that defensibility matters.
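A minimal sketch of the post-pay to pre-pay promotion decision described above. The criteria and thresholds are illustrative assumptions; the point is that promotion should be earned by expert agreement, recurrence, complete provenance, and appeal survival.

```python
def should_promote_to_prepay(issue: dict) -> bool:
    """Decide whether a validated post-pay issue becomes a pre-pay control (criteria illustrative)."""
    return (
        issue["reviewer_upheld_rate"] >= 0.90       # experts consistently agree with the finding
        and issue["occurrences"] >= 25              # recurring, not a one-off
        and issue["provenance_complete"]            # citations and versions are attached
        and issue["appeal_overturn_rate"] <= 0.10   # the finding survives provider appeals
    )

candidate = {
    "reviewer_upheld_rate": 0.94,
    "occurrences": 60,
    "provenance_complete": True,
    "appeal_overturn_rate": 0.05,
}
print(should_promote_to_prepay(candidate))  # True
```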
Beyond standard pipeline and recovery numbers, create the right metrics to measure:
On utilization management adjacencies, monitor appeal success closely. When >80% of appealed MA denials are overturned, the signal is unmistakable: explanations—and upstream rules—must improve.
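As one illustration of such a metric, a simple overturn-rate computation with an alert threshold (the counts and threshold here are illustrative):

```python
def overturn_rate(appealed: int, overturned: int) -> float:
    """Share of appealed denials that are overturned: a core trust metric."""
    return overturned / appealed if appealed else 0.0

rate = overturn_rate(appealed=1_000, overturned=820)  # illustrative counts
if rate > 0.80:
    print(f"Overturn rate {rate:.0%}: first decisions need better explanations and upstream rules.")
```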
Regulators aren’t prescribing model architecture; they’re asking for control: clarity on intended use, transparency on how outputs are produced, and guardrails for monitoring and change.
The NIST AI RMF provides a widely adopted approach to operationalize this (governance, mapping risk, measuring performance, managing change). ONC’s HTI-1 adds algorithm transparency expectations to certified health IT, and the FDA’s transparency principles emphasize explainability and performance monitoring for safety. Build these into your PI platform from day one; don’t bolt them on.
There’s a deeper reason to invest here: clinicians trust tools that augment judgment and make reasoning visible.
AI that complements clinical cognition rather than trying to replace it is precisely the mindset PI needs.
Nedl Labs delivers AI-native payment integrity: provenance on every recommendation, expert checkpoints, provider-ready packets, and a drift ledger that promotes post-pay learnings to pre-pay controls. Our workflows integrate with your systems and can be intercepted for human review wherever judgment matters.
Founder Nedl Labs | Building Intelligent Healthcare for Affordability & Trust | X-Microsoft, Product & Engineering Leadership | Generative & Responsible AI | Startup Founder Advisor | Published Author





