Here’s a failure mode that shows up in any guidance-giving system: the person arrives with a question, and you answer it. The answer is accurate. The question was wrong.
Not wrong in the sense of poorly formed. Wrong in the sense that it assumed a frame — a set of circumstances, a phase of the problem, a starting point — that doesn’t match the actual situation. The answer is good inside the frame. The frame is the problem.
General AI has no reliable mechanism to test frames. It answers inside the one you provided.
Consider what this looks like in a high-stakes domain. A family is navigating a parent’s cognitive decline. They ask a general AI model what Medicare covers for memory care. The model answers accurately — it knows the coverage categories, the eligibility thresholds, the common gaps.
But if no one in the family has legal authority to act on the parent’s behalf — if power of attorney was never established, if the parent is now past the point of executing documents — the coverage question isn’t the first problem. The legal authority question is. The Medicare answer is accurate. It is also premature. Answering it moves the person deeper into a frame that may need to be rebuilt entirely.
This isn’t a retrieval failure. The model retrieved correctly. It’s a frame failure: the model answered the question as asked rather than testing whether the question reflected the real situation.
The same failure appears in different domains without changing its shape. A family navigating a disability transition asks what residential programs are available for their child aging out of school services at twenty-one. The model answers. But if the eligibility application window for the relevant state waiver closed four months ago, the residential question isn’t the first problem. The waitlist and bridge-planning question is. The residential answer is accurate. It is also late.
A family navigating a cancer diagnosis asks what clinical trials are available. The model answers. But if the patient’s performance status has declined past the enrollment threshold for most trials, the clinical trial question isn’t the first problem. The goals-of-care conversation is.
The frame shifts by domain. The failure doesn’t.
Phase Blindness
The instinct, when a guidance system gives incomplete answers, is to make it more comprehensive. Cover more ground. Surface more options. Acknowledge more edge cases.
This is the wrong fix for the frame problem. More coverage inside the wrong frame adds weight to the wrong starting point.
General AI often defaults toward comprehensive, balanced answers unless the system is designed to prioritize. In high-distress situations — a diagnosis, a crisis, a decision made under time pressure — that default produces exactly the wrong output. Everything might be relevant. Nothing is prioritized. The guidance is accurate and paralyzing.
The more specific failure is phase blindness.
A person in the early warning stage of a complex situation — a parent showing cognitive decline, living independently, no crisis yet — needs fundamentally different guidance than the same person three years later, managing active care while coordinating with multiple physicians, a benefits specialist, and an estate attorney. The urgency changes. The professionals who matter change. The decisions that can wait and the decisions that cannot change completely.
General AI has no phase detection. It treats every user as if they’re at the same point in the same situation. Every response is calibrated to the question asked, not to where the person actually is. Which means it consistently answers questions that are not the most urgent question, while appearing to be thorough.
You can’t fix this with a better prompt. The frame problem persists because the model doesn’t have domain-specific knowledge of what makes a situation what it is. It doesn’t know which signals are load-bearing. It doesn’t know that “she’s managing fine” often means something different from what the speaker thinks it means. It doesn’t have the pattern recognition that comes from seeing the same situation in many iterations — and knowing where people consistently mis-assess their own phase.
What Phase Detection Requires
Solving the frame problem requires something before the guidance starts: a structured assessment of where the person actually is.
Not a questionnaire. Not a checklist that validates whatever the person already believed. An assessment process that surfaces what the person knows and doesn’t know, identifies what the situation actually requires based on the signals they’re giving, and corrects the frame before the guidance begins.
This is what domain experts do in intake conversations. An elder law attorney doesn’t start answering legal questions. They start by understanding the situation: what’s in place, what’s missing, where the pressure is, what the family doesn’t yet know to ask. That orientation determines which questions are the right questions.
Building this into a system means encoding enough domain judgment that the system can run the assessment before the guidance. Here is what that looks like in practice.
The intake layer collects a small set of signals — not a hundred questions, but the ones that experienced practitioners identify as load-bearing. In an eldercare navigation system, these include: whether legal authority documents are in place, whether the person has received any formal diagnosis, whether there is an active care setting transition underway, and whether the primary caregiver is managing alone or with coordination support. Each signal is simple. The combination determines phase.
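A minimal sketch of that intake-to-phase step, in Python. The field names, the phase labels, and especially the rule ordering in `determine_phase` are illustrative assumptions, not rules taken from practitioners. `DIAGNOSED_STABLE` is a placeholder for whatever sits between the phases named in this piece.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    EARLY_WARNING = "early warning"
    DIAGNOSED_STABLE = "diagnosed, stable"  # placeholder phase (assumption)
    ACTIVE_TRANSITION = "active transition"


@dataclass(frozen=True)
class IntakeSignals:
    """The four signals named above; each one is a simple yes/no."""
    legal_authority_in_place: bool
    formal_diagnosis: bool
    transition_underway: bool
    caregiver_has_support: bool


def determine_phase(signals: IntakeSignals) -> Phase:
    """Map the combination of signals to a phase.

    Illustrative rules only: a real system would encode rules supplied
    by practitioners in the domain.
    """
    if signals.transition_underway:
        return Phase.ACTIVE_TRANSITION
    if not signals.formal_diagnosis:
        return Phase.EARLY_WARNING
    return Phase.DIAGNOSED_STABLE


def urgent_flags(signals: IntakeSignals) -> list:
    """Gaps worth flagging in any phase (hypothetical wording)."""
    flags = []
    if not signals.legal_authority_in_place:
        flags.append("legal authority not established")
    if not signals.caregiver_has_support:
        flags.append("caregiver managing alone")
    return flags


signals = IntakeSignals(
    legal_authority_in_place=False,
    formal_diagnosis=False,
    transition_underway=False,
    caregiver_has_support=False,
)
print(determine_phase(signals).value)  # early warning
print(urgent_flags(signals))
```

The point of the sketch is the shape: a handful of booleans in, one phase out, with the rules kept in one place where a practitioner can read and dispute them.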
The phase determination changes what the system surfaces and what it suppresses. A person in the early warning phase — no diagnosis, no crisis, no transition in motion — receives guidance that prioritizes document preparation, preventive assessments, and family coordination. The system does not surface crisis resources, discharge planning protocols, or Medicaid spend-down calculations. Those answers exist. They are not relevant yet. Surfacing them would be accurate and disorienting.
A person in the active transition phase receives a different set of first priorities. The legal question may already be resolved. The system knows this because the intake said so, and doesn’t re-surface it. What moves up: the immediate care setting decision, the benefit eligibility timeline, the professionals who need to be in the loop within days rather than weeks.
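One way to picture the surface-and-suppress behavior is a per-phase priority list, where anything not on the list for the current phase stays hidden. This is a sketch under assumptions: the topic strings, the `FIRST_PRIORITIES` table, and the choice to list legal authority first in the transition phase are all hypothetical.

```python
from enum import Enum


class Phase(Enum):
    EARLY_WARNING = "early warning"
    ACTIVE_TRANSITION = "active transition"


# Ordered first priorities per phase, following the two phases described
# above. Anything not listed for a phase is suppressed in that phase.
FIRST_PRIORITIES = {
    Phase.EARLY_WARNING: [
        "document preparation",
        "preventive assessments",
        "family coordination",
    ],
    Phase.ACTIVE_TRANSITION: [
        "legal authority",  # dropped when intake says it is resolved
        "immediate care setting decision",
        "benefit eligibility timeline",
        "professionals to involve within days",
    ],
}


def surfaced_topics(phase, resolved=frozenset()):
    """Return the phase's priorities, minus items intake marked resolved."""
    return [t for t in FIRST_PRIORITIES[phase] if t not in resolved]


def is_suppressed(topic, phase, resolved=frozenset()):
    """A topic is suppressed when it is not surfaced for this phase."""
    return topic not in surfaced_topics(phase, resolved)


print(is_suppressed("Medicaid spend-down calculations", Phase.EARLY_WARNING))
print(surfaced_topics(Phase.ACTIVE_TRANSITION, resolved={"legal authority"}))
```

Suppression here is the default rather than a special case: a topic has to earn its place in a phase, which matches the claim that the answers exist but are not relevant yet.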
The output is not a conversation summary. It is a structured document: phase labeled, first priorities labeled, decisions with time pressure flagged, open legal and financial questions listed by what they block. That document is built to be handed to the next professional in the sequence — structured in the way an elder law attorney or care manager actually reads incoming client information, not in the way a chatbot naturally summarizes.
The frame correction happened before the guidance started. The document is what makes the correction portable.
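As a rough illustration of what "structured document" could mean, here is a sketch of the artifact as a small data type with a plain-text renderer. The class names, section labels, and sample entries are assumptions made for the example; a real layout would follow how the receiving professionals actually read intake material.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OpenQuestion:
    question: str
    blocks: str  # the decision this question is holding up


@dataclass
class HandoffDocument:
    phase: str
    first_priorities: list[str]
    time_pressured_decisions: list[str] = field(default_factory=list)
    open_questions: list[OpenQuestion] = field(default_factory=list)

    def render(self) -> str:
        """Render labeled sections in a fixed order."""
        lines = [f"PHASE: {self.phase}", "", "FIRST PRIORITIES"]
        lines += [f"  {n}. {p}" for n, p in enumerate(self.first_priorities, 1)]
        lines += ["", "DECISIONS WITH TIME PRESSURE"]
        lines += [f"  - {d}" for d in self.time_pressured_decisions]
        lines += ["", "OPEN QUESTIONS, BY WHAT THEY BLOCK"]
        # Group questions under the decision they block, keeping input order.
        grouped: dict[str, list[str]] = {}
        for q in self.open_questions:
            grouped.setdefault(q.blocks, []).append(q.question)
        for blocked, questions in grouped.items():
            lines.append(f"  Blocks: {blocked}")
            lines += [f"    - {question}" for question in questions]
        return "\n".join(lines)


# Sample entries are invented for illustration.
doc = HandoffDocument(
    phase="early warning",
    first_priorities=["power of attorney", "formal cognitive assessment"],
    time_pressured_decisions=["execute documents while the parent still can"],
    open_questions=[
        OpenQuestion("Who will hold power of attorney?", "facility placement"),
        OpenQuestion("Is there a formal diagnosis?", "benefit eligibility"),
    ],
)
print(doc.render())
```

Grouping open questions by what they block, rather than by topic, keeps the reader's attention on which decision is stalled and why.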
What Frame Testing Looks Like
To validate this pattern, you give the system questions that are accurate but premature, then check whether it suppresses the answer, assigns the correct phase, and produces the right blocker list.
A test case: a user asks what memory care facilities in their area accept Medicaid. Intake returns: no legal authority documents in place, no formal diagnosis on record, caregiver managing alone, no active transition underway. Phase assigned: early warning, legal and diagnostic readiness. The system does not answer the facility question. Instead it surfaces: no one has authority to make placement decisions, and no diagnosis exists to support them. Facility selection is two phases away. First priority: power of attorney while the parent can still execute documents. Second priority: formal cognitive assessment to establish baseline and open the benefit eligibility pathway.
The question the user asked was real. The answer would have been accurate. The system declined to give it, because giving it would have confirmed a frame that doesn’t fit the situation.
That suppression is the design claim. It either holds under testing or it doesn’t.
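The test case above could be written down as an executable check along these lines. The `triage` function is a hypothetical stand-in for the system under test, with deliberately simplified rules; the names, the result fields, and the blocker wording are assumptions. Only the shape of the assertions is the point.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Intake:
    legal_authority_in_place: bool
    formal_diagnosis: bool
    caregiver_has_support: bool
    transition_underway: bool


@dataclass(frozen=True)
class TriageResult:
    phase: str
    answered: bool
    blockers: tuple[str, ...]
    first_priorities: tuple[str, ...]


def triage(question: str, intake: Intake) -> TriageResult:
    """Hypothetical stand-in for the system under test.

    The question text is ignored by this stub; only intake drives it.
    """
    blockers, priorities = [], []
    if not intake.legal_authority_in_place:
        blockers.append("no authority to make placement decisions")
        priorities.append("power of attorney")
    if not intake.formal_diagnosis:
        blockers.append("no diagnosis to support placement decisions")
        priorities.append("formal cognitive assessment")
    early = not intake.formal_diagnosis and not intake.transition_underway
    return TriageResult(
        phase="early warning" if early else "later phase",
        answered=not blockers,  # suppress the answer while a blocker is open
        blockers=tuple(blockers),
        first_priorities=tuple(priorities),
    )


def test_premature_facility_question_is_suppressed():
    intake = Intake(
        legal_authority_in_place=False,
        formal_diagnosis=False,
        caregiver_has_support=False,
        transition_underway=False,
    )
    result = triage("Which memory care facilities near us accept Medicaid?", intake)
    assert result.phase == "early warning"
    assert not result.answered  # accurate but premature: no answer given
    assert result.blockers == (
        "no authority to make placement decisions",
        "no diagnosis to support placement decisions",
    )
    assert result.first_priorities == (
        "power of attorney",
        "formal cognitive assessment",
    )


test_premature_facility_question_is_suppressed()
```

Each assertion maps to one part of the design claim: the phase, the suppression, the blocker list, and the order of first priorities.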
The Portable Artifact Problem
There’s a second failure mode that compounds the first.
When a general AI conversation ends, nothing portable exists. The person may have left with a clearer picture. But nothing was created that the next professional in the sequence can use. No structured summary. No labeled starting point. Nothing that lets an attorney, a care manager, or a specialist begin from an informed basis rather than reconstructing the picture from scratch.
This matters because professional expertise is expensive and episodic. A family has forty-five minutes with an elder law attorney. If the first twenty minutes are spent reconstructing the client’s situation (what is in place legally, what the care situation looks like, what the family is most worried about), that’s roughly forty-four percent of the meeting spent on work the client could have arrived with.
The professional’s value is judgment, strategy, and decision-making. Too much of the first meeting is often reconstruction. The client didn’t arrive with a picture. There was nothing to hand over.
A conversation is not a deliverable. A structured document — labeled, prioritized, organized around what the professional actually needs to know before the conversation starts — is a different thing. The difference between arriving with it and arriving without it determines whether the professional meeting produces decisions or produces orientation.
The guidance system that produces nothing portable doesn’t just underserve the user. It underserves every professional downstream. The handoff fails because there is nothing to hand off.
The Honest Part
Building a system that addresses the frame problem is not a technology challenge. It’s a knowledge engineering challenge.
The phase detection works only as well as the domain judgment encoded in the assessment. That judgment comes from practitioners who have seen enough cases to know which signals are load-bearing and which are noise. The system holds what they know. The model applies it. The distinction matters.
This has a specific implication for the ceiling: the frame correction catches only the errors the system was designed to look for. That is the defining constraint of the architecture, not a caveat to it. The system will not catch a frame error the design didn’t anticipate, such as a legal situation that doesn’t pattern-match to the encoded categories or a care setting transition that falls between the phase definitions. It will answer inside the wrong frame, just like the general model would.
The same applies to the portable artifact. It is structured in the way the professionals who informed the design think about the domain. If the receiving professional uses a different mental model, the artifact’s structure may not match how they read incoming information. The handoff improves. It does not become seamless by default.
The floor the system provides is real: reliable frame-checking for the errors it was built to find, structured outputs calibrated to the phase, artifacts built for the downstream professional. But the ceiling is set by the design, not by the model. The system does not learn from cases. It does not update from outcomes. It applies consistently what was encoded at build time.
This is a defensible architecture for a guidance system in a high-stakes domain — more defensible than unconstrained model guidance, because what the system does and doesn’t catch is explicit. You don’t want the system learning from cases without oversight. But “more defensible than the alternative” is not the same as correct. Any honest accounting of the approach has to say so plainly.
The Implication
The frame problem isn’t unique to any single domain. It appears anywhere a general AI system provides domain-specific guidance without a phase detection layer.
The system answers the question asked. It doesn’t catch that the question assumed the wrong starting conditions. In high-stakes domains — legal, medical, financial — this produces guidance that is accurate inside the wrong frame. In lower-stakes domains, it produces outputs that are correct and not quite useful.
The fix is architectural, not a prompting improvement.
Before the guidance: an assessment. Before the answer: a corrected frame. Before the handoff: a portable artifact structured for the professional receiving it.
None of this happens by default. The model answers. The system has to be built to do the rest — which means encoding enough domain judgment that the assessment is meaningful, not just a form that confirms what the user already believed.
The pattern applies wherever the first user question is likely to be downstream of a blocker they haven’t identified yet: benefits planning, legal triage, clinical pathway navigation, care coordination, grant readiness. The domain changes. The architecture doesn’t.
Encoding that domain judgment is the work. The model is the last step.

