My AI Kept Suggesting Features I’d Already Built.
The model wasn't wrong. It just didn't know what the product was.
I was building Thruline — a tool for making AI conversations compound over time rather than reset — and I wanted to see what the product was missing. I gave the model a product description and asked what features it lacked.
The suggestions were reasonable. They sounded like features a product like Thruline should have. A quick-capture inbox. A lightweight check-in mechanism. A way to organize projects by type.
The problem: the quick-capture inbox was already built. It was called Thoughts. The check-in mechanism was already built. It was called a Work Session close. The project organization feature violated the product’s core design principle — Thruline is deliberately content-first, which means no templates, no imposed structure. The model didn’t know any of this. It was reasoning about what products generally have, not what this product specifically was.
The Friction
I did not design this as a clean experiment. I added context after each failure made its absence visible.
Without schema context, the model reinvented the Thoughts feature twice. First as “Quick Capture Inbox.” Then, when I probed further, as “Pulse.” Two different names. Same mechanism. Already in production.
It re-proposed three features already on the roadmap: Search, Weekly Digests, Contextual Recall. Not because these were wrong — they were right, which is the point — but because they were already decided. The model had no way to know that. From its position, they looked like gaps. From mine, they were already on the list.
And it suggested Project Templates, which directly contradicts the constraint that Thruline never imposes structure on the user’s thinking. The model knew what project management tools typically have. It didn’t know what this one had ruled out.
None of that is harmless. Each plausible suggestion creates review work. I had to stop ideating and become the product’s memory: check the schema, compare against the roadmap, translate renamed concepts back into existing mechanisms, and decide whether the model had found a real gap or merely given an old feature a new label.
The model was generating. I was auditing. That inversion is the cost.
The model wasn’t malfunctioning. It was doing exactly what it could do with the information available: pattern-matching against products it had seen in training. Generic inputs produced generic outputs. The suggestions were plausible for something like Thruline. They were wrong for Thruline specifically.
This is a different failure mode than hallucination. The model was competently wrong — producing reasonable suggestions that happened to be incorrect for this product. That’s harder to catch. You have to already know what you built to recognize when an AI is reinventing it.
The Build
Each bad answer exposed a missing layer of product memory, so I added the layers one at a time.
Schema reference table first, because the first failure was reinvention. The model could see the capture mechanism in the schema and stopped proposing it under new names. The Thoughts reinvention disappeared.
Constraints document next, because the next failure was violation. The product’s design principles were now in scope, which meant the model could reason about what the product was *against*, not just what it was for. Project Templates gone.
Roadmap last, because the remaining failure was duplication. Search, Weekly Digests, Contextual Recall were on the list — the model could see them and stopped surfacing them as gaps.
With all three layers in place, the model produced four suggestions: Trace, Anchor, and Branch, none of which had appeared in any previous round, and Pulse, a returning name now pitched on different grounds rather than as a straight Thoughts clone.
Trace was approved: a graph visualization of thinking lineage, built on database infrastructure that already existed. No new tables. No new LLM calls.
Anchor was approved: external reference pinning, with provenance tracking for ideas sourced from outside the system.
Branch was killed: redundant with the brainstorm session, which already serves the same function.
Pulse was killed, correctly this time: it duplicated the Thoughts capture mechanism and the Work Session close in ways the model could now articulate.
Two approved. Two killed with specific reasons. Zero reinventions. Zero constraint violations.
The policy since then: before any feature ideation session, the model gets the full schema reference table, the constraints document, and the existing roadmap. All three. Not optional.
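That policy is small enough to script. A minimal sketch, assuming the three layers live as markdown files next to the project; the file names, the `build_ideation_prompt` helper, and the chat-message shape are all illustrative, not Thruline's actual implementation:

```python
from pathlib import Path

# Hypothetical file names for the three product-memory layers.
MEMORY_LAYERS = ["schema.md", "constraints.md", "roadmap.md"]

def load_product_memory(root: str = ".") -> str:
    """Concatenate all three layers, failing loudly if any is missing."""
    sections = []
    for name in MEMORY_LAYERS:
        path = Path(root) / name
        if not path.exists():
            # A missing layer reintroduces a known failure mode:
            # reinvention, constraint violation, or roadmap duplication.
            raise FileNotFoundError(f"missing product memory layer: {name}")
        sections.append(f"## {name}\n\n{path.read_text()}")
    return "\n\n".join(sections)

def build_ideation_prompt(request: str) -> list[dict]:
    """Prepend product memory to a feature-ideation request."""
    system = (
        "Reason within the product memory below. Do not propose anything "
        "it already contains, already rejects, or already schedules.\n\n"
        + load_product_memory()
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": request},
    ]
```

The design choice worth copying is the loader's failure mode: if a layer is absent, the session does not start.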
The Insight
AI-assisted product development fails when the model is asked to reason about a product whose memory it cannot see.
This is the same ceiling the Instruction Layer essay describes, but the failure mode is different. At the workspace layer, the problem is continuity — the model loses the thread between sessions. At the product layer, the model can remain internally coherent and still be useless, because it’s reasoning from the wrong product. It will rediscover existing mechanisms, re-open closed decisions, and violate constraints that were never placed in scope. Three distinct failure modes: reinvention, roadmap duplication, constraint violation. Each requires different context to prevent.
The workspace version is an Amnesia Tax — the cost of starting from zero because the model has no access to what’s already been concluded. The product version is different: the model never had the memory to lose. It was asked to reason about a specific system without access to that system’s institutional knowledge.
Without product memory, the model is guessing what the product might need. With product memory, it is reasoning within what the product already is. Those are not the same task.
The Honest Part
This was not an independent evaluation. I built the product, knew the constraints, chose the context layers, and judged which suggestions counted as viable. That makes the result useful but not clean. The test shows that missing product memory produces predictable failure modes — it does not prove that schema + constraints + roadmap is the universal minimum context set, or that another operator would approve the same features. Different products may require different memory layers: user research, analytics, technical debt, pricing constraints, regulatory scope.

The method is not the specific documents. It is making visible what already exists, what has been rejected, and what has been decided. Once those layers were visible, the failure pattern changed. Reinventions disappeared. Roadmap duplicates disappeared. Constraint violations disappeared. Whether the same result holds across different products, different models, and different operators remains open.
The Implication
AI Workspaces apply the same structure at the session layer.
`claude.md` is the constraints document. `status.md` is the current state. `log.md` is the roadmap of decisions already made. Together, they give the model access to a workspace’s institutional memory before it’s asked to reason about what to do next. The mechanism is identical to what the context-feeding experiment produced — it just operates on sessions rather than features.
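The session-layer loader is nearly the same shape, which is the point. A minimal sketch, with the role annotations as my own gloss on the mapping above rather than the essay's wording:

```python
from pathlib import Path

# The three workspace files named above, mapped to the role each plays.
WORKSPACE_LAYERS = {
    "claude.md": "constraints: what this workspace never does",
    "status.md": "current state: where things stand right now",
    "log.md": "decision log: what has already been decided",
}

def load_workspace_memory(root: str = ".") -> str:
    """Assemble a workspace's institutional memory before a session starts."""
    sections = []
    for name, role in WORKSPACE_LAYERS.items():
        path = Path(root) / name
        if not path.exists():
            raise FileNotFoundError(f"missing workspace memory layer: {name}")
        sections.append(f"## {name} ({role})\n\n{path.read_text()}")
    return "\n\n".join(sections)
```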
Most AI-assisted product development doesn’t include this context. The model gets a description of the product and a request. It produces suggestions. The suggestions are evaluated against knowledge the operator holds but didn’t provide. The gap between what the model was given and what the operator knows is where the reinventions and the constraint violations come from.
The fix isn’t a smarter model. It’s a model with access to the product’s memory of itself.
The next problem is keeping that memory honest. Stale product memory is worse than no product memory: it gives the model confidence in decisions the product may have already outgrown. Product memory only compounds if it’s treated as build infrastructure, not documentation.
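One cheap way to keep it honest, sketched here as an assumption rather than a method the essay prescribes: treat file age as a signal and refuse to feed stale memory silently. The fourteen-day threshold is arbitrary.

```python
import time
from pathlib import Path

MAX_AGE_DAYS = 14  # Arbitrary; tune to how fast the product actually moves.

def warn_if_stale(name: str, root: str = ".") -> None:
    """Flag a memory layer that has not been touched recently enough to trust."""
    path = Path(root) / name
    age_days = (time.time() - path.stat().st_mtime) / 86_400
    if age_days > MAX_AGE_DAYS:
        print(
            f"WARNING: {name} is {age_days:.0f} days old. "
            "Verify it still reflects the product before feeding it to the model."
        )
```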
Case Study Insight: Schema, constraints, and roadmap are not context-feeding overhead. They are product memory — the structure that lets the model reason within the product instead of pattern-matching against products in general.
Robert Ford builds products, writes stories and essays, and publishes The Intelligence Engine — a Substack about building AI practices that compound. His other writing lives at Brittle Views.

