Mary Fung

A system only improves when corrections become memory, skill changes, eval cases, or routing rules.

"The system learns from feedback" is one of those phrases that sounds better than it is.

Feedback does not help unless the system has somewhere to put it.

If a human edits an answer and the edit disappears into the final document, the system did not learn. A person cleaned up after it. That may still be useful, but it is not a feedback loop.

Job

The feedback layer converts reviewed outcomes into system changes.

To do that, it first has to decide what kind of change each correction represents.

Inputs

The input is not just "thumbs up" or "thumbs down." It should include:

original request
retrieved context
skill or route used
draft output
human edits
reviewer notes
final accepted output
whether the failure repeated a known pattern

Without that trace, feedback becomes sentiment.
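As a rough illustration, the trace above could be carried as a single record. This is a minimal sketch, not a fixed schema: the class name `FeedbackTrace`, the field names, the `is_sentiment_only` helper, and the sample values are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeedbackTrace:
    """One reviewed outcome. Field names are illustrative, not a standard."""

    original_request: str
    retrieved_context: list[str]
    route_used: str  # skill or route that handled the request
    draft_output: str
    final_accepted_output: str
    human_edits: list[str] = field(default_factory=list)
    reviewer_notes: str = ""
    repeats_known_pattern: bool = False

    def is_sentiment_only(self) -> bool:
        """True when the trace has no edits or notes to act on."""
        return not self.human_edits and not self.reviewer_notes.strip()


# Hypothetical example values.
trace = FeedbackTrace(
    original_request="Summarize the vendor contract",
    retrieved_context=["contract_v2.pdf"],
    route_used="summarize_document",
    draft_output="The contract renews annually.",
    final_accepted_output="The contract renews every two years.",
    human_edits=["annually -> every two years"],
    reviewer_notes="Retrieved an outdated contract version.",
)

bare_rating = FeedbackTrace("request", [], "some_route", "draft", "draft")
```

Here `trace.is_sentiment_only()` is false because the record carries an edit and a note, while `bare_rating.is_sentiment_only()` is true: it says something was reviewed but gives the system nothing to change.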

Processing

Reviewed corrections should land in one of four places:

Memory update: a fact, decision, preference, or source trail changed.
Skill update: the system repeatedly misunderstood the task or output shape.
Eval case: the failure should become a small test before the workflow expands.
Routing rule: this type of request should go somewhere else next time.

The system should not update all four by default. Most corrections have one primary cause. Treating every edit as a memory problem makes the memory layer noisy. Treating every edit as a prompt problem hides bad retrieval.
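One way to enforce "one primary cause" is a function that can only return a single destination. The sketch below is hypothetical: the `Destination` enum, the three boolean signals, and especially the order of the checks are assumptions for illustration, not rules taken from the text.

```python
from enum import Enum


class Destination(Enum):
    MEMORY = "memory update"
    SKILL = "skill update"
    EVAL_CASE = "eval case"
    ROUTING = "routing rule"


def primary_destination(
    fact_changed: bool,
    wrong_route: bool,
    repeated_misunderstanding: bool,
) -> Destination:
    """Pick exactly one destination for a reviewed correction.

    The check order is an assumption; a real system would tune it.
    """
    if wrong_route:
        return Destination.ROUTING
    if fact_changed:
        return Destination.MEMORY
    if repeated_misunderstanding:
        return Destination.SKILL
    # One-off failures are captured as a small test before anything else changes.
    return Destination.EVAL_CASE


choice = primary_destination(
    fact_changed=True,
    wrong_route=False,
    repeated_misunderstanding=False,
)
```

Because the return type is a single `Destination`, a correction cannot quietly fan out into all four layers. If a reviewer believes a second layer also needs to change, that becomes a separate, explicit proposal.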

Output

The output is a proposed system change.

For personal systems, that might be a memory update waiting for approval. For team systems, it might be a review queue item, a changed skill instruction, or an eval case attached to a workflow.
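The word "proposed" matters here: nothing is applied until someone approves it. A minimal sketch of that idea, assuming a hypothetical in-memory `ReviewQueue` and `ProposedChange` record (neither name comes from the text):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ProposedChange:
    kind: str  # e.g. "memory update", "skill update", "eval case", "routing rule"
    summary: str
    status: str = "pending"  # stays pending until a reviewer approves it


class ReviewQueue:
    """Holds proposed changes; illustrative only, with no persistence."""

    def __init__(self) -> None:
        self._items: list[ProposedChange] = []

    def propose(self, kind: str, summary: str) -> ProposedChange:
        change = ProposedChange(kind=kind, summary=summary)
        self._items.append(change)
        return change

    def approve(self, change: ProposedChange) -> None:
        change.status = "approved"

    def pending(self) -> list[ProposedChange]:
        return [c for c in self._items if c.status == "pending"]


queue = ReviewQueue()
change = queue.propose("memory update", "Contract renews every two years")
pending_before = len(queue.pending())
queue.approve(change)
pending_after = len(queue.pending())
```

For a personal system the "reviewer" is the owner approving a memory update; for a team system the same shape could sit behind a shared review queue.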

Review Question

The review question is: what would need to change so this failure is less likely next time?

That is more useful than asking whether the output was good.

Failure Mode

The failure mode is performative feedback.

The interface collects ratings, but no one can say what changed because of them. The system still fails in the same way next week. Reviewers learn to stop correcting it carefully because their corrections do not compound.

That is when the agent remains a demo instead of becoming infrastructure.
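A simple check for performative feedback is to ask, for each piece of feedback, which recorded change it produced. The sketch below assumes a hypothetical mapping from feedback IDs to the changes they led to; the IDs, the mapping, and the function name are all made up for the example.

```python
def unlinked_feedback(feedback_ids, changes_by_feedback):
    """Return feedback items that produced no recorded system change."""
    return [fid for fid in feedback_ids if not changes_by_feedback.get(fid)]


# Hypothetical data: fb-2 has an empty change list, fb-3 has no entry at all.
feedback_ids = ["fb-1", "fb-2", "fb-3"]
changes_by_feedback = {
    "fb-1": ["eval case: contract renewal term"],
    "fb-2": [],
}

orphans = unlinked_feedback(feedback_ids, changes_by_feedback)
orphan_share = len(orphans) / len(feedback_ids)
```

With this data, `orphans` contains `fb-2` and `fb-3`. A high and steady `orphan_share` suggests that ratings are being collected without anywhere to land.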

Feedback loops need a place to land
