
The Feedback Loop That Compounds

Most AI products collect feedback they never use. Self-models turn every interaction into a compounding asset that makes personalization better over time.

Robert Ta · CEO & Co-Founder · 8 min read

TL;DR

  • Most AI products collect feedback that sits in a database table and never reaches the model serving users
  • Self-models create a closed loop: every interaction updates the user’s belief structure, which immediately changes the next interaction
  • The compounding effect means products with self-models get exponentially better per user over time, not just linearly better

AI feedback loops that compound require closing the loop between user signals and the model serving that same user within the same session. Most AI products collect feedback into analytics tables that never reach the individual user’s experience, creating a graveyard of unused signal. This post covers the anatomy of a compounding feedback loop, how self-models enable structured belief updates from both explicit and implicit signals, and why behavioral tracking decays while belief-level modeling appreciates.


The Feedback Graveyard

Every AI product has a feedback mechanism. ChatGPT has thumbs up and thumbs down. Notion AI has “Was this helpful?” buttons. GitHub Copilot has accept and reject on suggestions. These mechanisms feel productive. They give users a sense of agency. Product teams point to the data in dashboards.

But follow the data. Where does a thumbs-down on a ChatGPT response actually go? Into an RLHF training pipeline that affects the global model months later, averaged across millions of users. Your specific thumbs-down, the one that meant “too verbose for my taste,” gets diluted into a gradient update that slightly adjusts the model’s overall verbosity for everyone.

Your feedback improved the average experience by 0.00001%. It did nothing for your experience.

This is the feedback graveyard: signals collected with good intentions, stored in tables that grow forever, occasionally batch-processed into aggregate insights that optimize the median user experience. The individual user, the person who took the time to click that button, gets nothing back.

| Feedback Pattern | Where Signal Goes | Impact on Individual User | Impact Timeline |
| --- | --- | --- | --- |
| Thumbs up/down | RLHF training batch | None directly | Months (global model update) |
| Star ratings | Aggregate analytics | None directly | Never (used for reporting) |
| Usage analytics | Product dashboards | Indirect (feature prioritization) | Quarters |
| Self-model update | User belief structure | Immediate and compounding | Next interaction |

Anatomy of a Compounding Loop

A feedback loop that compounds has four properties that distinguish it from feedback collection.


Closed: The signal from the user reaches the model that serves that same user. Not a different model. Not an aggregate. The same model, the same user, the same context.

Immediate: The update happens fast enough that the user experiences the difference. If I tell your product I prefer concise answers and my next three responses are still verbose, the loop is not closed. It is delayed past the point of user perception.

Structured: The feedback is interpreted at the belief level, not just the behavioral level. “User rejected this response” is behavioral. “User believes concise explanations are more useful than thorough ones in this domain” is structural. The structural interpretation generalizes. The behavioral one does not.

Composable: Each belief update interacts with existing beliefs to produce emergent understanding. “Prefers concise” plus “works in healthcare compliance” plus “has 10 years of domain expertise” composes into “give regulatory summaries with citations, skip the background explanations.” That composition is not something you could derive from any single feedback signal.
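These four properties can be made concrete with a sketch. The `Belief` interface and `composeDirective` function below are hypothetical shapes, not Clarity’s actual schema; the point is that composition reads across multiple beliefs to produce a directive no single signal implies:

```typescript
// Hypothetical belief shape -- illustrative only, not a real schema.
interface Belief {
  id: string;
  statement: string;  // structural interpretation, not a behavioral log entry
  confidence: number; // 0..1, raised by confirming signals
}

// Composition: beliefs interact; the output is derived from the combination.
function composeDirective(beliefs: Belief[]): string {
  const has = (id: string) =>
    beliefs.some(b => b.id === id && b.confidence > 0.7);
  if (has("prefers_concise") && has("healthcare_compliance") && has("domain_expert")) {
    return "regulatory summaries with citations; skip background explanations";
  }
  if (has("prefers_concise")) return "short answers";
  return "default style";
}

const beliefs: Belief[] = [
  { id: "prefers_concise", statement: "Concise explanations are more useful", confidence: 0.9 },
  { id: "healthcare_compliance", statement: "Works in healthcare compliance", confidence: 0.85 },
  { id: "domain_expert", statement: "Has 10 years of domain expertise", confidence: 0.8 },
];
// => "regulatory summaries with citations; skip background explanations"
console.log(composeDirective(beliefs));
```

Note that dropping any one of the three beliefs changes the directive: the value is in the intersection, which is why a flat preference store cannot reproduce it.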

Feedback Collection (No Compounding)

  • Thumbs up/down stored in an analytics table
  • Batch-processed into the global model monthly
  • Individual user experience unchanged
  • Signal decays: yesterday's feedback is stale

Feedback Loop (Compounding)

  • Every interaction updates user belief structure
  • Next response immediately reflects the update
  • Individual experience improves with each use
  • Signal compounds: each belief adds context to all others

Building the Loop With Self-Models

The self-model is the data structure that makes compounding possible. It is not a feature preference store (those are flat and do not compose). It is not a behavioral log (those are temporal and decay). It is a structured representation of beliefs that interact.

Here is what the loop looks like in practice.

feedback-loop.ts
```ts
// 1. User interacts with personalized output
const response = await generateWithSelfModel(userId, prompt);

// 2. User provides signal (explicit or implicit)
const signal = { action: 'edited_response', diff: edits };

// 3. Signal interpreted at belief level
const beliefUpdate = await clarity.interpretSignal(userId, signal);
// => { belief: 'prefers_active_voice', confidence: 0.82 }

// 4. Self-model updated, composes with existing beliefs
await clarity.updateSelfModel(userId, beliefUpdate);
// Next response will reflect this + all prior beliefs
```

The critical step is step 3: signal interpretation. When a user edits a response to shorten it, the naive interpretation is “user did not like that response.” The structural interpretation is “user believes shorter responses are more appropriate in this context.” The structural interpretation transfers to future contexts. The naive one does not.
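A toy version of that interpretation step can show the shape of the idea. The standalone `interpretSignal` below uses a crude length-ratio heuristic as an assumption for illustration; a production interpreter would use model-based inference. What matters is the output type: a belief, not a rejection.

```typescript
interface EditSignal {
  action: string;
  originalLength: number;
  editedLength: number;
}

interface BeliefUpdate {
  belief: string;
  confidence: number;
}

// Toy heuristic interpreter (illustrative only): a substantial cut is read as
// a structural preference for concision, not as "user disliked this response".
function interpretSignal(signal: EditSignal): BeliefUpdate | null {
  if (signal.action !== "edited_response") return null;
  const ratio = signal.editedLength / signal.originalLength;
  if (ratio < 0.6) {
    return { belief: "prefers_concise_responses", confidence: 0.6 };
  }
  return null; // edit too small to infer anything structural
}
```

Because the output is phrased as a belief with a confidence, it can be merged into the self-model and applied in contexts the original edit never touched.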

Why Behavioral Tracking Decays

Behavioral data has a half-life. What a user clicked yesterday is less predictive than what they clicked today. Session-level patterns rarely persist across weeks. This is because behavior is contextual. What someone does depends on their current task, mood, time pressure, and a dozen other transient factors.

Beliefs are different. “I prefer concise communication” is true today and will likely be true next month. “I believe that data-driven decisions are more reliable than intuition” is a stable preference that applies across hundreds of product interactions. Beliefs are the slow-moving variables that explain the fast-moving behavioral data.

This is why products that track behavior hit a ceiling. They are constantly re-learning the user because they are tracking symptoms rather than causes. Products that model beliefs build an asset that appreciates rather than depreciates.
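The half-life framing can be written down directly. The half-life values here are illustrative assumptions, not measurements; the point is the order-of-magnitude gap between behavioral and belief-level signal:

```typescript
// Exponential decay: remaining weight of a signal observed `ageDays` ago,
// given its half-life in days.
function signalWeight(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// A click from 30 days ago, assuming a ~3-day behavioral half-life:
const clickWeight = signalWeight(30, 3);   // ≈ 0.001 -- effectively gone
// A belief confirmed 30 days ago, assuming a ~90-day half-life:
const beliefWeight = signalWeight(30, 90); // ≈ 0.79 -- most value retained
```

Same elapsed time, nearly a thousandfold difference in remaining signal: that gap is what “depreciates” versus “appreciates” means in practice.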

Behavioral Tracking: Depreciates

Half-life measured in days. Session-level patterns rarely persist across weeks. Contextual signals (task, mood, time pressure) create noise. The model is constantly re-learning.

Belief Modeling: Appreciates

Half-life measured in months. Stable preferences apply across hundreds of interactions. Beliefs are the slow-moving variables that explain fast-moving behavior. The asset grows.

Consider two products serving the same user for six months:

Product A tracks every click, page view, and hover. After six months, it has millions of behavioral data points. But the user’s recent behavior has diverged from their early behavior (they changed roles, learned new skills, shifted priorities). Product A’s model is confused. Half its data says one thing, half says another.

Product B models beliefs. After six months, it has 40-50 stable beliefs with confidence scores. Some beliefs have been updated as the user grew. The model is not confused. It has a coherent understanding that evolved alongside the user.

The Math of Compounding

Linear feedback improves the experience by a fixed amount per interaction. If each feedback signal improves relevance by 0.1%, then after 1,000 interactions you are 100% better. That sounds decent until you realize that most products deliver thousands of interactions per user per month.

Compounding feedback is different. Each belief interacts with existing beliefs, creating emergent understanding. The tenth belief you learn about a user is more valuable than the first because it composes with the previous nine. The hundredth is more valuable still.

Month 1: Incremental

Each belief stands mostly alone. 10 beliefs with low composition value. The experience improves linearly and modestly.

Month 3: Composing

30 beliefs compose into emergent understanding. “Concise” + “healthcare” + “10yr expertise” = regulatory summaries with citations. Each new belief multiplies value.

Month 6+: Insurmountable Gap

50+ stable beliefs with high confidence create an experience that feels indispensable. Users report the product “just gets them.” The gap with linear products is permanent.

This creates an exponential curve in user experience quality. Products with compounding feedback loops become dramatically more valuable per user over time. After three months, the experience gap between a compounding product and a linear one is noticeable. After a year, it is insurmountable.
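The arithmetic is easy to verify with the 0.1%-per-interaction figure from above:

```typescript
// Linear vs compounding: the same 0.1% gain per interaction, applied 1,000 times.
const perInteraction = 0.001;
const interactions = 1000;

// Fixed gain per interaction: additive.
const linear = 1 + perInteraction * interactions;             // 2.0 (100% better)

// Each gain builds on everything learned so far: multiplicative.
const compounded = Math.pow(1 + perInteraction, interactions); // ≈ 2.72
```

Even with an identical per-interaction gain, compounding ends up well ahead, and the gap widens with every additional interaction.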

For product teams, this means the investment in self-models pays off slowly at first and then very quickly. The first month feels incremental. By month six, users are writing in support tickets that your product “just gets them.”

Implicit vs Explicit Signals

The richest feedback is often implicit. When a user edits a generated email to change the tone from casual to formal, that is a stronger signal than any thumbs-up button. When a user consistently skips a feature that you surface prominently, that tells you something about their workflow beliefs.

Self-models should consume both explicit and implicit signals, but weight them differently:


  • Explicit signals (thumbs up/down, ratings, stated preferences): High initial confidence, but users often say one thing and do another. Use these to seed beliefs, then validate with implicit signals.
  • Implicit signals (edits, usage patterns, feature avoidance, time spent): Lower initial confidence individually, but extremely reliable in aggregate. These are the signals that refine belief confidence over time.
  • Behavioral contradictions: When explicit and implicit signals disagree, the implicit signal is almost always more accurate. “I want detailed reports” plus consistently skipping to the summary section means the belief should be “values comprehensiveness in theory but prefers summaries in practice.”
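One way to encode the “implicit wins” rule is as an explicit resolution function. The threshold of three contradicting implicit signals is an assumption chosen for illustration:

```typescript
interface Evidence {
  kind: "explicit" | "implicit";
  supports: boolean; // does this signal support the current belief?
}

// Illustrative resolution rule: explicit statements seed a belief, but
// repeated implicit evidence against it should trigger a revision.
function resolveBelief(evidence: Evidence[]): "hold" | "revise" {
  const explicitFor = evidence.filter(e => e.kind === "explicit" && e.supports).length;
  const implicitAgainst = evidence.filter(e => e.kind === "implicit" && !e.supports).length;
  // "I want detailed reports" + repeatedly skipping to summaries => revise.
  return implicitAgainst >= 3 && implicitAgainst > explicitFor ? "revise" : "hold";
}
```

Requiring repeated contradiction before revising keeps a single anomalous session from overturning a stated preference, while still letting sustained behavior win.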

Trade-offs and Limitations

Over-fitting to early signals. A self-model that updates aggressively on the first few interactions can lock into beliefs that were contextual. The user was in a hurry for their first three sessions and now the model thinks they always want terse responses. Mitigation: low initial confidence scores that require multiple confirming signals before beliefs stabilize.
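A minimal sketch of that mitigation, with an assumed update step and stabilization threshold (both hypothetical values), shows why a few early signals cannot lock in a belief:

```typescript
// Confidence moves a fraction of the remaining distance toward 1 (confirming)
// or toward 0 (disconfirming), so it saturates rather than overshooting.
function updateConfidence(current: number, confirming: boolean): number {
  const step = 0.15; // assumed learning rate
  const next = confirming ? current + step * (1 - current) : current - step * current;
  return Math.min(1, Math.max(0, next));
}

// Three hurried sessions raise confidence, but not past a 0.7 "stable" bar:
let c = 0.2; // low initial confidence for a freshly inferred belief
for (let i = 0; i < 3; i++) c = updateConfidence(c, true);
// c ≈ 0.51 -- still provisional; more confirming signals needed to stabilize
```

With this shape, a contextual streak nudges the belief upward but leaves it revisable, while a genuinely stable preference keeps accumulating confirmation until it crosses the bar.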

Feedback fatigue. If every interaction feels like a training session, users disengage. The loop must be invisible. Users should feel like the product is getting better, not like they are teaching it. The best feedback loops extract signal from natural usage without adding friction.

Transparency tension. Users want to know the product is learning from them, but they do not want to see the machinery. The ideal is a “your preferences” page where users can see and correct their self-model, available but not required.

Composability is hard. Belief composition (combining “prefers concise” with “works in healthcare” to produce “give regulatory summaries”) requires inference, not just lookup. This is where the self-model architecture earns its complexity, and where naive key-value preference stores fail.

What to Do Next

  1. Map your current feedback data flow: Follow a single thumbs-down click from the UI to its final destination. If it never reaches the model serving that specific user, you have a feedback graveyard.
  2. Identify three implicit signals you already have: Look for user edits, feature skips, and time-on-task patterns. These are belief signals hiding in your existing analytics.
  3. Try the compounding loop: Explore our API playground to see how self-model updates work in practice. Build a single closed loop on one feature and measure the difference in user satisfaction over 30 days.

