ESTIMATED vs. TESTED: Why We Built a Confidence System Into Every Number

Rami Omran · 6 min read

Open any analytics dashboard — Google Analytics, Meta Ads Manager, Triple Whale, Northbeam — and you'll see numbers presented with decimal-point precision. $4,287.53 in attributed revenue. 3.47x ROAS. 127 conversions.

The precision implies certainty. These aren't estimates, the design language says. These are facts.

But they're not facts. They're calculations based on attribution models, each with different assumptions, different windows, and different blind spots. The number "$4,287.53" might be accurate to within 10% or it might be off by 50%. The dashboard doesn't tell you. It just shows the number with the quiet confidence of a bank statement.

We think that's a problem. A big one.

The false precision problem

Every ad analytics tool does this. Not because they're trying to deceive you, but because that's how dashboards work. Numbers get calculated, numbers get displayed. The calculation methodology — with all its assumptions and limitations — is buried in documentation that nobody reads.

The result: merchants make budget decisions based on numbers they believe are precise but are actually rough estimates. They scale campaigns because Meta says ROAS is 4.2x. They kill campaigns because Google says CPA is too high. They never ask, "How confident should I actually be in these numbers?"

Our answer: confidence badges

When we built Ripplux, we decided every number would come with a confidence badge. Not as a footnote. Not in the documentation. Right there, next to the number, in a color you can't miss.

This isn't a minor design choice. It's the core philosophy of the product.

ESTIMATED: honest about uncertainty

When Ripplux analyzes your data and finds that an estimated $2,400 per month is being wasted on brand cannibalization, it shows that number with an orange ESTIMATED badge.

That badge means: "Based on the patterns in your data, we believe this is approximately right. But we haven't proven it. The actual number could be higher or lower. Treat this as a hypothesis worth testing, not a conclusion to act on blindly."

The language throughout the product reflects this. ESTIMATED findings use phrases like "signals suggest," "estimated risk range," and "pattern indicates." We never say "you are wasting $2,400" when the honest statement is "our analysis suggests you may be wasting approximately $2,400."

This is deliberate. We'd rather lose the emotional punch of a definitive statement than mislead a merchant into a bad decision.

TESTED: earned certainty

When a merchant taps "Prove It" on an ESTIMATED finding, Ripplux designs and runs a holdout experiment. A percentage of their audience stops seeing ads for a calculated duration. Statistical significance is measured daily.

If the experiment reaches statistical significance (p < 0.05), the badge changes from orange ESTIMATED to green TESTED. The number is updated based on actual experimental results. The language changes too — "validated," "proven," "confirmed."

That green badge means something. It means we didn't just analyze patterns in your data. We ran a controlled experiment and measured the causal impact. The number you're seeing isn't an estimate — it's evidence.
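To make the mechanics concrete, here is a minimal sketch of the kind of test a holdout experiment implies: compare conversion rates between the holdout group (ads paused) and the exposed group with a two-proportion z-test, and flip the badge only when p < 0.05. This is an illustration of the general statistical technique, not Ripplux's actual implementation; the function and badge names are ours.

```python
import math

def two_proportion_pvalue(conv_holdout, n_holdout, conv_exposed, n_exposed):
    """Two-sided p-value for the difference between two conversion rates."""
    p1 = conv_holdout / n_holdout
    p2 = conv_exposed / n_exposed
    # Pooled conversion rate under the null hypothesis (no difference)
    p_pool = (conv_holdout + conv_exposed) / (n_holdout + n_exposed)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_holdout + 1 / n_exposed))
    if se == 0:
        return 1.0
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def badge(pvalue, alpha=0.05):
    """Only a significant result earns the TESTED badge."""
    return "TESTED" if pvalue < alpha else "INCONCLUSIVE"

# 4% conversion in the holdout vs 8% exposed: clearly significant
print(badge(two_proportion_pvalue(40, 1000, 80, 1000)))   # TESTED
# 4.0% vs 4.5%: too close to call at this sample size
print(badge(two_proportion_pvalue(40, 1000, 45, 1000)))   # INCONCLUSIVE
```

Note that the same 0.5-point difference that is inconclusive at 1,000 visitors per group could reach significance with more traffic — which is exactly why the badge reflects the evidence, not the point estimate.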

INCONCLUSIVE: honest about failure

Sometimes experiments don't work. The sample size is too small. The effect is too subtle to detect. External factors — a sale, a holiday, a viral moment — contaminate the results.

When this happens, we don't hide it. We don't pretend the experiment succeeded. We show a gray INCONCLUSIVE badge with a clear explanation: "This experiment ran for 14 days but did not reach statistical significance. This doesn't mean the finding is wrong — it means we can't prove it's right with the data available."

Other tools would either not show this result or spin it into something positive. We think that's disrespectful to the merchant's intelligence. If we can't prove it, we say so.

Why this matters for your business

Confidence badges change how you allocate budget.

Without confidence scoring, every number looks equally reliable. A $500/month creative fatigue finding looks just as certain as a $3,000/month cannibalization finding. You might prioritize the bigger number — but what if the smaller finding has been validated by an experiment and the bigger one is just a pattern-based estimate?

With confidence scoring, you can make better decisions:

ESTIMATED findings are signals for investigation. They tell you where to look, not what to do. If your audit shows $2,400 in estimated cannibalization waste, the correct response isn't to immediately pause brand campaigns. It's to run an experiment and find out if the estimate is right.

TESTED findings are signals for action. Once an experiment validates a finding, you have causal evidence. If a holdout experiment shows that pausing your brand campaign for 10% of your audience had zero impact on conversions, you can confidently reduce brand spend.

INCONCLUSIVE findings are signals for patience. They mean the methodology is working but the data isn't sufficient yet. Maybe you need a longer experiment, a larger holdout group, or more traffic before the effect becomes detectable.
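The "how much more data do I need?" question above has a standard answer: a power calculation. The sketch below uses the textbook approximation for a two-proportion test to estimate how many visitors per group are needed to detect a given drop in conversion rate — again, an illustration of the general method, not Ripplux's internal logic.

```python
import math
from statistics import NormalDist

def required_n_per_group(p_base, p_alt, alpha=0.05, power=0.8):
    """Approximate per-group sample size to detect p_base vs p_alt."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = nd.inv_cdf(power)           # desired statistical power
    var = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return math.ceil(((z_alpha + z_beta) ** 2 * var) / (p_base - p_alt) ** 2)

# Detecting a drop from 4% to 2% takes on the order of ~1,100 visitors per group;
# detecting a subtler drop to 3.5% takes roughly twenty times more traffic.
print(required_n_per_group(0.04, 0.02))
print(required_n_per_group(0.04, 0.035))
```

The takeaway matches the prose: small effects are not unprovable, they just demand a longer experiment or a larger holdout before they become detectable.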

The upgrade moment

The single most important UX moment in Ripplux is when a badge changes from ESTIMATED to TESTED.

It's the moment when a merchant goes from "I suspect I'm wasting money" to "I know exactly how much I'm wasting." That's not a dashboard update. It's a shift in how they think about their ad spend.

We designed this moment carefully. When an experiment completes successfully, the badge doesn't just silently change color. There's a deliberate reveal — the number updates, the badge transitions from orange to green, and the language shifts from hedged to confident.

It's the product philosophy made visible. Estimates become proof. Suspicion becomes certainty. And the merchant can finally make decisions based on something better than platform-reported ROAS.

Radical transparency as competitive advantage

We know this approach costs us something. Showing ESTIMATED next to a number makes it feel less impressive than a competitor showing the same number with no qualifier. Some merchants will prefer the confident-looking dashboard.

But we believe the merchants who matter — the ones spending $5K, $10K, $20K per month on ads and genuinely trying to optimize — would rather know the truth than feel good about a false number. Those merchants become long-term customers because they trust us. And trust, once earned, is the most defensible moat in SaaS.

Every tool in the market shows you numbers. We're the only one that tells you how much to believe them.

See how much of your ad spend is wasted

Connect your Shopify store and ad accounts. Get your audit in 24 hours. No pixel required.

See Your Wasted Spend