You are looking at an Explorations report when a small yellow warning icon appears at the top of the screen. You hover over it. It says something about "data thresholds applied" or "results are based on a sample of your data." You make a mental note and move on.
You should not move on. Those warnings mean GA4 is showing you something other than your actual data — and the two warnings mean fundamentally different things. One is a statistical estimation problem. The other is a privacy mechanism that removes rows entirely. Treating them the same way will lead you to the wrong fix.
This guide explains what each one is, when it triggers, what the yellow icons are actually telling you, and how to get clean numbers when you need them.
Sampling and thresholding are not the same thing
The confusion is understandable because both produce warning icons and both make your data incomplete. But they are caused by different things, they affect your data differently, and they require different responses.
Sampling: GA4 does not process all your events. It takes a statistical subset and extrapolates the result. The numbers you see are estimates. All rows are present, but none of the counts are exact. Sampling appears in Explorations when query complexity or date range pushes the data volume too high.
Thresholding: GA4 removes specific rows from results to prevent identification of individual users. The numbers you see for included rows are accurate, but some rows are missing entirely. It is triggered by low user counts, particularly when Google Signals is enabled. It is a privacy control, not a performance one.
The practical implication: if you are sampled, all your numbers are slightly off but you can see every segment. If you are thresholded, your numbers for visible segments are correct but some segments have been silently dropped — usually the smallest, most specific ones.
Sampling in Explorations: the 10 million event threshold
Standard GA4 reports (Acquisition, Engagement, Monetisation) run against pre-aggregated data and do not sample. Sampling only affects Explorations, the ad-hoc query interface where you build free-form, funnel, path, and cohort analyses.
Google's published quota for unsampled Explorations on standard (free) properties is 10 million events per query. Once your query would touch more than 10 million events within the selected date range, GA4 switches to a sampled subset and extrapolates results. The sampling percentage is shown in the warning: on a very large property with a wide date range, you might see that results are based on as little as 17% of the available data.
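To see why a low sampling percentage matters more for small segments than for large ones, here is a minimal sketch of the extrapolation arithmetic. It assumes every event is included in the sample independently with the same probability, which is a simplification for illustration and not GA4's documented sampling method. The function name `extrapolate` and the counts are hypothetical.

```python
import math


def extrapolate(sampled_count: int, sample_rate: float) -> tuple[float, float]:
    """Scale a count observed in a sample up to a full-data estimate.

    Returns (estimate, relative_standard_error). The error term assumes each
    event was sampled independently with probability sample_rate; this is an
    illustrative assumption, not GA4's published method.
    """
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    if sampled_count <= 0:
        raise ValueError("sampled_count must be positive")
    estimate = sampled_count / sample_rate
    relative_se = math.sqrt((1 - sample_rate) / sampled_count)
    return estimate, relative_se


# A large segment and a small segment, both read from a 17% sample.
large_est, large_err = extrapolate(sampled_count=17_000, sample_rate=0.17)
small_est, small_err = extrapolate(sampled_count=17, sample_rate=0.17)
print(f"large segment: ~{large_est:,.0f} (relative error ~{large_err:.1%})")
print(f"small segment: ~{small_est:,.0f} (relative error ~{small_err:.1%})")
```

Under these assumptions the large segment's estimate carries a relative error under 1%, while the small segment's is above 20%. The same sampling percentage is harmless for headline numbers and unreliable for niche breakdowns.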
Two factors make sampling more likely beyond raw traffic volume. A wide date range naturally encompasses more data. Adding multiple dimensions forces GA4 to evaluate more combinations, increasing query cost. Both push you toward the sampling threshold faster than volume alone.
Thresholding: the privacy layer you cannot turn off
Thresholding is GA4's mechanism for protecting individual user privacy in reports. When the number of users behind a particular row drops below the level at which Google considers individuals potentially identifiable, that row is removed. The visible rows may not add up to the totals you would expect, because the removed rows' data is not redistributed. It simply disappears.
The most commonly documented trigger is fewer than 10 active users in a row. Google does not publish the exact algorithm, and the threshold can vary by report type and property configuration. What is consistent is that enabling Google Signals — the cross-device and demographic data feature — significantly increases the likelihood of thresholding, because Signals requires more rows to be suppressed to prevent re-identification of signed-in Google users.
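The effect on a report can be sketched in a few lines. This is a toy model only: it assumes a fixed cutoff of 10 users per row, whereas Google's actual rule is unpublished and may vary. The constant `MIN_USERS`, the function `apply_threshold`, and the city data are all hypothetical.

```python
MIN_USERS = 10  # assumed cutoff for illustration; the real rule is not published


def apply_threshold(rows: dict[str, int], min_users: int = MIN_USERS) -> dict[str, int]:
    """Return only the rows whose user count meets the assumed minimum."""
    return {name: users for name, users in rows.items() if users >= min_users}


# Hypothetical users per city before any suppression.
true_rows = {"London": 4200, "Leeds": 310, "Bath": 9, "Ely": 4, "York": 12}
visible = apply_threshold(true_rows)

true_total = sum(true_rows.values())
visible_total = sum(visible.values())
print("visible rows:", sorted(visible))
print("users missing from visible rows:", true_total - visible_total)
```

The visible rows keep their exact counts, but the two smallest cities vanish and their users are not folded into any other row, which is why the sum of visible rows falls short of the true total.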
If you have Google Signals enabled and rows are disappearing from your reports or Explorations, thresholding is the likely cause. The workaround is to disable Google Signals in Admin > Data Settings > Data Collection if demographic and cross-device data are not worth the trade-off for your use case.
What the yellow warning icons actually mean
Yellow lightning bolt (Explorations): Sampling is active. The figure shown is an extrapolation. Check the percentage in the tooltip. A sample of 90% or more is usually acceptable; below 50%, the numbers deserve real scepticism, especially for small segments.
Yellow shield / "Data thresholds applied" (Explorations & standard reports): One or more rows have been removed for privacy reasons. The totals at the top of the report may not match the sum of visible rows. This is most common when Google Signals is on or when you are filtering to a narrow segment with low user counts.
Yellow triangle in standard reports: A generic data quality warning. Can indicate delayed data processing, a temporary data collection issue, or an active change to property configuration. Click through to the data quality icon panel for the specific reason.
The important distinction: a lightning bolt means the numbers are approximate but the rows are real. A shield means the rows shown are accurate but some rows are missing. Both require a response, but different ones.
Date range strategies to reduce sampling
If you are being sampled, the quickest fix is to reduce the date range. A query over 90 days on a high-traffic property may sample heavily; the same query run monthly and combined manually often produces unsampled results for each period. This is tedious but effective.
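Splitting the range by hand is error-prone, so a small helper can generate the windows for you. This is a sketch, not a GA4 feature: the function name `split_date_range` and the 30-day default are assumptions chosen for illustration, and the right window size depends on your traffic.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, max_days: int = 30) -> list[tuple[date, date]]:
    """Split the inclusive range [start, end] into consecutive windows of at most max_days."""
    if end < start:
        raise ValueError("end must not be before start")
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    windows = []
    cursor = start
    while cursor <= end:
        # Each window is inclusive, so it spans max_days - 1 days past its start.
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


for window_start, window_end in split_date_range(date(2024, 1, 1), date(2024, 3, 30)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```

When combining the per-window results, additive metrics such as event counts can simply be summed. User counts cannot, because the same user may appear in more than one window.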
The other lever is dimension count. Each additional dimension multiplies the number of possible combinations GA4 has to evaluate. If you are running a six-dimension Exploration and hitting sampling, try removing dimensions that are not essential to the specific question and running separate reports for secondary breakdowns.
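The multiplication is easy to underestimate. The sketch below computes the upper bound on row combinations for a set of dimensions. The dimension names and the distinct-value counts are invented for illustration, and real queries will usually produce fewer rows than this bound because not every combination occurs.

```python
import math

# Hypothetical number of distinct values per dimension on one property.
CARDINALITY = {
    "country": 50,
    "device_category": 3,
    "source_medium": 40,
    "landing_page": 500,
}


def max_combinations(dimensions: list[str]) -> int:
    """Upper bound on distinct rows: the product of each dimension's cardinality."""
    return math.prod(CARDINALITY[name] for name in dimensions)


print("2 dimensions:", max_combinations(["country", "device_category"]))
print("4 dimensions:", max_combinations(list(CARDINALITY)))
```

With these made-up figures, going from two dimensions to four raises the bound from 150 rows to 3,000,000, which is why dropping a single high-cardinality dimension often helps more than shaving days off the date range.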
"You cannot always avoid sampling in the GA4 UI — but you can almost always avoid it in BigQuery. For any analysis where accuracy matters more than convenience, the export is not a workaround, it is the right tool."
BigQuery export: the definitive fix for both problems
The BigQuery export is the only guaranteed path to unsampled, unthresholded data. When you export GA4 to BigQuery, you receive every event row — no estimation, no privacy suppression, no row removal. You are working directly with the raw event stream.
This matters because both sampling and thresholding are artifacts of the GA4 reporting layer, not the underlying data collection. The events were recorded. They exist. The reporting interface is just choosing not to show them all. BigQuery bypasses that layer entirely.
The BigQuery export is available on all GA4 properties at no cost from Google (though BigQuery itself charges for storage and query compute at modest rates for typical analytics workloads, and standard properties are subject to Google's daily export limits). Enabling it requires a Google Cloud project. Once linked, GA4 exports a daily table of events (named events_YYYYMMDD) with full parameter detail. This is the same data that feeds event parameters and scope-based aggregations in the UI, but without the reporting layer's constraints.
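As a starting point, here is a sketch that builds a query for daily event and user counts against the export. It relies on the export's date-suffixed `events_YYYYMMDD` tables and the `event_date`, `event_name`, and `user_pseudo_id` columns. The project ID, the dataset ID, and the helper name `daily_counts_query` are placeholders, and the function only returns SQL text; running it is left to whichever BigQuery client you already use.

```python
import re


def daily_counts_query(project: str, dataset: str, start: str, end: str) -> str:
    """Build SQL counting events and distinct users per day and event name.

    start and end are YYYYMMDD strings matching the export's table suffix.
    project and dataset are inserted as-is, so pass trusted values only.
    """
    for value in (start, end):
        if not re.fullmatch(r"\d{8}", value):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return f"""
SELECT
  event_date,
  event_name,
  COUNT(*) AS events,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `{project}.{dataset}.events_*`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
GROUP BY event_date, event_name
ORDER BY event_date, events DESC
""".strip()


# Placeholder identifiers; substitute your own project and dataset.
sql = daily_counts_query("my-project", "analytics_123456789", "20240101", "20240131")
print(sql)
```

Filtering on `_TABLE_SUFFIX` limits the scan to the tables in the chosen range, which keeps query cost proportional to the dates you actually need.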
If your team does not use SQL, the Looker Studio connector for BigQuery is an alternative that can query the export directly and circumvent sampling in dashboards — though it requires more configuration than the native GA4 Looker Studio connector.
Quick reference: triggers, types, and workarounds
| Trigger condition | Type | Where it appears | Workaround |
|---|---|---|---|
| Explorations over large date ranges (typically >30 days on high-traffic properties) | Sampling | Explorations only — free-form, funnel, path, cohort | Shorten the date range and combine periods manually; or export to BigQuery for unsampled queries. |
| Custom Explorations with many dimensions (4+), especially combined with large date ranges | Sampling | Explorations | Reduce dimension count; split into separate targeted reports; use BigQuery to run the full multi-dimension query without limits. |
| Reports or Explorations with rows containing fewer than ~10 users, especially with Google Signals enabled | Thresholding | Explorations; standard reports when filtered to small segments | Disable Google Signals if demographic/cross-device data is not required. Broaden the segment or date range to raise per-row user counts above the threshold. Use BigQuery export for the full data. |
| Date ranges crossing the 2 million events/day processing threshold (GA4 free tier) | Sampling | Explorations; may affect some standard report breakdowns | GA4 360 removes this limit. Alternatively, export to BigQuery (free tier export is unaffected by the UI processing limit) and query there, or split date ranges to stay within daily limits. |
The issue with "(other)" rows
A related but distinct problem: GA4 standard reports collapse low-traffic rows into an "(other)" bucket at the bottom of tables. This is separate from thresholding. It is a row limit on the report tables, typically reached when a dimension has a very large number of unique values (high cardinality), rather than a privacy mechanism. The "(other)" row can account for a significant share of sessions on large properties with many landing pages, campaigns, or event names.
The fix for "(other)" is different from the fix for thresholding: use Explorations (which supports up to 50,000 rows in a single view) or export to BigQuery. Do not assume "(other)" means privacy suppression — it usually means the UI table ran out of rows, and all the data is still there waiting to be queried properly.
Understanding these data quality signals also helps explain some of the discrepancies you will see between Search Console and GA4 — when GA4 Explorations are sampled, the session counts shown will diverge from Search Console's click data, which is not sampled in the same way.
Summary
Two yellow icons, two different problems. Sampling is a performance trade-off — GA4 estimated instead of calculated because the query was too large. Thresholding is a privacy mechanism — GA4 removed rows to protect users from re-identification. Both make your data incomplete, but they do so differently and require different responses. For anything where the precise number matters — a business decision, a client report, an anomaly investigation — BigQuery export is the one workaround that eliminates both.