Methodology & Data Quality
Data quality checks, response patterns, and statistical assumptions. This page helps you assess how much to trust the numbers on other dashboards.
1. Sample Overview
Total Respondents
Earliest Response
Latest Response
Audience Tier Breakdown (N= )
Wilson 95% confidence intervals on proportions. Wider bars = more uncertainty.
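The Wilson score interval behind these bars can be sketched as follows. This is a minimal, self-contained implementation of the standard formula, not the dashboard's actual code; the function name and signature are illustrative.

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - half, center + half)
```

Unlike the naive normal interval, the Wilson interval stays inside [0, 1] and remains reasonable for small tiers, which matters when a tier has only a handful of respondents.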
2. Completeness Heatmap
How complete is each survey column? Horizontal stacked bars show present (filled) vs. missing (null) per column, sorted by completeness rate.
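The per-column completeness rate is a simple non-null fraction. A sketch with hypothetical column names (the real names come from the survey export):

```python
import pandas as pd

# Hypothetical two-column survey frame for illustration only.
df = pd.DataFrame({
    "role": ["PM", "Eng", None, "Eng"],
    "overwhelm_freq": [4, None, None, 2],
})

# Fraction of non-null responses per column, sorted as in the heatmap.
completeness = df.notna().mean().sort_values(ascending=False)
```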
3. Straightliner Detection
Respondents who gave nearly identical answers across all numeric scales may not be engaging meaningfully. Standard deviation < 0.5 across the 11 numeric items flags potential straightliners.
Suspected Straightliners
Median SD
Valid Profiles
Per-Respondent Standard Deviation (N= )
Red zone (SD < 0.5) marks suspected straightliners. Higher SD = more varied response patterns.
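The flagging rule above can be sketched in a few lines. The item names here are placeholders; the real dashboard uses the 11 numeric survey columns.

```python
import pandas as pd

NUMERIC_ITEMS = [f"item_{i}" for i in range(11)]  # placeholder column names
SD_THRESHOLD = 0.5

def flag_straightliners(df: pd.DataFrame) -> pd.Series:
    """Per-respondent SD across the numeric items; True = suspected straightliner."""
    sd = df[NUMERIC_ITEMS].std(axis=1)
    return sd < SD_THRESHOLD
```

A respondent answering "3" to everything gets SD = 0 and is flagged; genuinely varied answers clear the 0.5 threshold easily.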
Effect of Excluding Straightliners
If we remove
the suspected straightliners, how much do the headline metrics shift? The comparison below shows key metrics computed with and without them.
4. Scale Reliability (Cronbach's Alpha)
Cronbach's alpha measures internal consistency of the burnout composite scale used on the Burnout Risk page. Eight items, normalized to 0–1: overwhelm frequency, priority difficulty, overcommit frequency, external pressure, replan frequency, plan completion (rev.), quality-of-life (rev.), and productive best (rev.). Expand the "Scale Lab" panel below for how this combination was chosen.
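The normalization and reverse-scoring step can be sketched as below; the function name and scale bounds are illustrative, not the dashboard's actual code.

```python
import pandas as pd

def normalize(series: pd.Series, lo: float, hi: float, reverse: bool = False) -> pd.Series:
    """Rescale a Likert item to 0-1; reverse-scored items (rev.) are flipped."""
    x = (series - lo) / (hi - lo)
    return 1 - x if reverse else x
```

For a 1-5 item, `normalize(s, 1, 5)` maps 1 → 0 and 5 → 1; passing `reverse=True` flips it, so "plan completion (rev.)" points the same direction as the other burnout items.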
Burnout Scale Alpha
Interpretation
Alpha-If-Removed
If removing an item increases alpha, that item may not belong in the composite. If removing it decreases alpha, it contributes positively to scale consistency.
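Both statistics follow directly from the classic alpha formula. A self-contained sketch (illustrative, not the dashboard's code):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items). alpha = k/(k-1) * (1 - sum of item variances / variance of row totals)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def alpha_if_removed(items: np.ndarray) -> list[float]:
    """Alpha recomputed with each item dropped in turn; index matches column order."""
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]
```

Comparing each entry of `alpha_if_removed` against the full-scale alpha gives exactly the "item contributes vs. item hurts" reading described above.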
Scale Lab — How we chose these 8 items (click to expand)
Toggle items in/out and enable normalization to find the highest-alpha combination. Core items (★) are the original 5-item formula. Candidates are numeric items tested for inclusion.
Lab Alpha
Baseline
Delta
Configuration
Alpha-If-Removed (Lab Configuration)
Dots left of the dashed line = item contributes. Dots right = removing it improves alpha. Purple ring = candidate item (not in current scale).
5. Sample Size Guidance
What statistical analyses are valid at our current sample size? This table provides conservative minimums from methodological literature.
6. Confidence Intervals on Key Metrics
Bootstrap 95% CIs on the four primary KPIs, plus Wilson CIs on market segment proportions. Forest plot: dot = point estimate, whiskers = 95% confidence interval.
How to read forest plots: Each dot is the point estimate. Horizontal whiskers show the 95% confidence interval — the true value likely falls within this range. Narrower whiskers = more precise estimate. Bootstrap CIs use 2,000 resamples; Wilson CIs use the score-based approximation to the binomial, which stays well-behaved at small sample sizes.
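The percentile bootstrap used for the KPI intervals can be sketched as follows; the function bootstraps the mean, with names and the fixed seed chosen for illustration.

```python
import numpy as np

def bootstrap_ci(values: np.ndarray, n_boot: int = 2000, seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap 95% CI for the mean, using n_boot resamples with replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    means = values[idx].mean(axis=1)
    return (float(np.percentile(means, 2.5)), float(np.percentile(means, 97.5)))
```

The same resampling loop works for medians or other statistics by swapping the `mean` call, which is why bootstrap CIs suit KPIs with no closed-form standard error.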
7. Inter-Item Correlation Matrix
Spearman rank correlations between all numeric survey items. Strong off-diagonal correlations suggest items are measuring related constructs. Weak correlations between items in the same composite scale would flag poor construct validity.
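Computing the matrix is a one-liner in pandas; the column names below are hypothetical stand-ins for the survey's numeric items.

```python
import pandas as pd

# Hypothetical numeric items for illustration only.
df = pd.DataFrame({
    "overwhelm_freq":  [1, 2, 3, 4, 5],
    "replan_freq":     [2, 1, 4, 3, 5],
    "plan_completion": [5, 4, 3, 2, 1],
})

# Spearman rank correlation between every pair of columns.
corr = df.corr(method="spearman")
```

Spearman is rank-based, so it tolerates the ordinal Likert scales and monotone-but-nonlinear relationships better than Pearson would.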
8. Data Quality by Audience Tier
Does data quality vary by tier? If one tier has more straightliners or lower completeness, findings for that tier should be weighted accordingly.
Open-Text Completeness by Tier
Open-text questions have the most variation in completeness. Higher tiers (Primary) are expected to show higher engagement.
9. Response Distribution Shapes
Floor/ceiling effects and bimodality can distort means and correlations. Histograms show the shape of each numeric item's distribution.
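A simple screen for floor/ceiling effects is the share of responses at each end of the scale. A sketch, assuming a commonly cited rule of thumb that roughly 15%+ at either extreme is a warning sign; the function name and threshold framing are illustrative.

```python
import pandas as pd

def floor_ceiling_rates(series: pd.Series, lo: float, hi: float) -> tuple[float, float]:
    """Share of valid responses at the scale floor and ceiling."""
    valid = series.dropna()
    return float((valid == lo).mean()), float((valid == hi).mean())
```

Items with heavy floor or ceiling mass compress variance, which attenuates the correlations and composite scores reported elsewhere on this page.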