Methodology & Data Quality

Data quality checks, response patterns, and statistical assumptions. This page helps you assess how much to trust the numbers on other dashboards.

Total Respondents

Earliest Response

Latest Response

Audience Tier Breakdown (N=)

Wilson 95% confidence intervals on proportions. Wider bars = more uncertainty.
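For readers who want to reproduce the tier intervals by hand, the following is a minimal sketch of the Wilson score interval. The function name and the example counts are illustrative assumptions, not values taken from the dashboard.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z ~ 1.96 gives ~95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the proportion is unconstrained
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)
    )
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))

# Hypothetical counts: 12 of 40 respondents fall in one tier.
low, high = wilson_interval(12, 40)
print(f"30.0% observed, 95% CI {low:.1%} to {high:.1%}")
```

The interval is not symmetric around the observed proportion, which is why the bars can look lopsided for small tiers.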

2. Completeness Heatmap

How complete is each survey column? Horizontal stacked bars show present (filled) vs. missing (null) per column, sorted by completeness rate.
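A per-column completeness rate like the one charted here could be computed roughly as follows. This is a sketch that assumes each response is a dict, that `None` and the empty string both count as missing, and that columns are listed least complete first; the page does not state the sort direction, and the column names are hypothetical.

```python
def completeness_rates(rows: list[dict]) -> list[tuple[str, float]]:
    """Return (column, share of non-missing values), least complete first."""
    if not rows:
        return []
    columns = sorted({key for row in rows for key in row})
    rates = []
    for col in columns:
        # A legitimate answer of 0 counts as present; only None/"" are missing.
        present = sum(1 for row in rows if row.get(col) not in (None, ""))
        rates.append((col, present / len(rows)))
    return sorted(rates, key=lambda item: item[1])

# Hypothetical responses with made-up column names.
responses = [
    {"drive": 4, "open_text": "Too many meetings"},
    {"drive": 0, "open_text": ""},
    {"drive": 3, "open_text": None},
]
for column, rate in completeness_rates(responses):
    print(f"{column}: {rate:.0%} present")
```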

3. Straightliner Detection

Respondents who gave nearly identical answers across all numeric scales may not be engaging meaningfully. A respondent whose answers have a standard deviation below 0.5 across the 11 numeric items is flagged as a potential straightliner.
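The flagging rule can be sketched as below. The 0.5 cutoff comes from this page; the use of the sample standard deviation (n - 1 denominator), the handling of skipped items, and the respondent IDs are assumptions made for illustration.

```python
import statistics

SD_THRESHOLD = 0.5  # cutoff stated on this page

def flag_straightliners(responses: dict) -> dict:
    """Map respondent id -> True when the SD of their numeric answers is below the cutoff."""
    flags = {}
    for respondent_id, answers in responses.items():
        answered = [a for a in answers if a is not None]
        if len(answered) < 2:
            flags[respondent_id] = False  # assumption: too few answers to judge
            continue
        # Assumption: sample SD; the dashboard may use the population SD instead.
        flags[respondent_id] = statistics.stdev(answered) < SD_THRESHOLD
    return flags

# Hypothetical respondents, 11 numeric items each.
responses = {
    "r1": [3] * 11,                           # identical answers
    "r2": [1, 5, 2, 4, 3, 3, 2, 5, 1, 4, 3],  # varied answers
    "r3": [4] * 10 + [5],                     # nearly identical answers
}
print(flag_straightliners(responses))
```

Because the items mix 0–5 and 1–5 scales, a low SD on raw answers is only a rough screen and is best read alongside the distribution chart.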

Suspected Straightliners

% of respondents (SD < 0.5)

Median SD

across numeric items

Valid Profiles

% show meaningful variation

Per-Respondent Standard Deviation (N=)

Red zone (SD < 0.5) marks suspected straightliners. Higher SD = more varied response patterns.

Effect of Excluding Straightliners

If we remove suspected straightliners, how do key metrics shift? Large shifts would suggest straightliners are distorting the data.
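A sensitivity check of this kind amounts to recomputing each metric without the flagged respondents and comparing. The sketch below uses a hypothetical metric and hypothetical flags, and the function name is illustrative.

```python
import statistics

def shift_after_exclusion(values, is_straightliner):
    """Return (mean with everyone, mean without flagged respondents, difference)."""
    kept = [v for v, flagged in zip(values, is_straightliner) if not flagged]
    before = statistics.mean(values)
    after = statistics.mean(kept)
    return before, after, after - before

# Hypothetical metric values and straightliner flags, one per respondent.
scores = [3, 3, 4, 2, 5]
flags = [True, False, False, False, False]
before, after, delta = shift_after_exclusion(scores, flags)
print(f"before={before:.2f} after={after:.2f} shift={delta:+.2f}")
```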

4. Scale Reliability (Cronbach's Alpha)

Cronbach's alpha measures internal consistency of the burnout composite scale used on the Burnout Risk page. Eight items, normalized to 0–1: overwhelm frequency, priority difficulty, overcommit frequency, external pressure, replan frequency, plan completion (rev.), quality-of-life (rev.), and productive best (rev.). Expand the "Scale Lab" panel below for how this combination was chosen.
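As a rough illustration of the calculation described above, the sketch below rescales items to 0–1, flips reverse-scored items, and applies the standard Cronbach's alpha formula. The three-item data set is hypothetical and far smaller than the eight-item scale; the helper names are assumptions.

```python
import statistics

def normalize(value, lo, hi, reverse=False):
    """Rescale a raw answer to 0-1; reverse-scored items are flipped."""
    scaled = (value - lo) / (hi - lo)
    return 1 - scaled if reverse else scaled

def cronbach_alpha(rows):
    """rows: complete cases only, each a list of k normalized item scores."""
    k = len(rows[0])
    item_variances = [statistics.variance(col) for col in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical raw answers per respondent:
# (overwhelm 1-5, overcommit 1-5, quality of life 0-5, reverse-scored)
raw = [(5, 4, 1.0), (2, 2, 4.0), (4, 5, 2.0), (1, 1, 4.5), (3, 3, 2.5)]
rows = [
    [normalize(o, 1, 5), normalize(c, 1, 5), normalize(q, 0, 5, reverse=True)]
    for o, c, q in raw
]
alpha = cronbach_alpha(rows)
print(f"alpha = {alpha:.2f}")
```

Normalizing first keeps the 0–5 sliders and 1–5 frequency items on a common footing, so no item dominates the total variance purely because of its range.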

Burnout Scale Alpha

N= complete cases, items (normalized 0–1)

Interpretation

■ > 0.80 = Good reliability

■ 0.70–0.80 = Acceptable

■ < 0.70 = Questionable

Alpha-If-Removed

If removing an item increases alpha, that item may not belong in the composite. If removing it decreases alpha, it contributes positively to scale consistency.
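The alpha-if-removed logic can be sketched as a leave-one-out loop: drop each item in turn, recompute alpha, and report the change against the full scale. The item names and scores below are hypothetical, with one deliberately inconsistent item.

```python
import statistics

def cronbach_alpha(rows):
    """Standard Cronbach's alpha over complete cases (lists of item scores)."""
    k = len(rows[0])
    item_variances = sum(statistics.variance(col) for col in zip(*rows))
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

def alpha_if_removed(rows, names):
    """Return (full-scale alpha, {item name: alpha change when that item is dropped})."""
    full = cronbach_alpha(rows)
    deltas = {}
    for i, name in enumerate(names):
        reduced = [row[:i] + row[i + 1:] for row in rows]
        deltas[name] = cronbach_alpha(reduced) - full
    return full, deltas

# Hypothetical normalized scores: a, b, c move together; "noise" does not.
names = ["a", "b", "c", "noise"]
rows = [
    [0.1, 0.2, 0.1, 0.9],
    [0.3, 0.3, 0.4, 0.1],
    [0.5, 0.6, 0.5, 0.8],
    [0.7, 0.7, 0.8, 0.2],
    [0.9, 0.9, 0.9, 0.5],
]
full, deltas = alpha_if_removed(rows, names)
for name, delta in deltas.items():
    print(f"drop {name}: alpha changes by {delta:+.2f}")
```

A positive change marks an item whose removal raises alpha; a negative change marks an item the scale relies on.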

Scale Lab — How we chose these 8 items (click to expand)

Toggle items in/out and enable normalization to find the highest-alpha combination. Core items (★) are the original 5-item formula. Candidates are numeric items tested for inclusion.

Lab Alpha

Baseline

Original 5-item

Delta

Configuration

items, N=

Alpha-If-Removed (Lab Configuration)

Dots left of the dashed line = item contributes. Dots right = removing it improves alpha. Purple ring = candidate item (not in current scale).

5. Known-Groups Validity (Hat Count → Burnout)

If the burnout composite measures something real, people juggling more roles should score higher. This section tests whether "multi-hatters" (≥ threshold roles) have higher burnout than respondents below the threshold, using Cohen's d as the effect size. A Cohen's d ≥ 0.3 supports construct validity.
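The comparison above could be computed along these lines. This is a sketch assuming the pooled-standard-deviation form of Cohen's d; the threshold of 3 roles and the scores are hypothetical, since the page sets the threshold interactively.

```python
import math
import statistics

def split_by_hats(rows, threshold):
    """rows: (hat_count, burnout_score) pairs -> (below threshold, at or above)."""
    below = [score for hats, score in rows if hats < threshold]
    at_or_above = [score for hats, score in rows if hats >= threshold]
    return below, at_or_above

def cohens_d(group_a, group_b):
    """Standardized mean difference (b minus a) using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd

# Hypothetical (hat count, burnout score) pairs and a hypothetical threshold.
data = [(1, 2.0), (2, 2.5), (1, 3.0), (4, 3.0), (5, 3.5), (3, 4.0)]
below, at_or_above = split_by_hats(data, threshold=3)
d = cohens_d(below, at_or_above)
print(f"d = {d:.2f}")
```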

Cohen's d

< Hats

N=, burnout /5

≥ Hats

N=, burnout /5

Burnout Composite by Hat Count Group (threshold = )

Dot = group mean, whiskers = bootstrap 95% CI. If whiskers don't overlap, the difference is likely real; overlapping whiskers alone do not rule out a difference.

Burnout Distribution by Group

Overlapping histograms show the full spread, not just means. A clear rightward shift for multi-hatters is consistent with the composite detecting role overload.

6. Sample Size Guidance

What statistical analyses are valid at our current sample size? This table provides conservative minimums from methodological literature.

7. Confidence Intervals on Key Metrics

Bootstrap 95% CIs on the four primary KPIs, plus Wilson CIs on market segment proportions. Forest plot: dot = point estimate, whiskers = 95% confidence interval.

KPI Means (Bootstrap 95% CI, N=)

Productive Best, Quality of Life, and Drive use a 0–5 slider scale; Overwhelm Freq. uses a 1–5 frequency scale.

Segment Proportions (Wilson 95% CI, N=)

How to read forest plots: Each dot is the point estimate. Horizontal whiskers show the 95% confidence interval; the true value likely falls within this range. Narrower whiskers = more precise estimate. Bootstrap CIs use 2,000 resamples; Wilson CIs use the Wilson score interval, an approximation to the binomial that behaves well at small N (it is not an exact method).
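A bootstrap interval for a KPI mean can be sketched as follows. The 2,000 resamples match this page; the percentile method, the fixed seed, and the example scores are assumptions, since the page does not say which bootstrap variant is used.

```python
import random
import statistics

def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean (assumed variant, fixed seed)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample respondents with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_index], means[upper_index]

# Hypothetical answers on a 1-5 frequency item.
scores = [3, 4, 2, 5, 4, 3, 4, 1, 5, 4]
low, high = bootstrap_mean_ci(scores)
print(f"mean={statistics.fmean(scores):.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With small samples the percentile interval can be somewhat too narrow, which is one reason to read the whiskers together with the sample size guidance.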

8. Inter-Item Correlation Matrix

Spearman rank correlations between all numeric survey items. Strong off-diagonal correlations suggest items are measuring related constructs. Weak correlations between items in the same composite scale would flag poor internal consistency and weaken the case for construct validity.
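Spearman's coefficient is the Pearson correlation of the ranks, with tied answers sharing an average rank. The sketch below shows one cell of such a matrix; the function names and the two item vectors are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when an item has no variation
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical answers to two items from the same respondents.
overwhelm = [1, 2, 3, 4, 5, 3]
overcommit = [2, 2, 3, 5, 4, 3]
print(f"rho = {spearman(overwhelm, overcommit):.2f}")
```

Rank-based correlation suits ordinal 1–5 items because it does not assume equal spacing between answer options.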

9. Data Quality by Audience Tier

Does data quality vary by tier? If one tier has more straightliners or lower completeness, findings for that tier should be interpreted with more caution.

Open-Text Completeness by Tier

Open-text questions have the most variation in completeness. Higher tiers (Primary) should show higher engagement.

Straightliner Rate by Tier

10. Response Distribution Shapes

Floor/ceiling effects and bimodality can distort means and correlations. Histograms show the shape of each numeric item's distribution. Slider items (Productive Best, Quality of Life, Drive, Plan Completion) use a 0–5 scale; frequency items (Overwhelm, Priority Difficulty, Overcommit, External Pressure) use a 1–5 scale.
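A quick numeric check for floor and ceiling effects is the share of answers sitting at each end of the scale. In the sketch below the 15% cutoff is a rule of thumb chosen for illustration, not a threshold taken from this page, and the answers are hypothetical.

```python
def floor_ceiling_share(values, scale_min, scale_max):
    """Return (share of answers at the scale minimum, share at the maximum)."""
    n = len(values)
    floor = sum(1 for v in values if v == scale_min) / n
    ceiling = sum(1 for v in values if v == scale_max) / n
    return floor, ceiling

CUTOFF = 0.15  # assumption: illustrative rule of thumb

# Hypothetical answers on a 1-5 frequency item.
answers = [1, 1, 1, 2, 3, 4, 5, 5, 3, 3]
floor, ceiling = floor_ceiling_share(answers, scale_min=1, scale_max=5)
print(f"floor={floor:.0%} (flag={floor > CUTOFF}), "
      f"ceiling={ceiling:.0%} (flag={ceiling > CUTOFF})")
```

Pass the correct bounds per item: 0 and 5 for slider items, 1 and 5 for frequency items.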