Landing a data analyst role requires confidence in statistics, Excel, SQL, and data visualization. Below are interactive interview questions with answers, explanations, and pitfalls you can practice quickly—plus Power BI notes because you prefer hands-on BI.
Tip: As you read, pause and try the “Your Turn” prompts to cement concepts.
Section A — Statistics & Data Literacy
1) What is the difference between descriptive and inferential statistics?
Answer:
- Descriptive summarizes observed data (mean, median, variance, histograms).
- Inferential draws conclusions about a population from samples (confidence intervals, hypothesis tests).
Explanation: Descriptive tells what happened; inferential estimates what’s likely true beyond the sample.
Pitfalls: Mixing sample stats with population parameters; reporting averages without spread (std dev/IQR).
Your Turn: Given monthly revenue for 12 months, compute mean, median, standard deviation—and state one inference you cannot make with descriptive stats alone.
2) How do you detect outliers?
Answer:
- IQR method: Outlier if value < Q1 − 1.5×IQR or > Q3 + 1.5×IQR.
- Z-score: |z| ≥ 3 often considered outlier.
Explanation: IQR is robust to skew; z-score assumes approximate normality.
Pitfalls: Blindly removing outliers without business context; not checking data entry errors vs. genuine extremes.
Power BI Tip: Use Box and Whisker custom visual or compute IQR in Power Query.
3) Explain p-value in simple terms.
Answer: Probability of observing results at least as extreme as your sample if the null hypothesis is true.
Explanation: A small p-value suggests your data would be unlikely under the null—so you consider rejecting it.
Pitfalls: Treating p-value as the probability the hypothesis is true; equating statistical with practical significance.
4) When would you use median instead of mean?
Answer: With skewed distributions or outliers (e.g., incomes, time-to-resolution), the median reflects central tendency better.
Explanation: Mean is pulled by extremes; median resists them.
5) What’s the difference between correlation and causation?
Answer: Correlation shows association (variables move together). Causation shows one variable influences another.
Pitfalls: Confounders, reverse causality, and spurious correlations. Use experiments or strong quasi-experimental designs to argue causality.
Section B — Excel for Analysts
6) Which Excel functions are essential for data cleaning and analysis?
Answer: XLOOKUP, VLOOKUP, INDEX/MATCH, SUMIFS, COUNTIFS, TEXTSPLIT, FILTER, IFERROR, TRIM, CLEAN, UNIQUE, LET.
Explanation: These cover lookup, conditional aggregation, text handling, dynamic arrays, and readable formulas.
Pitfalls: Volatile or nested formulas without documentation; mismatched data types.
Snippet:
=LET(clean, TRIM(CLEAN(rng)), rng, A2:A100,FILTER(clean, clean<>""))
7) Pivot Tables vs. Power Pivot (Data Model)
Answer:
- Pivot Table: Quick aggregation on a single table.
- Power Pivot: Data model, relationships, and DAX for complex measures.
Power BI Note: DAX skills transfer directly between Power Pivot and Power BI.
Section C — SQL Basics (Analyst-Focused)
8) Write a SQL query to get monthly active users for 2025.
Answer/Explanation:
SELECT
DATE_TRUNC('month', activity_ts) AS month,
COUNT(DISTINCT user_id) AS mau
FROM user_activity
WHERE activity_ts >= '2025-01-01' AND activity_ts < '2026-01-01'
GROUP BY 1
ORDER BY 1;
COUNT(DISTINCT)avoids double counting.DATE_TRUNC(orFORMAT/DATEPARTdepending on dialect) groups by month.
Pitfalls: Timezone drift; missing users with late events; filtering by created_at instead of activity_ts.
9) Find the top 3 products by revenue per month.
WITH monthly AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
product_id,
SUM(quantity * unit_price) AS revenue
FROM orders
GROUP BY 1,2
)
SELECT month, product_id, revenue
FROM (
SELECT monthly.*,
ROW_NUMBER() OVER (PARTITION BY month ORDER BY revenue DESC) AS rn
FROM monthly
) t
WHERE rn <= 3
ORDER BY month, revenue DESC;
Explanation: Window function ranks within each month.
10) Inner vs. Left Join — When do you use each?
Answer:
- INNER JOIN: Keep matching rows only—good for strict analytics.
- LEFT JOIN: Keep all left rows—even if no match—great for coverage checks and enrichment.
Pitfalls: Duplicates causing inflated counts; missing join keys due to whitespace/case.
Section D — Visualization & Power BI
11) How do you choose the right chart?
Answer:
- Comparison: Bar/column.
- Trend: Line/area.
- Distribution: Histogram/box plot.
- Part-to-whole: Stacked bars or avoiding pies unless few categories.
- Relationship: Scatter plot.
Power BI Best Practices:
- Use data labels sparingly, consistent colors, and hierarchies for drill-down.
- Create measures (DAX) for KPIs; don’t aggregate columns directly.
12) Tell us about a time you improved dashboard performance.
Strong Answer Outline:
- Problem (slow visuals, large dataset).
- Action (aggregations, incremental refresh, optimizing DAX with SUMMARIZECOLUMNS, reducing visual count).
- Outcome (load time reduced by 40%, stakeholder adoption improved).
Section E — Scenario-Based Mini Case
Scenario: Support tickets show rising resolution times.
Interview Task Outline:
- Define metric: Avg resolution time, 90th percentile, backlog.
- Segment: By channel, team, severity.
- Hypothesis: Staffing changes, new feature bug, seasonality.
- Analysis: Control charts, cohort of tickets created in same week.
- Action: Identify a team with spikes; recommend SLA triage and root-cause review.
SQL Starter:
SELECT
DATE_TRUNC('week', created_at) AS week,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY resolution_hours) AS median_hours,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY resolution_hours) AS p90_hours
FROM tickets
GROUP BY 1
ORDER BY 1;
Quick Practice (Interactive)
- Warm-up: Calculate IQR and flag outliers for delivery times dataset.
- Excel: Build a pivot to show revenue by category and week; add a slicer for region.
- SQL: Write a query to compute customer retention (users active in month
mandm+1).
Common Interview Mistakes (And Fixes)
- Only reporting averages: Always show spread and segmentation.
- Overcomplicated SQL: Prefer readable CTEs; comment assumptions.
- Unclear visuals: Start with the question; choose chart based on intent.
- Skipping business context: Tie metrics to decisions (pricing, staffing, inventory).
If you found this helpful, bookmark this series and practice the “Your Turn” prompts. In the next post, we’ll tackle advanced SQL & database design interview questions with optimization examples and Power BI modeling tie-ins.

