Before moving on to pandas, use the standard-library statistics module to summarize small datasets in the playground. It mirrors concepts you will later scale up with pandas.
Summarize a list
import statistics
values = [12, 18, 15, 22, 17, 14, 19, 100]  # 100 is an outlier
print('n:', len(values))
print('mean:', round(statistics.mean(values), 2))
print('median:', statistics.median(values))
print('stdev:', round(statistics.stdev(values), 2))
Interpretation
A large gap between the mean and the median suggests outliers. In this sample the mean is 27.12 and the median is 17.5, so investigate before reporting the mean to executives.
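One way to investigate is to recompute both statistics with the suspect value removed. The sketch below reuses the sample list. The name `trimmed` and the rule of dropping the single largest value are assumptions made for illustration, not a general outlier test.

```python
import statistics

values = [12, 18, 15, 22, 17, 14, 19, 100]

# Assumption for illustration: treat the single largest value as the suspect point.
trimmed = sorted(values)[:-1]

for label, data in (("with 100", values), ("without 100", trimmed)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # The gap between mean and median shrinks once the outlier is gone.
    print(f"{label:>12}: mean={mean:.2f} median={median} gap={mean - median:.2f}")
```

Dropping the value moves the mean from about 27.1 to about 16.7, while the median only moves from 17.5 to 17.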
Important interview questions and answers
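- Q: Why does the median reduce the influence of the outlier?
A: The median is the middle order statistic, so it depends on the rank of the values, not their magnitude.
- Q: How does this carry over to pandas later?
A: DataFrame.describe() reports the same kinds of summaries at scale, on millions of rows.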
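To make the first answer concrete, here is a minimal sketch of the median computed by hand. The helper `median_by_sorting` and the list `inflated` are hypothetical names used for illustration; in practice `statistics.median` already does this.

```python
import statistics

def median_by_sorting(data):
    """Return the middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

values = [12, 18, 15, 22, 17, 14, 19, 100]
inflated = [12, 18, 15, 22, 17, 14, 19, 1000]  # same list, outlier made ten times larger

print(median_by_sorting(values))    # 17.5
print(median_by_sorting(inflated))  # 17.5, the outlier's size does not matter
print(statistics.mean(values), statistics.mean(inflated))  # 27.125 139.625
```

Only the position of a value in the sorted order matters to the median, so making the outlier larger changes the mean but leaves the median where it was.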
Self-check
- What does stdev measure?
- Why does 100 affect the mean more than the median?
Tip: Compute the mean and the median with and without the outlier value 100, then compare how far each one moves.
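For the first self-check question, it can help to see the sample standard deviation spelled out. The sketch below follows the n - 1 (sample) definition that `statistics.stdev` uses; the helper name `sample_stdev` is an assumption for illustration.

```python
import math
import statistics

def sample_stdev(data):
    """Square root of the summed squared deviations from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (n - 1))

values = [12, 18, 15, 22, 17, 14, 19, 100]

# Both lines should print the same rounded value.
print(round(sample_stdev(values), 2))
print(round(statistics.stdev(values), 2))
```

Because the deviations are squared before they are summed, the single value 100 accounts for most of the total, which is why one outlier inflates stdev so strongly.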
Interview prep
- What is the statistics module?
A standard-library module that provides mean, median, and stdev for small lists.
- What comes next with pandas?
DataFrames scale the same tabular work to large data.