EDA with Python preview

Last reviewed May 28, 2026 Content v20260528

Reading: ~2 min
Level: intermediate

This lesson

This lesson previews exploratory data analysis (EDA) with Python: the data science mindset, methods, and communication habits behind evidence-based decisions.

Teams run EDA early in serious data science projects; skipping it leaves blind spots in analysis and reviews.

You will apply it in contexts such as analytics teams, product experimentation, research labs, and ML-adjacent engineering.

Read the narrative, run the Python snippets in the playground (standard library only for now; install Jupyter, pandas, and scikit-learn locally for full notebooks), and complete the MCQs to lock in vocabulary. Then change the input values and re-run to see how the mean and median shift.

Start this lesson when you can explain the previous lesson's ideas in your own words.

Practice EDA on a list of dicts, the same row shape pandas uses for DataFrame records. We compute the row count, the number of missing revenue values, and session statistics with the Python standard library.

Dataset

Each dict is one user with a session count and an optional revenue value. One revenue value is missing, and one session count is an extreme value worth noting.

What the code does

Print row count and column keys
Count missing revenue values
Compute mean and median sessions
Summarize revenue for non-missing rows
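
The four steps above can be sketched with the standard library alone. This is a minimal sketch, not the lesson's actual script: the rows, the field names (user, sessions, revenue), and every value are hypothetical and chosen only to match the dataset description (one missing revenue, one extreme session count).

```python
import statistics

# Hypothetical sample rows: values are illustrative, not the lesson's dataset.
rows = [
    {"user": "u1", "sessions": 3, "revenue": 20.0},
    {"user": "u2", "sessions": 5, "revenue": 35.5},
    {"user": "u3", "sessions": 4, "revenue": None},   # missing revenue
    {"user": "u4", "sessions": 2, "revenue": 10.0},
    {"user": "u5", "sessions": 60, "revenue": 50.0},  # extreme session count
]

# 1. Row count and column keys (taken from the first row)
row_count = len(rows)
columns = list(rows[0].keys())
print("rows:", row_count, "columns:", columns)

# 2. Count missing revenue values (None marks a missing value here)
missing_revenue = sum(1 for row in rows if row.get("revenue") is None)
print("missing revenue:", missing_revenue)

# 3. Mean and median of sessions
sessions = [row["sessions"] for row in rows]
mean_sessions = statistics.mean(sessions)
median_sessions = statistics.median(sessions)
print("sessions mean:", mean_sessions, "median:", median_sessions)

# 4. Revenue summary over non-missing rows only
revenues = [row["revenue"] for row in rows if row["revenue"] is not None]
revenue_summary = {
    "n": len(revenues),
    "min": min(revenues),
    "max": max(revenues),
    "mean": statistics.mean(revenues),
}
print("revenue summary:", revenue_summary)
```

With these made-up values the mean of sessions sits far above the median, which is the signal that one row deserves a closer look.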

Extend locally: load a CSV with pandas read_csv, then call describe() and plot histograms.

Next steps

# With pandas locally:
# import pandas as pd
# df = pd.DataFrame(rows)
# df.info()  # prints its own summary and returns None, so no print() is needed
# print(df.describe())

Important interview questions and answers

Q: Why a list of dicts?
A: It matches JSON/API rows and converts cleanly to a DataFrame, which makes it a good mental model before pandas.
Q: Median vs mean here?
A: If one user has a huge session count, the median is often more representative of a typical user than the mean.
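
To see why the median is the more robust choice, compare both statistics with and without the extreme value. The session counts below are hypothetical; this is a small illustration rather than the lesson's data.

```python
import statistics

# Hypothetical session counts; the last value is the extreme one.
sessions = [2, 3, 4, 5, 60]
typical = sessions[:-1]  # the same data without the extreme value

mean_with = statistics.mean(sessions)
mean_without = statistics.mean(typical)
median_with = statistics.median(sessions)
median_without = statistics.median(typical)

# The mean moves a lot when the extreme value is added; the median barely moves.
print("mean:  ", mean_without, "->", mean_with)
print("median:", median_without, "->", median_with)
```

Try replacing 60 with a larger number and re-running: the mean keeps climbing while the median stays put.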

Self-check

What does the script count for missing revenue?
Which statistic is robust to one very large session count?
What pandas function loads CSV rows into a table?

Tip: Run the preview script and compare group medians by hand.

Interview prep

Group median: summarize by category without pandas, using a loop to bucket values and the statistics module to take the median of each bucket.
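
A group median without pandas might look like the sketch below. The "plan" category and all values are assumptions added for illustration; the lesson's dataset does not name a grouping column.

```python
import statistics
from collections import defaultdict

# Hypothetical rows; the "plan" field is an illustrative category.
rows = [
    {"plan": "free", "sessions": 3},
    {"plan": "free", "sessions": 2},
    {"plan": "free", "sessions": 4},
    {"plan": "pro", "sessions": 5},
    {"plan": "pro", "sessions": 60},
]

# Bucket session counts by category with a plain loop.
sessions_by_plan = defaultdict(list)
for row in rows:
    sessions_by_plan[row["plan"]].append(row["sessions"])

# Median and sample size n per group.
group_medians = {plan: statistics.median(vals) for plan, vals in sessions_by_plan.items()}
group_sizes = {plan: len(vals) for plan, vals in sessions_by_plan.items()}

for plan in group_medians:
    print(plan, "n =", group_sizes[plan], "median sessions =", group_medians[plan])
```

Reporting n next to each median is a useful habit: with only two rows in a group, a single extreme value still pulls the median noticeably.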

