How this Data Science track works
- Python playground: lessons use execution_profile: server_script. Run snippets in the playground; install heavier stacks (Jupyter, pandas, scikit-learn) locally with pip.
- Workflow-first: questions, data quality, exploration, cleaning, modeling concepts, ethics, and communication come before deep dives on NumPy, Pandas, and SciPy.
- Prerequisites: finish Python basics and skim SQL for warehouse queries. Statistics intuition helps but is taught from scratch here.
- Pair with: AI and Generative AI for product context after you understand the data science lifecycle.
Playground code uses Python stdlib where possible. Install pandas, jupyter, and matplotlib locally for full notebook workflows.
Install on your device (macOS, Linux, Windows)
Install Python 3.11+ locally for notebooks and frameworks; the on-site playground uses the dev runner when enabled.
macOS
- Install with Homebrew:
brew install python@3.12
- Or install from python.org (if the installer offers an “Add to PATH” option, enable it).
- Create a project folder:
mkdir ~/python-practice && cd ~/python-practice
- Create and activate a virtual environment:
python3 -m venv .venv && source .venv/bin/activate
- Upgrade pip:
pip install --upgrade pip
Linux
- Debian/Ubuntu:
sudo apt update && sudo apt install -y python3 python3-pip python3-venv
- Fedora:
sudo dnf install -y python3 python3-pip
- Create and activate a virtual environment:
python3 -m venv .venv && source .venv/bin/activate
- Upgrade pip:
pip install --upgrade pip
Windows
- Install from python.org and enable Add python.exe to PATH.
- Or:
winget install Python.Python.3.12
- PowerShell:
py -3 -m venv .venv; .\.venv\Scripts\Activate.ps1
- Upgrade pip:
pip install --upgrade pip
Verify: python3 --version (or py --version on Windows) shows 3.11+.
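The same check can be run from inside Python, which is handy once a virtual environment is active and you want to confirm which interpreter it uses. This is a minimal sketch; the helper name meets_minimum is illustrative and not part of the track.

```python
import sys

MIN_VERSION = (3, 11)  # minimum version stated in this track


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when (major, minor) is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report the interpreter that is actually running this snippet.
current = ".".join(str(part) for part in sys.version_info[:3])
status = "OK" if meets_minimum() else "older than 3.11"
print(f"Python {current}: {status}")
```

If the status reads "older than 3.11", the active virtual environment was probably created with a different interpreter than the one you just installed.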
Run code on this site (Backend & language playgrounds)
- Clone or open this project locally; copy .env.example to .env.
- Ensure LEARNING_RUNNER_ENABLED=true and LEARNING_RUNNER_URL=http://127.0.0.1:9999/v1/execute.
- Terminal 1: php artisan serve (or composer run dev for Laravel + Vite + runner together).
- Terminal 2: npm run runner (keep it running while you click Run on server).
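If Run on server does nothing, a quick way to rule out a configuration typo is to compare the two runner settings against the values listed above. The sketch below assumes plain KEY=VALUE lines and works on a sample string; parse_env and runner_problems are hypothetical helpers, not part of the project, and the parser ignores most of the quoting rules a real .env loader supports.

```python
# Expected values copied from the setup steps above.
EXPECTED = {
    "LEARNING_RUNNER_ENABLED": "true",
    "LEARNING_RUNNER_URL": "http://127.0.0.1:9999/v1/execute",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def runner_problems(settings, expected=EXPECTED):
    """Return one message per setting that is missing or different."""
    return [
        f"{key} should be {want!r}, found {settings.get(key)!r}"
        for key, want in expected.items()
        if settings.get(key) != want
    ]


# Sample text standing in for the contents of a .env file.
sample = """
APP_NAME=Learning
LEARNING_RUNNER_ENABLED=true
LEARNING_RUNNER_URL=http://127.0.0.1:9999/v1/execute
"""
print(runner_problems(parse_env(sample)) or "runner settings look right")
```

To check a real file, read it with pathlib.Path(".env").read_text() and pass the result to parse_env.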
Starter stack: pip install jupyter pandas matplotlib seaborn scikit-learn
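After installing, you can confirm that the packages are visible to the active interpreter without importing them fully. This sketch uses importlib.util.find_spec from the standard library; the mapping of pip names to import names is an assumption worth noting, since scikit-learn imports as sklearn and the jupyter metapackage is checked here through jupyter_core, one of its dependencies.

```python
import importlib.util

# pip name -> top-level import name (assumed mapping; see lead-in).
STARTER_STACK = {
    "jupyter": "jupyter_core",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(stack=STARTER_STACK):
    """Return pip names whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module_name in stack.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("Starter stack is installed")
```

In the on-site playground most of these are expected to be reported as missing, because the playground sticks to the standard library where possible.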
Data science turns raw data into decisions: ask questions, collect data, clean and explore, model uncertainty, and communicate findings. This track teaches the workflow and thinking—using Python in the playground and pointing to NumPy, Pandas, and local Jupyter for deeper tooling.
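To make the workflow concrete, here is a small standard-library sketch of the clean, explore, and communicate steps. The order counts are made-up illustrative data, and the 1.5 × IQR rule is one common outlier convention rather than the method this track prescribes.

```python
import statistics

# Hypothetical daily order counts; None marks a missing reading.
raw_orders = [120, 135, None, 128, 900, 131, None, 126, 133, 124]


def clean(values):
    """Drop missing readings (None)."""
    return [v for v in values if v is not None]


def flag_outliers(values, k=1.5):
    """Return values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


orders = clean(raw_orders)
outliers = flag_outliers(orders)
typical = [v for v in orders if v not in outliers]

# Communicate: one sentence a stakeholder can act on.
print(f"{len(raw_orders) - len(orders)} readings were missing.")
print(f"Outliers to investigate: {outliers}")
print(f"Typical day: about {statistics.median(typical)} orders.")
```

The median is reported instead of the mean because a single extreme day would otherwise dominate the summary.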
Prerequisites and how this track works
Finish Python basics (variables, functions, lists, dicts). Skim SQL if you will query warehouses. Lessons run Python with execution_profile: server_script; install jupyter and pandas locally for notebook workflows.
What you will learn
- Framing business questions and data types
- Exploration, quality, missing data, and outliers
- Cleaning, train/test splits, and modeling concepts
- Metrics, cross-validation, bias, and ethics
- Visualization, storytelling, reproducibility, SQL in pipelines
First run
print("Data Science track")
print("Next: NumPy, Pandas, and local Jupyter for full stacks")
Data science vs software engineering
Engineers ship features; data scientists validate hypotheses with evidence. Both write Python—but DS emphasizes distributions, leakage, and stakeholder communication.
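Leakage is easiest to see in code. The sketch below, using only the standard library and synthetic numbers, splits the data first and then fits scaling statistics on the training portion alone, so no information from the test rows influences preprocessing. The function names are illustrative; libraries such as scikit-learn provide their own equivalents.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(train_values):
    """Compute mean and standard deviation from training data only."""
    return statistics.mean(train_values), statistics.stdev(train_values)


def scale(values, mean, stdev):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / stdev for v in values]


values = [float(v) for v in range(1, 21)]  # synthetic feature values
train, test = train_test_split(values)

# Fit on train only; computing these from all values would leak test data.
mean, stdev = fit_scaler(train)
train_scaled = scale(train, mean, stdev)
test_scaled = scale(test, mean, stdev)

print(f"train rows: {len(train)}, test rows: {len(test)}")
print(f"scaler fitted on train: mean={mean:.2f}, stdev={stdev:.2f}")
```

Calling fit_scaler(values) before the split would be the leaky version: the test rows would shape the mean and standard deviation used to evaluate them.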
Important interview questions and answers
- Q: Is this the same as ML?
A: Data science includes exploration and communication; ML engineering focuses on training and serving models at scale.
- Q: Why Python?
A: Readable syntax and the PyPI ecosystem: pandas, scikit-learn, and notebooks are industry defaults.
Self-check
- What two tracks should you complete first?
- What execution profile does this topic use?
Challenge
First Python run in Data Science track
- Click Run with the default code.
- Confirm output appears in the terminal.
- Add a line printing one data science question you care about.
Done when: the terminal shows the default message and your custom question.
Tip: Run the playground challenge before moving on—later lessons assume Python runs.
Interview prep
- Prerequisite?
Python basics (/python/intro) and SQL literacy (/sql/intro) help.
- Playground?
server_script runs Python; install Jupyter/pandas locally for full stacks.