Introduction to Pandas

How this Pandas track works

Python playground: lessons use execution_profile: server_script. Snippets use import pandas as pd with in-memory DataFrames; install with pip install pandas numpy if the runner lacks them.
Tables first: the track covers Series, DataFrames, IO concepts, filtering, groupby, merges, time series, and performance, building on what you learned about NumPy arrays.
Prerequisites: finish Python and NumPy basics, and skim Data Science workflow. SQL helps when an analysis is split between warehouse queries and pandas.
Pair with: SciPy for scientific routines, SQL for large aggregates, and AI for ML product context.

Run each lesson and inspect .shape, .dtypes, and .head() after every transform. These habits catch silent data bugs early.
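As a minimal sketch of that habit, the snippet below inspects a small in-memory frame before and after one transform. The column names and values are invented for illustration; the point is only the check-transform-check rhythm.

```python
import pandas as pd

# Hypothetical order data; qty arrives as text, a common source of silent bugs.
df = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
    "qty": ["10", "5", "8"],
})

print(df.shape)   # (rows, columns)
print(df.dtypes)  # qty is not a numeric dtype yet
print(df.head())

# Transform: convert qty to numbers, then derive a revenue column.
df["qty"] = pd.to_numeric(df["qty"])
df["revenue"] = df["price"] * df["qty"]

# Inspect again: one extra column, and qty is now an integer dtype.
print(df.shape)
print(df.dtypes)
print(df.head())
```

Without the dtype check, multiplying price by a text column would either fail or produce results that are easy to misread, which is the kind of bug this routine is meant to surface.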

Install on your device (macOS, Linux, Windows)

Install Python 3.11+ locally for notebooks and frameworks; the on-site playground uses the dev runner when enabled.

macOS

brew install python@3.12, or install from python.org (the macOS installer has no "Add to PATH" checkbox; that option applies to the Windows installer).
Create a project folder: mkdir ~/python-practice && cd ~/python-practice.
python3 -m venv .venv && source .venv/bin/activate
pip install --upgrade pip

Linux

Debian/Ubuntu: sudo apt update && sudo apt install -y python3 python3-pip python3-venv
Fedora: sudo dnf install -y python3 python3-pip
python3 -m venv .venv && source .venv/bin/activate
pip install --upgrade pip

Windows

Install from python.org and enable Add python.exe to PATH.
Or: winget install Python.Python.3.12
PowerShell: py -3 -m venv .venv; .\.venv\Scripts\Activate.ps1
pip install --upgrade pip

Verify: python3 --version (or py --version on Windows) shows 3.11+.

Run code on this site (Backend & language playgrounds)

Clone or open this project locally; copy .env.example to .env.
Ensure LEARNING_RUNNER_ENABLED=true and LEARNING_RUNNER_URL=http://127.0.0.1:9999/v1/execute.
Terminal 1: php artisan serve (or composer run dev for Laravel + Vite + runner together).
Terminal 2: npm run runner — keep it running while you click Run on server.

In your venv: pip install pandas

Pandas is Python's primary library for labeled tabular data. Its Series and DataFrame types sit on top of NumPy ndarrays and add row/column labels, file IO, groupby, merge, and time-series tools. It is the natural next step after Python basics, NumPy, and Data Science workflow concepts.

Prerequisites and how this track works

Complete Python basics and the NumPy intro, and skim the Data Science intro for EDA and ethics context. Lessons run Python with execution_profile: server_script. Pandas and NumPy should already be installed in the playground, so pip install is normally not needed there.

What you will learn

Creating and inspecting Series and DataFrame objects
Selection with loc/iloc, filtering, sorting, and missing data
Groupby, pivot, merge, concat, and duplicate handling
Categoricals, time series, rolling windows, and performance habits
How Pandas connects to NumPy, Matplotlib, scikit-learn, SciPy, and SQL
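For a quick preview of two of these topics, here is a small sketch of loc/iloc selection and a groupby, using the same toy data as the first run. It is an illustration of the calls involved, not the lesson material itself.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
    "qty": [10, 5, 8],
})

# Label-based selection: rows where product is Widget, two columns only.
widgets = df.loc[df["product"] == "Widget", ["product", "qty"]]

# Position-based selection: the first row, returned as a Series.
first_row = df.iloc[0]

# Groupby: total quantity per product.
qty_by_product = df.groupby("product")["qty"].sum()

print(widgets)
print(first_row)
print(qty_by_product)
```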

First run

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Widget'],
    'price': [9.99, 14.50, 9.99],
    'qty': [10, 5, 8],
})
print('Pandas version:', pd.__version__)
print(df)
print('shape:', df.shape, 'columns:', list(df.columns))

Why Pandas after NumPy?

NumPy excels at homogeneous numeric arrays. Real analytics data has column names, mixed types, missing values, and joins across tables. Pandas adds labels and relational-style operations while keeping vectorized performance for numeric columns.
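The contrast can be sketched in a few lines. The tables below are hypothetical; the snippet only illustrates mixed column types, a missing value, and a join, which are the three points made above.

```python
import numpy as np
import pandas as pd

# A NumPy array holds one dtype; mixing text and numbers falls back to strings.
arr = np.array([["Widget", 9.99], ["Gadget", 14.50]])
print(arr.dtype)  # a Unicode string dtype, so the prices are no longer numeric

# A DataFrame keeps a dtype per named column and tolerates missing values.
sales = pd.DataFrame({
    "product": ["Widget", "Gadget", "Gizmo"],
    "price": [9.99, 14.50, np.nan],
})
print(sales.dtypes)
print(sales["price"].mean())  # NaN is skipped by default

# A relational-style join against a second table.
stock = pd.DataFrame({"product": ["Widget", "Gadget"], "in_stock": [120, 40]})
merged = sales.merge(stock, on="product", how="left")
print(merged)
```

The left join keeps the Gizmo row even though it has no match in stock, so its in_stock value comes back missing rather than the row disappearing.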

Important interview questions and answers

Q: Is Pandas required for data science?
A: Nearly always for tabular EDA and feature engineering in Python. scikit-learn and most plotting libraries accept DataFrames or ndarrays as input.
Q: What is a DataFrame?
A: A 2D labeled table: rows indexed (often 0..n-1), columns named, each column often backed by a 1D NumPy array.
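To make the DataFrame answer concrete, this short sketch pulls apart the pieces it mentions: the row index, the column labels, and the array behind one column. The data is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
})

print(df.index)    # default row labels: 0, 1, 2
print(df.columns)  # column labels

price = df["price"]  # selecting one column gives a Series
print(type(price))

values = price.to_numpy()  # the column's values as a 1D NumPy array
print(values.ndim, values.dtype)
```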

Self-check

Which three tracks should you finish before deep Pandas work?
What two core Pandas types will you use most?

Challenge

First Pandas run in this track

Click Run with the default code.
Confirm the terminal shows the DataFrame's shape and column names.
Filter one row or column and run again.

Done when: the terminal shows DataFrame output and your filter result.
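If you are unsure how to start the filter step, one possible approach is sketched below. It assumes the default code is the First run snippet; the threshold of 10 and the chosen columns are arbitrary examples.

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Widget'],
    'price': [9.99, 14.50, 9.99],
    'qty': [10, 5, 8],
})

# Row filter: keep rows where price is above 10.
expensive = df[df['price'] > 10]
print(expensive)

# Column filter: keep only the product and qty columns.
subset = df[['product', 'qty']]
print(subset)
```

Either filter on its own satisfies the challenge; try changing the condition or the column list and run again.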

Tip: Run the playground challenge—every lesson uses import pandas as pd and import numpy as np.

Interview prep

Prerequisite?: Python basics (/python/intro), NumPy (/numpy/intro), and data science workflow (/data-science/intro).
Core types?: Series (1D labeled) and DataFrame (2D labeled table).
Next tracks?: SciPy (/scipy/intro) for scientific algorithms; SQL (/sql/intro) for warehouse-scale queries.